OpenAI unashamed that it needs copyrighted materials to train

What they're doing is straight up digital colonialism!

MajorLinux - Editor-in-chief

More and more articles are coming out every day about AI. You’ve got people using ChatGPT in court and generating non-existent court cases. Companies that ban AI art turn around and use it to promote products. But the biggest topic in AI is its use of copyrighted materials. And OpenAI says AI couldn’t exist without them.

OpenAI says it with its chest

OpenAI filed a written evidence submission (PDF) back in December 2023. In it, OpenAI told the UK House of Lords Communications and Digital Select Committee that it needs that material. For the company, it would be “impossible to train today’s leading AI models without using copyrighted materials.”

As OpenAI sees it, copyright at this moment “covers virtually every sort of human expression – including blog posts, photographs, forum posts, scraps of software code, and government documents.” It continued, stating that “[l]imiting training data to public domain books and drawings created more than a century ago might yield an interesting experiment, but would not provide AI systems that meet the needs of today’s citizens.”

They believe this is all on the up-and-up, stating in a blog post responding to The New York Times’ lawsuit that it all falls under the fair use doctrine. However, they admit there is “still work to be done to support and empower creators.” They went on to say that they let publishers block the GPTBot web crawler. OpenAI has also stated it is working on more mechanisms to allow rightsholders to opt out of training.

OpenAI shouldn’t have to be told “No!”

There is just so much wrong with this. The fact that you think people needed this type of AI is just asinine. Nobody was asking for this. This didn’t need to exist. Given that, your need to consume a ton of copyrighted material, fair use or not, is moot. If you needed to test your tech, you should have stuck to the public domain for your “interesting experiment.”

I, as a rightsholder and website owner, should not have to take extra steps to make sure my work isn’t being exploited in this way or any way. People shouldn’t have to block your bot from scraping their site or go to your website to say stop. I do not understand how these companies can continue to exploit everyone else for their gain and expect them to be okay with it.

Why does everything have to be opt-out? Oh, I know the answer to this. It’s the easiest way for the capitalist class to exploit the working class even if they aren’t participating in the capitalist society. If you can’t capture and take credit for our work when we refuse to work with you, you just take it. It’s like what you’ve all done with various cultures around the world.

This is nothing short of digital colonialism.

Source: Engadget

Marcus Summers is a Linux system administrator by trade. He has been working with Linux for nearly 15 years and has become a fan of open source ideals. He self-identifies as a socialist and believes that the world's information should be free for all.