OpenAI Pleads That It Can’t Make Money Without Using Copyrighted Materials for Free

Pika , 5 hours ago

I can already tell this is going to be an unpopular opinion judging by the comments, but this is my ideology on it.

It’s totally true. I’m indifferent on it: if it was acquired from a public-facing source I don’t really care, but I’m definitely against using data dumps or data that wasn’t available to the public in the first place. The whole thing with AI is ridiculous. It’s the same as someone going to a website and making a mirror, or a reporter writing an article that talks about what’s in it; the last three web-search-based AIs even gave sources for where they got the info. I don’t get the argument.

If it’s image-based AI, well, it’s the equivalent of an artist going to an art museum and deciding they want to replicate the art style seen in a painting. Maybe they shouldn’t be in a publishing field if they don’t want their work seen/used. That’s my ideology on it. It’s not like the AI is taking a one-to-one copy and selling the artwork, which in my opinion is a much more harmful instance and already happens commonly in today’s art world. It’s analyzing existing artwork that was available through the same means everyone else had: going online, loading up images, and scraping the data. By this logic, artists should not be allowed to enter any art-based websites, museums, or galleries, since by looking at others’ art they are able to adjust their own art, which is stealing the author’s work. I’m not for or against it, but the ideology is insane to me.

countablenewt , 5 hours ago

@Pika @flop_leash_973 These are largely my thoughts on the whole thing: the process of actually training the AI is no different from a human learning.

The thing is, there's likely enough precedent in copyright law to actually handle that. With most copyright law it's all about intent and scale, and I think that's likely where this will all go.

Here the intent is to replace and the scale is astronomical, whereas an individual's intent is to add and the scale is minimal.

Subverb , 5 hours ago (edited 4 hours ago)

The process of training the model is arguably similar to a human learning, and if the model just sat on a server doing nothing but knowing, there’d be no problem. Taking that knowledge and selling it to the public en masse is the issue.

This is precisely what copyrights and patents are here to safeguard. Is there already a book like A Song of Ice and Fire? Write something else, maybe better! There’s already a patent for an idea you have? Change and improve upon it and get your own patent!

You see, copyrights and patents are supposed to spur creativity, not hinder it. OpenAI should improve upon its system so that it actually thinks and is creative itself rather than regurgitating copyrighted materials, themes and ideas. Then they wouldn’t have this problem.

OpenAI wants literally all of human knowledge and creativity for free so that they can sell it back to you. And you’re okay-ish with it?

countablenewt , 3 hours ago

@Subverb that is, quite impressively, the opposite of what I said

Is a person infringing on copyright by producing content? No. It’s about intent and scale. Humans don’t just sit on this knowledge, they do something with it

There is nothing illegal about WHAT it’s doing, there is everything illegal about HOW and WHY

I very clearly stated that OpenAI’s intent and their scale at which they operate are blatant copyright infringement and that it has been backed up with decades of precedents

zbyte64 , 3 hours ago

Hello fellow human. I also learn by having information shoveled to me without regard to my agency.

countablenewt , 3 hours ago

@zbyte64 with everything you see you are scraping data from your environment whether you want to or not

How does a child learn what pain is? How does a teenager learn what heartbreak is? It’s certainly not because they made the decision to find that out themselves

zbyte64 , 2 hours ago

I bring up agency and I get a response that exemplifies what I mean.

Raising a child well requires someone who is able to engage in the child’s own theory of mind. If you just treat a child as an information sponge they will need more therapy than usual. A good parent takes interest in their child’s ability to exercise agency.

countablenewt , 2 hours ago

@zbyte64 you’re getting away from the original conversation

zbyte64 , 1 hour ago

Then I guess my original point of agency being an essential element in human learning had nothing to do with your conversation about how AI learns like humans. Carry on.

countablenewt , 1 hour ago

@zbyte64 we’re saying the same thing

It’s a matter of scale, not process

zbyte64 , 5 minutes ago

I’m literally saying (an aspect of) process matters, how are we saying the same thing?

deczzz , 5 hours ago

Agreed. I don’t understand how training an LLM on publicly available data is an issue. As you say, it doesn’t copy the work. Rather, the data is used as “inspiration”, to stay with the art analogy.

Maybe I’m ignorant. Would love to be proven wrong. Right now it seems to me that failing media publishers are trying to do a money grab and use copyright as an argument, even though their data/material isn’t getting illegally reproduced.

ProxyZeus , 5 hours ago

Suck it, don’t care, go back to obscurity

deltreed , 5 hours ago

I don’t mind him using copyrighted materials as long as it leads to OpenAI becoming truly open source. Humans can replicate anything found in the wild with minor variations, so AI should have the same access. This is how human creativity builds upon itself. Why limit AI? We already know all the jobs people have will be replaced anyway eventually.

Zink , 4 hours ago

That’s a good point. AIs/LLMs will exist and will necessarily learn from copyrighted materials without traceability back to the copyright owners to compensate them.

Sounds to me like AIs/LLMs can’t and shouldn’t be proprietary systems owned by private entities for profit, then.

emmy67 , 4 hours ago

Nor should what they produce be copyrightable in any form. Even if it’s the base upon which an artist builds.

Also, it should all be free.

exanime , 3 hours ago

“Humans can replicate anything found in the wild with minor variations, so AI should have the same access”

That’s not what OpenAI is asking for, though. They want free access to the type of content you or I need to pay for. And they want it so they can then sell the resulting “variation” they produce.

BlueMagma , 1 hour ago

That’s not exactly true. They are selling tools for people to recreate with variation.

I propose an analogy: let’s imagine a company sells brushes that painters use to create art. Now imagine the employees of this company go out on the street to watch how a street artist creates those amazing art pieces on the ground for everyone to see (the artist does ask for donations in a hat next to the art pieces). Now let’s imagine the employees stay there for hours studying his techniques and then design a new kind of brush that makes it way easier to create the same kind of art.

Would you argue that the company should not be allowed to sell their newly designed brush without giving money to the street artist?

Should all your teachers be paid for everything you produce throughout your life?

Should your parents get compensated every time you use the knowledge you acquired from them?

In case anyone reading is interested in my opinion: I think intellectual property is the dumbest concept, and one of the biggest scams of capitalism. Nobody should own any ideas. Everybody should be legally able to use anyone else’s ideas and build on them. I think we’ve been deprived of an infinity of great stories, images, lore, design, music, movies, shapes, clothes, games, etc., because of this dumb rule that you can’t use other people’s ideas.

exanime , 1 hour ago

“I propose an analogy: Let’s imagine a company sells brush…”

That would be analogous to any content publicly available for free (or via donation). OpenAI wants free access to the art being sold. They also don’t really create the brush; they produce slightly modified versions of the art made by artists who do not receive money or credit.

“Should all your teachers be paid for everything you produce throughout your life?”

They definitely should be paid more. But your analogy is completely off track here since, unlike AI, humans can actually possess and develop intelligence, not just parrot combinations of the same things we have seen before.

“Should your parents get compensated every time you use the knowledge you acquired from them?”

OK, now you are just flailing, but even then, yes, and most do: it is a general thing that kids take care of their parents when the kids are grown and the parents cannot look after themselves.

“In case anyone reading is interested by my opinion…”

This is your best paragraph and I would agree with it. It’s not compatible with capitalism, as you allude, but I’d be open to radical new thinking.

However, that is not what’s at play here either. OpenAI wants something we all have to pay for, for free, so they can then resell something else. Worse yet, the value in what OpenAI wants to sell rests basically on never again paying the people who produce the stuff it wants for free.

BlueMagma , 1 hour ago

But then if we agree on IP, we should not complain that OpenAI wants free access to copyrighted materials; we should use their own logic to force them to make their model open source, and free for anyone to execute on their own hardware.

They get free access to data, so we should get free access to the compilation of the data. Then they can charge us for the hardware cost of running the model, but they’ll have to charge us no more than what it costs, because they will be competing with other companies running the exact same model and driving the price down.

Clinicallydepressedpoochie , 3 hours ago

Because AI is not human creativity… or even close.

gencha , 2 hours ago

AI is great, what OpenAI does is blockchain-level idiocy.

nobleshift , 4 hours ago

Get Fucked

yournamehere , 4 hours ago

LOooOoOL

that's some napster funny shit

PumpkinSkink , 3 hours ago

Yeah, but because our government views technological dominance as a National Security issue we can be sure that this will come to nothing bc China Bad™.

glitchdx , 3 hours ago

Boo fucking hoo. Everyone else has to make licensing agreements for this kind of shit, pay up.

aaaaace , 3 hours ago

They already stole my work. No respect.

dream_weasel , 2 hours ago

The above comment has been consumed by AI for training purposes

Hobo , 1 hour ago

The above comment has also been consumed by AI for training purposes.

model_tar_gz , 1 hour ago

The above comment has also also been consumed by AI for training purposes.

xan , 1 hour ago

I’m shitting and pissing right now. Take that, AI.

Honytawk , 2 hours ago

If your company can’t exist without breaking the law, then it shouldn’t exist.

General_Effort , 1 hour ago

He has committed the greatest crime imaginable! A crime against capitalism!

bonus_crab , 1 hour ago

Copyright =/= license, so long as they aren't reproducing the inputs, copyright isn't applicable to AI.

That said, they should have to make sure they aren't reproducing inputs. Shouldn't be hard.

Poem_for_your_sprog , 27 minutes ago

Seems the same as a band being influenced by other bands that came before them. How many bands listened to Metallica and used those ideas to create new music?

FatCat , 54 minutes ago

Those claiming AI training on copyrighted works is “theft” are misunderstanding key aspects of copyright law and AI technology. Copyright protects specific expressions of ideas, not the ideas themselves. When AI systems ingest copyrighted works, they’re extracting general patterns and concepts - the “Bob Dylan-ness” or “Hemingway-ness” - not copying specific text or images.

This process is more akin to how humans learn by reading widely and absorbing styles and techniques, rather than memorizing and reproducing exact passages. The AI discards the original text, keeping only abstract representations in “vector space”. When generating new content, the AI isn’t recreating copyrighted works, but producing new expressions inspired by the concepts it’s learned.

This is fundamentally different from copying a book or song. It’s more like the long-standing artistic tradition of being influenced by others’ work. The law has always recognized that ideas themselves can’t be owned - only particular expressions of them.

Moreover, there’s precedent for this kind of use being considered “transformative” and thus fair use. The Google Books project, which scanned millions of books to create a searchable index, was found to be legal despite protests from authors and publishers. AI training is arguably even more transformative.

While it’s understandable that creators feel uneasy about this new technology, labeling it “theft” is both legally and technically inaccurate. We may need new ways to support and compensate creators in the AI age, but that doesn’t make the current use of copyrighted works for AI training illegal or unethical.

TowardsTheFuture , 26 minutes ago

So the issue is that, in general, to be influenced by someone else’s work you would typically have supported that work… like… literally at all. Purchasing it, or even simply discussing and sharing it with others who may purchase it, is worth a lot more than giving no support at all and then directly competing without crediting source material, influences, etc.

NikkiDimes , 15 minutes ago

If it is on the open internet and visible to anyone with a web browser and you have an adblocker like most people, you are not paying to support that work. That’s what it was trained on.

Eccitaze , 6 minutes ago

Fucking Christ I am so sick of people referencing the Google Books lawsuit in any discussion about AI.

The publishers lost that case because the judge ruled that Google Books was copying a minimal portion of the books, and that Google Books was not competing against the publishers, thus the infringement was ruled as fair use.

AI training does not fall under this umbrella, because it’s using the entirety of the copyrighted work, and the purpose of this infringement is to build a direct competitor to the people and companies whose works were infringed. You may as well talk about OJ Simpson’s criminal trial, it’s about as relevant.

forgotmylastusername , 40 minutes ago

The internet has been primarily derivative content for a long time. As much as some haven’t wanted to admit it, it’s true. These fancy algorithms now take it to the exponential factor.

Original content had already become sparse as monetization ramped up. And then this generation of AI algorithms arrived.

In the several years before LLMs became a thing, the internet was basically just regurgitating data from API calls or scraping someone else’s content and representing it in your own way.
