Reddit has reportedly signed over its content to train AI models

doingthestuff , 4 months ago

Good thing I had multiple bots overwrite my content before I deleted it all. Not that someone couldn’t recover it, I’m not naive. But the AI bots should miss me.

dutchkimble , 4 months ago

Any suggestion on the best way to do that?

lemba , 4 months ago

There is a Plugin RES (Reddit Enhancement Suite) for Firefox, which could be run on the classic frontend of Reddit to delete everything you posted. www.alphr.com/how-to-delete-all-reddit-posts/

dutchkimble , 4 months ago

Thank you!

JeeBaiChow , 4 months ago

Frankly, if they’re training bots on my comments, I’d be sure to poison the shit out of those comments. Say stuff like ‘Donald trump won the election’, ‘bleach needs to be inside the body to work’, ‘Russia has rights to Ukraine’, etc. Just make the data worthless. Any free bots do that?

frostysauce , 4 months ago

Reddit already has plenty of actual users doing that for free.

Voyajer , 4 months ago

This is why I don’t blame anyone for editing/deleting their post history on reddit.

FaceDeer , 4 months ago

I do. It's frankly selfish. Having an AI get training on my old comments costs me nothing and it results in the development of useful AI tools. Trying to sabotage that is petty and pointless. It's not like you could somehow collect the fraction of a pittance that you think you're owed retroactively. I never commented on Reddit thinking "awesome, I'm going to make bank on the content I'm generating here."

People complain about the capitalist mindset of the world and then they do this. Sigh.

Nurse_Robot , 4 months ago

Defending giant corporations profiting off of uncompensated individuals, while criticizing anyone who doesn’t want to provide free labor to said corporations, is a disgusting take. Are you a CEO?

liquidparasyte , 4 months ago

Expecting FaceDeer to not glaze AI is like expecting the sun to not rise.

the_post_of_tom_joad , 4 months ago

Oh is that what their crusade is? I was wondering why their take was so stupid

FaceDeer , 4 months ago

The more accessible training data there is, the easier it is for new AI projects to enter the field and the less dominant those "giant corporations" become.

The free labour was already freely given. If someone doesn't want to have shitposted on Reddit for free then maybe they shouldn't have shitposted on Reddit for free.

Nurse_Robot , 4 months ago

“if you didn’t want me to steal your intellectual property, you shouldn’t have thought of it in the first place”

FaceDeer , 4 months ago

I'm not sure what you mean here. Nothing's being stolen. Even if you think there needs to be permission for training an AI off of data, Reddit has that permission.

Nurse_Robot , 4 months ago

I assume you’re more of a moron than a troll, which is disappointing. Regardless, you’re not worth my time, as I don’t think any argument could convince you to have an open mind and be willing to change. Good luck out there!

Fungah , 4 months ago

So, for an example of what the other user was talking about, I’m just some guy and for my first foray into programming / machine learning (I kind of just threw myself into the deep end) I modified stylegan 3 and trained it on about 500g of reddit porn that I scraped off reddit.

Now, I stopped the training after about a week (it was going to take about a solid month on my rtx 2080 ti) when I found out stable diffusion existed but I learned a LOT from that experience.

I couldn’t do that now. Arguably none of that was how any of that should be done but whatever.

QuaternionsRock , 4 months ago

No, you shouldn’t have posted it to Reddit, in which you were required to give them a perpetual license to use your IP in any way they see fit.

For the record, I’m here because Reddit pissed me off when they axed the free API, and I’m pissed at myself for not expecting it. That’s what I get for accepting their terms and conditions, I guess.

Edit: I also don’t accept the idea that using my content for training data is “fair use” when it is used to train proprietary models, especially ones in which the end user is allowed to prompt it to plagiarize or otherwise imitate my content.

Zellith , 4 months ago

Selfish? Perhaps you forget why people deleted their content in the first place.

FaceDeer , 4 months ago

What do you think this thread is about?

R00bot , 4 months ago

How is not wanting capitalist companies to profit off of your content not aligned with complaining about the capitalist mindset of the world? Wtf lol.

FaceDeer , 4 months ago

It's the insistence that everything that people do must be compensated with money. People have spent years posting on Reddit for fun, without any thought to being paid for it, and now all of a sudden someone else is making some money so they're demanding that they should get their slice. And doing what they can to wreck their earlier efforts when they don't.

How does Reddit making some money licensing this stuff harm those of us who contributed to it? Is there any problem aside from "I wanna get paid!"?

R00bot , 4 months ago

Why do you think it’s about wanting a slice? They posted on Reddit with no expectation of profit. But they don’t want others to profit off it either. It’s not that complicated.

FaceDeer , 4 months ago

But they don’t want others to profit off it either.

And that's why I call them selfish. It doesn't harm them in the slightest if someone else profits off of it.

R00bot , 4 months ago

They wouldn’t have posted if they knew this was going to happen. They posted because it was fun, not for this.

They may be morally opposed to AI (as there are many valid reasons to be opposed to it), or they may just have wanted to have been able to make an informed decision before posting, but by retroactively training the AI on their posts they’ve robbed them of the agency to make that decision.

That’s why they’re upset.

FaceDeer , 4 months ago

They posted content on a website whose user agreement says "we can do whatever we like with the content you post here" and then go surprised-pikachu when the website goes ahead and does whatever they like with the content they posted. Frankly, I'm not tremendously sympathetic. This should have been easy to predict.

R00bot , 4 months ago

Oh yeah I’m sure you predicted LLMs, and that they would need ridiculous amounts of training data wayyyy back in 2005 when Reddit started lol. Super easy to predict. Good job bud.

frostysauce , 4 months ago

And that’s why I’m calling you either a moron or a tool. Probably both.

TORFdot0 , 4 months ago

I had an 11 year old account that I deleted all my old comments and posts from because of the API debacle. Does that make me selfish that I felt like Reddit wasn’t holding up its end of the unwritten agreement?

Reddit doesn’t deserve my content any more than I deserve access from the third party API.

FaceDeer , 4 months ago

If you did it over the API debacle then you're not one of the people I'm talking about here. This is about people deleting their content to prevent it from being used to train AIs.

Voyajer , 4 months ago

Do you not remember the real reason why the API debacle happened in the first place was to prepare for this moment? It was always about easy access to training data, third party apps got caught in the crossfire.

FaceDeer , 4 months ago

That's ignoring an awful lot of other considerations. Obviously Reddit hasn't explained itself in a trustworthy way, but a common belief at the time is that it was to force people to use the official Reddit mobile app so they could be subject to advertising.

Nurse_Robot , 4 months ago

Boot licker.

Verserk , 4 months ago

deleted_by_author

FaceDeer , 4 months ago

That spells out what they were doing. It doesn't explain why they were doing it.

Voyajer , 4 months ago

It’s their comment to do with as they see fit. I can’t get mad at them for wanting to erase their presence on a site they don’t use anymore.

FaceDeer , 4 months ago

And I'm free to judge them however I wish for their actions and intent.

gedaliyah , 4 months ago

For me it’s a privacy matter. Going through old posts (whether human or machine learning) cannot be used for anything good.

Hackerman_uwu , 4 months ago

What about people who just think “A.I.” is dog shit and chat bots are a dumb obsession steering the industry in the wrong direction due to hype and money?

FaceDeer , 4 months ago

What about them? I don't see why they'd care what AI companies are doing in that case. They'd assume they were just wasting money on this stuff.

Bobmighty , 4 months ago

With Reddit’s severe bot problem, it’ll be like training on unfiltered sewage. Garbage in, garbage out.

captain_oni , 4 months ago

Machines training machines? How perverse!

Strayce , 4 months ago

Considering how much of Reddit is already bots, I’m sure this will end fantastically.

uis , 4 months ago

Meanwhile I’m on Matrixstoemmy

DudeImMacGyver , 4 months ago

Where’s my cut?

Fake4000 , 4 months ago

You signed it all away the moment you scrolled down that EULA 😂

admiralteal , 4 months ago

Can't wait for the day a major court declares EULAs universally nonbinding outside of the most common-sense terms. Even though I doubt it will ever happen.

"We can store and display your content and use stuff you publicly post as examples in advertisements for our platform" is pretty common sense.

"We can use the things you post to do complex data analytics to package and sell your identity to advertisers" is fucking sus.

"We can use the things you post to train ANN generative systems to build next-generation technologies to impersonate you and your peers" is simply nuts.

The idea that displaying an EULA with an "agree" button is informed consent is just preposterous. Even lawyers don't read them.

Shdwdrgn , 4 months ago

Seems like it would never stand up in court. Prove that -I- agreed to anything. To do that, you first have to prove that nobody has ever created an account under my name, and more importantly, prove that Reddit accounts have never been hacked and that the person who clicked the button was even in my household. And if they keep that extensive of records to where they can follow every action taken by every user on the platform, it also implies that they are tracking my personal actions even before I agreed to anything.

On the other hand, do they actually have a EULA? It’s been almost 14 years since I created my account, and there certainly wasn’t anything about selling my data for AI training when I signed up. If they change the terms of service, they are responsible for notifying everyone, otherwise they can’t claim that anyone agreed to these changes.

I’m sure their lawyers could weasel their way through it somehow, but it still seems to come down to them claiming they changed the agreement without notification but the users should still be legally bound by the new terms?

DudeImMacGyver , 4 months ago

Oh, is that what those things do?

catloaf , 4 months ago

It went toward paying for servers so that you could use reddit for free.

gedaliyah , 4 months ago

Funny, I thought that is what the unblockable ads were for.

catloaf , 4 months ago

That’s also part of it.

But uBlock Origin never seemed to have trouble blocking them.

DudeImMacGyver , 4 months ago

No it didn’t.

FaceDeer , 4 months ago

The classic "screw everyone else, I want mine."

What fraction of a penny do you think you're owed?

DudeImMacGyver , 4 months ago

What fraction of a penny do you think you’re owed?

250,000,000/1

Hey, you didn’t specify that it couldn’t be an improper fraction and I like money. It’s loophole time baby!

FaceDeer , 4 months ago

Good luck with that.

DudeImMacGyver , 4 months ago

Who needs luck when you have math?

dutchkimble , 4 months ago

Spez is playing the world’s tiniest violin for you as you read this

DudeImMacGyver , 4 months ago

Well tell him to play it better because he sucks at that too.

ozoned , 4 months ago

“Reddit has given access to YOUR conversations and posts to AI companies.”. FTFY

These were created by people, for people, and I will ALWAYS disagree that this data is Reddit’s or any other platform’s.

Don’t forget your direct messages aren’t end to end encrypted on Reddit, so now AI will be trained on your craziest “private” conversations

DocMcStuffin , 4 months ago

There’s one piece of good news. Reddit didn’t want to pay to move all the old DMs to the new chat infrastructure. So they deleted them.

hdnsmbt , 4 months ago

Pretty sure they just didn’t migrate to the new data structure and didn’t actually delete the raw data. They’re effectively deleted for users but not for Reddit.

cyberpunk007 , 4 months ago

Well it’s not yours once you post it on some platform, tbf

butterflyattack , 4 months ago

now AI will be trained on your craziest “private” conversations

I have no idea what horrible thing this will do to an LLM but I’m kind of curious.

Thorny_Insight , 4 months ago

Well to be fair, everything you post and comment on Lemmy can be used in the exact same way

atrielienz , 4 months ago (edited 4 months ago)

Oh no, all the times I sent or received dodo codes from randos so we could trade animal crossing items. Whatever shall I do?

Edit: I’m gonna leave this here for people to use as a resource against Reddit because it may be worth it to do something actionable.

thomashunter.name/…/2023-06-19-how-to-delete-redd…

ohlaph , 4 months ago

Gross

gedaliyah , 4 months ago

The AI:

“IANAL so could you ELI5, so AITA?

THIS.”

bigkahuna1986 , 4 months ago

Ann frankly, I did Nazi that coming.

Gullible , 4 months ago

I wish spez had a soul so it could leave his body when sexual assault questions eventually yield the phrase “snuggle struggle.”

storcholus , 4 months ago

Holy shit do I hate that comment

bier , 4 months ago

It’s funny you say that because there was a ‘hack’ for chatgpt where you could ask it something like how to build a bomb and it would refuse. But when you added TLDR it would do it.

DozensOfDonner , 4 months ago

Why does it sound like reddit trained AI will only get dumber.

jol , 4 months ago

That would explain why GPT is often so confidently incorrect.

aidan , 4 months ago

laughs villainously This is all going to plan, now there will be some chatbot spewing my insane beliefs

General_Effort , 4 months ago

They say it’s $60 million on an annualized basis. I wonder who’d pay that, given that you can probably scrape it for free.

Maybe it’s the AI act in the EU. That might cause trouble in that regard. The US is seeing a lot of rent-seeker PR, too, of course. That might cause some to hedge their bets.

Maybe some people had not realized that yet, but limiting fair use does not just benefit the traditional media corporations but also the likes of Reddit, Facebook, Apple, etc. Making “robots.txt” legally binding would only benefit the tech companies.
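
For anyone unfamiliar with the robots.txt point: today it is a voluntary convention that a crawler checks before fetching a page, and nothing technically stops a scraper from ignoring it. A minimal sketch of what "honoring" it looks like, using Python's standard `urllib.robotparser`. The rules, the `ExampleAIBot` name, and the example.com URLs are made up for illustration and are not Reddit's actual file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules, invented for this example.
EXAMPLE_RULES = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(rules: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser


parser = build_parser(EXAMPLE_RULES)

# A polite crawler asks before fetching; a rude one simply skips this step.
ai_allowed = parser.can_fetch("ExampleAIBot", "https://example.com/r/technology/")
other_allowed = parser.can_fetch("SomeOtherBot", "https://example.com/r/technology/")
private_allowed = parser.can_fetch("SomeOtherBot", "https://example.com/private/messages")
```

The check runs entirely on the crawler's side, which is why making the file legally binding would change who can collect data rather than how the file works.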

FaceDeer , 4 months ago

This is the most frustrating thing, so many people are arguing against their own interests with their efforts to "lock down" their content to prevent AIs from training on it. In this very thread I've been accused of being pro-giant-company when I'm quite the opposite. The harder we make it to train AI, the stronger the advantage that the existing giant companies have in this field.

BetaDoggo_ , 4 months ago

Who’s dumb enough to pay for that? Everyone else is just scraping it for free.

saruwatarikooji , 4 months ago

This is why they changed their API policy the way they did. They wanted to sell it rather than let bots scrape it for free.

Flumpkin , 4 months ago

Yeah. I think there is a kind of power grab under way. Social media will try to push that they own the IP rights to the large texts used for LLMs. This will then require that producers of LLM software acquire the licensing rights, which will cost many millions, which in turn restricts the free use of LLMs and in general any AI software that requires training data.

The end result is that as the “means of production” become less based on human work the “means of generation” and AI will be controlled by the capitalists. If you can turn something into a commodity (like knowledge with patents and IP) you can control it. Leading to a darker timeline.

nightwatch_admin , 4 months ago

I don’t think it’s going to be public data alone. I think it’s going to be DMs and chats as well. I wondered why Reddit was pushing chats so much suddenly, well it makes sense now.

etrotta , 4 months ago

Out of all things to hate Reddit for, giving data to AI isn't something fediverse users can really criticize it for, though making money from it perhaps.
Remember: All data in federated platforms is available for free and likely already being compiled into datasets. Don't be surprised if this post and its comments end up in GPT5 or 6 training data.

FaceDeer , 4 months ago

After all the hue and cry I have seen over stuff like Threads and Bluesky federation I don't imagine most people using the Fediverse have a particularly coherent philosophy on the matter.

ExcursionInversion , 4 months ago

If they could read right now they would be very upset.

BrianTheeBiscuiteer , 4 months ago

If they already, essentially, cut off API access then it’s not a big leap to limit access on the web to logged in users only and rate limit or ban accounts that behave like scrapers.
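
To make the rate-limiting idea concrete, here is a rough sketch of a per-account sliding-window limiter. Everything in it is an assumption for illustration: the `SlidingWindowLimiter` class, the limit of 3 requests per 60 seconds, and the account name are hypothetical, and nothing here reflects how Reddit actually throttles anyone.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per account within any window_seconds span."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # Per-account queue of timestamps for recently allowed requests.
        self._hits = defaultdict(deque)

    def allow(self, account: str, now: float) -> bool:
        hits = self._hits[account]
        # Forget requests that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # Looks like a scraper: too many requests too quickly.
        hits.append(now)
        return True


# Hypothetical limit: 3 requests per 60 seconds for each account.
limiter = SlidingWindowLimiter(max_requests=3, window_seconds=60.0)
results = [
    limiter.allow("scraper_account", now=t)
    for t in (0.0, 1.0, 2.0, 3.0, 61.0)
]
```

The limit is tracked per account, so it only slows down a scraper that sticks to a single login.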

Verserk , 4 months ago

That would matter more if it wasn’t trivial to make new accounts and very cheap to buy established ones.

treadful , 4 months ago

The problem isn’t that AI is being trained on the data. The problem is that they locked down all third party data access so they could monetize our content. On a federated platform, everyone gets equal access and can do whatever they want with it.

We sure can criticize them for that.

ColeSloth , 4 months ago

No. I can. Reddit was bought out, uses volunteers to control all the subs but forcefully removes you from the sub you created and were supposed to have control over if you didn’t play by their ever-changing rules, ruined/eliminated third party apps by demanding WAY over ad revenue profits to have access to the API with very short notice, and shadow banned anyone and everyone in a position to do anything about any of it. It’s a corporation that gutted an entire platform in order to push agendas they want and milk as much money out of it as possible. Hell, it’s the entire reason all of lemmy gets more than 30 posts a day. So many people switched to lemmy over the past year. They ruined a website I enjoyed and I’d rather them not make more money from the thousands of posts I made from over a decade of being there.

31337 , 4 months ago

I wish there was a license for content like the GPL, that states if you use this content to train generative AI, the model must be open source. Not sure that would legally be enforceable though (due to fair-use).
