Contend6248 ,

Did they fucking recover deleted messages?

I can still find every post, even after automatically editing and then deleting them.

Hackerman_uwu ,

Politics and whatever the other maladies are that infect Reddit aside: the app and the website are impossibly difficult to use. It’s just ads ads ads. Even if the place was a veritable utopia that’s a no from me dawg.

gapbetweenus ,

Reddit is such a nice example of capitalism turning a genuinely nice thing into a pile of garbage.

RizzRustbolt ,

Ah… so they’re shorting it.

HerbalGamer ,

As in short circuit?

You999 ,

Short selling, a Wall Street thing for making money when the stock goes down. It kinda works like borrowing a stock and immediately selling it, then when it’s time to return the stock you borrowed you buy a replacement, and if the price has gone down you keep the profit.

PotatoesFall ,

Shorting means betting against a stock. Instead of buying a stock, waiting for it to grow over a long period and selling it for more money (long position), you borrow the stock from somebody, sell it, then buy it back (for a lower price) after a short amount of time to give back to whoever you borrowed it from (short position). If the stock price rapidly drops in that time, you gain money.

If this sounds like a perversion of what investing was supposed to be, yeah welcome to wall street
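
To make the arithmetic concrete, here is a minimal Python sketch. The function name, the share counts, the prices and the flat borrow fee are all made up for illustration; real short selling also involves margin requirements and interest, which this ignores.

```python
def short_sale_profit(shares: int, sell_price: float, buyback_price: float,
                      borrow_fee: float = 0.0) -> float:
    """Profit from borrowing shares, selling them now, and buying them back later."""
    proceeds = shares * sell_price          # cash received when selling the borrowed shares
    cost_to_return = shares * buyback_price  # cash spent buying replacements for the lender
    return proceeds - cost_to_return - borrow_fee


# Price falls from 50 to 40: the short seller keeps the difference.
print(short_sale_profit(10, 50.0, 40.0))   # 100.0

# Price rises from 50 to 65: the short seller takes a loss.
print(short_sale_profit(10, 50.0, 65.0))   # -150.0
```

The second call shows the catch: the price can keep rising, so the possible loss on a short has no fixed ceiling.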

zecg ,

I still wouldn’t use it, but Elon should buy it so he can more convincingly posture as a saviour of independent social media. Just drop the tracking, sell square ads on the sidebar based on content only, open up the API for all sorts of uses and make rif is ridiculous fun for reddit the official client. Give it to five geeks to maintain and call it a public service. What’s a bit more down the drain? It’s all gonna go Midas anyways. At least that would be a win for the internets.

Grandwolf319 ,

Everyone here is thinking $5 billion means Reddit won. Wasn’t their valuation at $15 at one point?

Threeme2189 ,

If they went from fifteen bucks to five billion, yes they totally won 😉

Grandwolf319 ,

Well I guess I fully set myself up for that one lol

stoly ,

If you mean that an entitled douchecanoe who has no trouble allowing genocides to occur on his service getting stupid rich, then they did win. If you are looking at how much they could have gotten, then, yeah, they suck.

Meowoem ,

Sorry, what genocide happened on Reddit?

realitista ,

The Great App Genocide of 2023.

Zuberi ,

Lol at the 50% haircut since the API shit.

Can’t wait to release this ransomware data this year 😍

noctisatrae ,

You wrote a ransomware?

Zuberi ,

Me? Never. Idk how to even turn on my Mac.

KingThrillgore ,

The haircut isn’t done yet…

stoly ,

LOL if you listen to some angry people in other threads, they will claim that not a single thing has changed.

ulkesh ,

Would love to see this become the fastest IPO to tank.

Moira_Mayhem ,

Yeah but then when it collapses all the roaches will scurry off to other forums, including Lemmy.

Personally I want it as a place to sequester the hard and alt right, like 4chan was before the digg exodus.

Every time someone pops the infection caul where they gather, it just spreads their toxic juice everywhere else.

ulkesh ,

Sadly I’ve found a few of those on Lemmy already. What I’d prefer is the internet of the 1990s when it took effort to figure out and the masses were too lazy or dumb to do so.

And that sweet, sweet dialup modem sound.

Moira_Mayhem ,

Fuck yes those memories…

I used to purposefully disable the error correction on my modem so the connection sound didn’t have that stupid hissy wobbly beepy part at the end. Yeah it meant that sometimes I’d fjhkswa thjasd ar eee e but it was worth it for that pure music.

Maybe we’re blessed for knowing it like none that came after, or cursed for knowing it will never be that good again…

nicetriangle ,

Sure feels like they timed this IPO pretty badly. I think the ideal time to strike on this would have been a few years ago... Based on market conditions anyway. Reddit itself may just not have had their ducks lined up enough, but that's their problem, not the stock market's.

  • Tech stocks trading sideways for the last year or two
  • The interest rate money printer got shut off and cash is not cheap anymore
  • Seemingly all the major new tech stock investment interest is circling around stuff like AI
  • Federated alternatives are slowly building steam and people seem to have gotten pretty salty about corporate social media
  • The pandemic is more or less over and people have pulled back from being chronically online somewhat (this is my guess, I don't have data to back it up)

Also what exactly is the monetization strategy? Ads I guess? More catering towards creating corporate "synergy" with the Reddit community? Selling user data/content? So basically making the place suck considerably worse for users is what it looks like to me.

Ross_audio ,

Monetisation?

Licensing the site to AI when there’s finally a ruling they can’t just scrape the internet for training data while ignoring copyright.

Corkyskog ,

I was told reddit has already been scraped for AI and all sorts of stuff. There is very little new value to sell.

nicetriangle ,

Yeah that was kinda my understanding too. And regardless of my feelings on it, I think rulings are mostly gonna go in AI’s favor.

Ross_audio ,

Except AI models may end up having to start again with licences or public domain data.

They are currently breaking the law and delaying legal action as long as possible in the hopes they can repeat the trick with a new data set.

Tak ,

Corporations break the law all the time and typically it’s just an operational expense.

Ross_audio ,

Typically they aren’t fighting other corporations.

Tak ,

I don’t understand what you’re saying because I never said they were.

Ross_audio ,

My point is that corporations often see a fine as a cost of business because the fines are issued by a regulatory system that has no teeth.

If you’re in a lawsuit against another corporation they are going after damages in civil court and it’s likely to be a high enough fine to stop the behaviour.

besbin ,

Whatever already exists won’t be thrown away regardless of the ruling. It’s like throwing away all the gold already dug up just because it was dug up by slave labor. The law and legal actions are mostly just a moat around the pile of gold already dug up. Sure, AI companies will have to pay more for new data from other sources. However, that would be peanuts compared to how much they would have to pay starting from zero.

Ross_audio ,

If every time what already exists gets used there’s a risk of a massive fine or court case they’ll throw it away.

The game now is to delay the legal process long enough until they’ve built the replacement.

Then they can afford to throw the, essentially faulty, model away.

fine_sandy_bottom ,

It’s not at all clear that the current model does breach the law.

If it were, a court would have issued an injunction or whatever.

Ross_audio ,

It’s clear from the output that it breaks copyright.

We don’t have to look inside the black box to demand to see the input which caused that output.

To be clear a machine is not responsible for itself. This machine was trained to break copyright.

fine_sandy_bottom ,

Generally if someone is clearly in breach of copyright the rights holder will apply to a court to issue an injunction to order that company to cease their activities until a case can be resolved.

Given that has not happened, it seems that from a court’s perspective, it’s not a clear breach of copyright.

Ross_audio ,

The rights holder first considers the size of the payout vs. the cost of legal fees.

Just because they haven’t been sued directly for this doesn’t mean it isn’t infringement.

fine_sandy_bottom ,

Nonsense. If this is copyright infringement the payout will be many billions. They’ve had a year to think about it.

Ross_audio ,

The statute of limitations is much longer than a year. It’s usually around 5.

They can wait, see who’s made the money, then target them for a payout.

fine_sandy_bottom ,

A court wouldn’t look favourably on that.

Rights couldn’t have been very important if you just let it run.

Ross_audio ,

They really don’t care. It can take a lot of time to put a solid case together and you’re better off having a solid case than a quick trial.

diffuselight ,

No they’ll train on laundered model output. Like every llama.

The investment thesis that the data is valuable is bonkers. It’s not. Not only has it been exfiltrated and can be laundered in a dozen ways, Reddit also won’t be able to effectively assert copyright.

Look at Facebook. It’s full of reposted quora content now with AI images and AI laundered text.

Reddit is dead

JillyB ,

Federated alternatives are slowly building steam and people seem to have gotten pretty salty about corporate social media

I think you’re overselling the importance of this one. When I’ve talked to friends about federated alternatives, they really aren’t interested. Even if they hated Twitter/reddit and think they’ve gotten worse, they just don’t really care about a federated alternative. I’ve heard some interest in threads, so maybe we count that?

dameoutlaw ,

Yeah, people don’t really care about decentralisation nor federation. People want an easy experience where everyone is

Moira_Mayhem ,

If they really understood the phrase ‘too many cooks spoil the soup’, then they’d realize the advantage of smaller online communities.

Reddit was at its best when it had a low count but engaged userbase, and became actively worse as it grew.

I think this is because trolling and response isn’t a 1 to 1 ratio. All it takes is 1 toxic person to make an entire subforum rancid and takes the effort of several mods to mitigate it.

The more people you have, the more chance you will have these trolls organize, the more likely they will either overwhelm or infiltrate the mods.

nicetriangle ,

Yeah I tend to agree. I think all communities have a critical mass and past that point they go downhill.

I was just googling for the rat overpopulation experiment because I think it works as a great example of this and it turns out this whole concept has a term.

https://en.wikipedia.org/wiki/Behavioral_sink

Moira_Mayhem ,

I think a better metaphor is fermentation.

It happens naturally whenever the ingredients are brought together but in order to get a quality product you need ridiculous amounts of knowledge, process, and technology.

And even a tiny bit of the wrong bacteria can ruin an entire batch, but people will still drink it and go blind.

nicetriangle ,

Regular people didn't know or care wtf reddit was for quite a while either, and there absolutely is a building friction between people and corporate social media. We're in the early stages for now, but stuff like ActivityPub is not going away.

PotatoesFall ,

Even my most alternative, vegan, communist friends agree with me when I pitch the fediverse and then flock to capitalist social media like moths anyway. It’s disheartening.

PotatoesFall ,

They already make money with ads. Killing third party apps was part of this, now they can control exactly how you see ads. It’s the same as any other social media now, they recommend you content, which is exactly not the point of reddit.

nicetriangle ,

Reddit's been running ads for a while and has never turned a profit

Cratermaker ,

I’ve already left, but seeing them marching towards an IPO makes me even happier with my decision. I just fear that the mountains of helpful troubleshooting and advice on Reddit will be locked away forever soon, while the rest of the web falls to SEO and AI-generated nonsense text…

LittleBorat2 ,

Reddit was the only site Google could effectively search. RIP googling questions and adding “reddit”.

AnonStoleMyPants ,

Man, and it works great. It is waaaaay more common to find good answers to a question from a bunch of randoms on the Internet than trying to get an actual answer from a random website. Sometimes you find bs but you can usually quite quickly filter it out, and it gives a good basis from which to then continue to search on the topic.

Admetus ,

It was speedier and usually more effective than forums, which crept along at a snail’s pace.

Death_Equity ,

I hate that Reddit is so good for answering questions. The alternatives are usually AI written unhelpful trash.

Empathy ,

I’m trying out Kagi a little bit, and it has a federation search mode of some kind. I tried it for a search and it gave me results from Lemmy.

I don’t know yet how Kagi compares to Google in terms of results quality, I barely used it so far. It’s pretty expensive though.

AnonStoleMyPants ,

Oh yeah that’s true, I should try it as I use it as my main search engine nowadays.

pixxelkick ,

I 100% can see it easily selling for that much.

You want to know why it’s worth that much?

Petabytes of raw training data for LLMs. Arguably atm reddit is one of the better gold mines of LLM training data on the internet: a bazillion posts already formatted as post-response chains, which is the exact type of format an LLM wants to train on.

Can you imagine how valuable those servers loaded with posts are to a company like OpenAI, Google, or Microsoft?

5 billion is quite reasonable to harvest every reddit post that has ever been made ever and cut it off from your competitors.

ajsadauskas ,

@pixxelkick @ardi60 Well, if anyone wants to buy it for that purpose, then I just hope they remember to screen out the more NSFW parts of Reddit.

Otherwise, their bots are going to start giving some rather unfortunate responses to customer questions...

tryptaminev ,

I am looking forward to the hilarity of it for a while though.

“Cooking bot, i have found this cucumber i need to use before it gets bad. What can i do with it?”

“Shove it up your rectum”

Could lead to a lot of interesting lawsuits and let a lot of MBA bros look rather stupid.

pixxelkick ,

Most LLMs have tonnes of NSFW data in their training.

Typically, if this needs to be blocked, a secondary RAG or LoRA is run over the top to act as a filtering mechanism to catch, block, and regenerate explicit responses.

Furthermore, output allowed lexicon is a whole thing.

Unfiltered LLMs without these layers added on are actually quite explicit and very much capable of generating extremely NSFW output by default.
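
The catch, block, and regenerate idea can be sketched roughly like this in Python. To be clear about the assumptions: this is a plain keyword check standing in for whatever filtering layer a real product uses (it is not a RAG or a LoRA), and `BLOCKED_TERMS`, `is_allowed`, `generate_filtered` and the fake model are all hypothetical names invented for the example.

```python
from typing import Callable

# Hypothetical placeholder lexicon; a real deployment would use a trained classifier.
BLOCKED_TERMS = {"explicit", "nsfw"}


def is_allowed(text: str) -> bool:
    """Return True if none of the blocked terms appear as words in the text."""
    words = {w.strip(".,!?\"'").lower() for w in text.split()}
    return words.isdisjoint(BLOCKED_TERMS)


def generate_filtered(generate: Callable[[str], str], prompt: str,
                      max_attempts: int = 3) -> str:
    """Call the model, discard blocked outputs, and retry up to max_attempts times."""
    for _ in range(max_attempts):
        candidate = generate(prompt)
        if is_allowed(candidate):
            return candidate
    # Every attempt was blocked, so fall back to a refusal.
    return "Sorry, I can't help with that."


def make_fake_model() -> Callable[[str], str]:
    """Stand-in for an LLM: the first reply gets blocked, the second passes."""
    replies = iter(["something explicit", "Slice it into a salad."])
    return lambda prompt: next(replies)


print(generate_filtered(make_fake_model(), "What can I do with a cucumber?"))
# prints "Slice it into a salad."
```

The filter sits outside the model, so the underlying model stays exactly as explicit as its training data made it; only what reaches the user changes.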

MajorMajormajormajor ,

The worst part is that ai chatbots will start responding like redditers. I can’t wait for chatgpt to regale me with a story about his dad beating him with jumper cables, or jolly ranchers, or hell in a cell.

Thorry84 ,

And then everyone clapped

isthingoneventhis ,

And then asked “AITA?”

marx2k ,

And took an arrow to the knee

Stern ,

Thanks for the gold

marx2k ,

Nonstop “this”

DinosaurSr ,

This

duncesplayed ,

Has reddit not already been scraped? With all of that information exposed bare on the public Internet for decades, and apparently so valuable, I find it hard to believe that everybody’s just been sitting there twiddling their thumbs, saying “boy I sure hope they decide to sell us that data one day so that we don’t have to force an intern to scrape it for us”.

sheogorath ,

I asked Bard whether it could make a bastardization of a rogersimon10 post and it made this:


Ah, the 2012 Fiesta EcoBoost. A pocket rocket some called it, a ticking time bomb I preferred. Owned one myself, sunshine yellow with a dent shaped like a screaming Muppet (courtesy of a rogue shopping cart). Let me tell you, that car was less a vehicle and more a high-octane game of Russian roulette with pistons.

Every morning was a gamble. Would she purr to life like a hungover kitten, or unleash a symphony of coughs, bangs, and the ominous "Check Engine" light disco? One day, the radio spontaneously switched to polka, which, let's be honest, is only two steps away from a thermonuclear detonation in a Fiesta's internal combustion fiesta.

Speaking of fiestas, the fuel gauge was as reliable as a politician's promise. One minute it was brimming, the next it was flashing "Bone Dry, Mate," even if I'd just filled it. Learned to carry a jerrycan like a Prohibition-era bootlegger, never knowing when the engine's thirst would strike.

But oh, the things that little lemon could do! On a good day, that turbocharged engine was like a caffeinated hummingbird on steroids. Weaving through traffic, leaving Audis in the dust (until their turbos spooled up, of course). Handling? Razor sharp, unless you hit a pebble, then it was suddenly a drunken ballerina on roller skates.

Comfort? Ha! The seats were about as plush as a park bench, and the suspension was tuned for "Feel Every Pebble in the County." But hey, it kept you awake, which was essential considering the constant fear of spontaneous combustion.

In the end, I sold the Fiesta to a circus clown for a fistful of juggling pins and a lifetime supply of those tiny, uncomfortable hats. He seemed thrilled, probably because the polka radio was a bonus feature for him.

So, would I recommend a 2012 Fiesta EcoBoost? Only if you enjoy existential dread, questionable fuel efficiency, and the thrill of living life on the edge (of a breakdown). Otherwise, stick to something safer, like a heavily sedated hamster. You'll thank me later.

Now, if you'll excuse me, I have a date with a tow truck and a very suspicious mechanic who keeps asking about "jumper cables." Wish me luck.

P.S. Don't forget the jumper cables. Seriously. You'll thank me later.

TheRealKuni ,

That’s remarkable.

pixxelkick ,

Scraped data isn’t legal to resell, scraping isn’t even legal in the first place.

Just because you can scrape the data doesn’t mean it’s worth anything.

Companies like MS, Google, OpenAI and FB make money by selling the usage of their LLM services to other companies, who then use that service to make their own products.

If it came to light that MS/Google/OAI/FB were using illegal training data for their LLMs, it would get all those other companies hit in the crossfire.

So these companies have to do a shit tonne of diligence to assure their investors and clients that their LLMs are purely trained on legally obtained data and are safe to use.

And you know what is a super easy way to assure them of that?

If they literally own the original data themselves

duncesplayed ,

Scraping is legal

Have you been following any of the court battles involving LLMs lately?

The New York Times suing OpenAI. Getty Images suing Stability AI. Sarah Silverman and George R.R. Martin suing OpenAI.

All of those cases involve data that has been scraped. (In the latter two cases, the memoir/novels were scraped from excerpts and archives found online).

It’s too early to say with complete certainty that it’s all legal (the appeal processes haven’t all been finished yet), but at this point it looks like using scraped and copyrighted data in training LLMs is legal. Even if it’s going to turn out not to be legal, it’s very clear that nobody’s shying away from doing it, because we have the courts showing as a statement of fact that it’s been happening for years.

Everything you’ve written is just fantasy. We have a lot of reality which contradicts it. Every LLM company has been primarily relying upon scraping data (which we know to be completely legal) and has been incorporating copyrighted and scraped data in its data sets (which is still legally a grey area, but is happening anyway).

pixxelkick ,

NYT hasn’t actually won that case yet, so it’s pointless to bring up. OpenAI has publicly stated that NYT has heavily misrepresented their findings.

OpenAI’s value would plummet and crash if they gained a reputation for using illegal material to train their AI on, investors would drop them so fast.

This is just a simple fact. LLM providers’ reputation is heavily staked on the legality of their data.

So far the courts have ruled in these companies’ favor.

But it’s extremely likely illegally scraped data from reddit would not pass the sniff test and would devastate an offending company’s reputation.

If you don’t understand why, you have to do some brushing up on why these LLM services are worth so much and who is using them and for what. Once you understand that, it becomes extremely apparent why legally owning the entire history of every reddit post ever would be extremely valuable, and why a 5bil price tag is actually not that crazy.

LittleBorat2 ,

This data was out in the open for a decade and still is. People could train their llm without problems.

null ,

Exactly. Why is this being upvoted?

Admetus ,

They just started charging for API usage so isn’t it already not open?

Corkyskog ,

And then everyone started deleting accounts, comments and even rewriting and poisoning their comments. The data was way better before the API change.

pixxelkick ,

Do you actually think this has any impact? That’s silly.

Reddit’s servers have the original copy of every single post made, undoubtedly, and every time you edit your post, they store that copy too.

So not only has everyone “poisoned” their data ineffectively, they literally have created training data of “before” vs “after” poisoning to compare the two for training the LLM against poisoned data.

Whoever buys the right to that is going to have a pretty huge goldmine, and perhaps they will rent it out, or perhaps they’ll use it themselves.

pixxelkick ,

Not legally / free.

And yes, that very very much matters if you intend to actually sell the service to companies that themselves don’t want to get hit in the crossfire of potential lawsuits for building their products on top of stolen info.

So if you can own the data itself (via buying reddit), you now have an ENORMOUS quantity of prime training data that your investors and potential customers know is legally clean, because you literally own it.

SpicyLizards ,

Sell it to musk, finish the job

photonic_sorcerer ,

Can’t wait for him to rebrand it as Y

HeavyRaptor ,

Or 3

duncesplayed ,

Let’s not rule out Æ

SpicyLizards ,

Nah, sounds “like a password”

JoeKrogan ,

hunter2

0x2d ,

correcthorsebatterystaple

Admetus ,

Password hint: ‘p…’

LittleBorat2 ,

Can’t they blow Musk up by accident or send him on a rocket to Mars that orbits forever there (oops)? Something like the submarine last year but better.

SpicyLizards ,

Surely we now have the technology

jubilationtcornpone ,

$5 Billion for a chronically unprofitable, niche social media platform in an already crowded field. Yeah, Ok. You’d get a better return on your investment if you just burned the money for heat.

diffcalculus ,

“niche” website in a crowded field…that is also the 9th most visited website in the world, above TikTok.

…m.wikipedia.org/…/List_of_most-visited_websites

spez_ ,

What’s Reddit?

LittleBorat2 ,

I love how xvideos comes before x/shitter 😁

jubilationtcornpone ,

Reddit is niche when it comes to social media. It caters to a particular group of people and has its own style of both content and engagement, just like Facebook or TikTok have their own styles. I would argue reddit is, in some ways, more like an old school forum with a fresh coat of paint. It requires more effort from the user to engage with than some other social media platforms. The content can be a lot “heavier” and it’s not centered around people and/or personalities. To be clear, that’s something I liked about Reddit but I don’t think it really resonates as much with the average user.

Site visits are just one metric and, while it’s an interesting metric, it doesn’t say much without a lot more context. OK, so a lot of people end up visiting Reddit. Why? Is it intentional? Is it because every third Google result is a reddit post? If so, is it driving further engagement? If not, then that benchmark is worth little. If it is driving further engagement, then something else is wrong.

OfficerBribe ,

Is reddit really considered social media now? I always have described it as the biggest forum and site aggregator on internet.

I know at some point avatars and some crypto? BS came, but that did not change site fundamentals, reddit to me still feels like the same site I joined 10 years ago especially because I used same reddit client for majority of those 10 years. Only reason why I abandoned it is the closure of free API. That and the fact I dislike management, but if 3rd party clients would not have been nuked like that, I would have probably stayed.

Now I am using same client I used for reddit - Sync, so from UI perspective nothing has changed.

helenslunch ,

They haven’t been profitable in a decade. What’s their plan for monetization?

Tangentism ,
jamyang ,

Valuable training data for AI. Spez does not give a damn about it. He’ll gladly take the money and sell it off to someone like OpenAI, whose models can be trained on its data.

“I trained on millions of users’ data and now I ape them. AITA?”

capital ,

For ads? How else do they make money?

golden_zealot ,

Selling user data, selling top posts and comments to corporate marketing accounts, selling control of dialogue about any subject to sway public opinion, reddit gold, making the platform more ass.

capital ,

Good points.
