Vandalism is always reverted on SO, even if done by the original author. No knowledge is lost. Suing OA for violating the CC-BY license might be possible, but I’d wager SO is not interested in suing them, and since they hold the rights, not much can be done by others.
And while it hurts now, it’s REALLY going to hurt when large swaths of useful answers that don’t exist anywhere else are gone and there’s nothing replacing them.
No one writes hundreds of pages of documentation for their stuff anymore. Without the collected knowledge learned from experience there, what do we have?
Unless we have source code to read, very little.
I’m still feeling the pain of Google search results sucking, combined with most of the large coding forums being gone and Reddit slowly going to garbage. Stack Overflow was the last bastion of collected knowledge of its type… and it’s not like 25 years ago, when we still had phonebook-sized manuals for almost all major software; agile has killed the concept of exhaustive, definitive documentation for a given version of something.
I used to sorta roll my eyes at people shouting about federating everything, but at this point I’m scared and agreeing with them.
Stack Overflow was great when it appeared. The info was spread out incredibly wide and there was a lot of really shitty info as well. One place where it could accumulate and be rated was extremely helpful.
But maybe it’s time to create a federated open source Stack Overflow.
I once managed to find a pretty good alternative, but then I forgot its name. It was a very chill community, unlike what Stack Overflow was recently with its toxicity (properly-formatted-question police, people getting offended over less popular languages, etc.).
I don’t know. It feels a bit like “When I quit my employer will realize how much they depended on me.” The realization tends to be on the other side.
But while SO may keep functioning fine it would be great if this caused other places to spring up as well. Reddit and X/Twitter are still there but I’m glad we have the fediverse.
Companies get hit hard by unplanned vacancies. It won’t take them down, but it can cost them buckets of money in expenses, lost revenue, or both. The thing is, the people who left will never know that, their coworkers will never see it; only people in finance and budgeting will know how to quantify the impact.
You can be killed with steel, which has a lot of other implications on what you do in order to avoid getting killed with steel.
Does steel fuck it all up?
Centralization is a shitty, backwards idea. But you have to be very conscious of yourself and your instincts, and neuter the part that tells you it isn’t, to understand that.
Distributivism minus Catholicism is just so good. I always return to it when I give up on trying to find future in some other political ideology.
This has nothing to do with centralization. AI companies are already scraping the web for everything useful. If you took the content from SO and split it into 1000 federated sites, it would still end up in an AI model. Decentralization would only help if we ever manage to hold the AI companies accountable for the en masse copyright violations they base their industry on.
Copyright is an artificial, government-granted monopoly.
Market mechanisms don’t work when faced with a monopoly, or work badly in situations distorted by the presence of one (which is closer to this case: Stack Overflow has a monopoly on reproducing each post on its website, but the same user could post the same answer elsewhere, creating an equivalent work).
In pretty much every situation where intellectual property is involved, you see the market failing miserably: just look at the current situation with streaming services, which would be completely different if there were no copyright and hence no possibility of exclusive distribution of any title (streaming services would then have to compete on quality of service).
The idea that the free market is something that works everywhere (or even in most cases) is politically driven magical thinking, not economics.
Market forces lead to the creation of large corporations that then shut down market forces and undermine fair markets. Once a few big corporations dominate they coordinate their behavior and prices and shut down any new players entering the market. Regulation can counter it to a point, but once the corporations are wealthy enough to dominate government regulation also fails. Right wingers hasten the process by opposing regulation, and have no good answer to how to prevent markets collapsing into monopolies or cartels. I’m not sure anyone has a good answer to that in a capitalist system.
This has everything to do with centralization, just not with the one small context for it which you picked.
With real decentralization in place market mechanisms work.
Monopoly situations combined with market mechanisms invariably result in centralization (“monopoly” comes from the Greek for “right of exclusive sale”), hence market mechanisms won’t “work” in the sense you mean in such a scenario, as I explained.
Your argument is circular because it’s like saying that it will work as long as it creates the conditions to make itself work (which is the same as saying “as long as it works”).
Decentralization and distribution should be enforced, yes.
By, for example, institutionalized resistance to anything like IP law, and to regulations and certifications that let the bigger fish cull those who can’t afford them, while at the same time maintaining regulations against obvious fraud.
It’s not a circular argument, you’re just not paying attention.
The friendliness of political systems to decentralization doesn’t correlate much with their alignment in terms of left/right or even authoritarian/libertarian. So in my opinion this should be a third dimension on that political compass everybody’s gotten tired of seeing. But then there are many other dimensions to add too, so it’s useless.
You realize that there have been multiple websites scraped, right? So decentralizing doesn’t solve this issue in particular. Especially when federated sites like Lemmy provide a view of the entire fediverse (more or less).
Just because something is available to view online does not mean you can do anything you want with it. Most content is automatically protected by copyright. You can use it in ways that would otherwise be illegal only if you are explicitly granted permission to do so.
Specifically, Stack Overflow licenses any content you contribute under CC BY-SA 4.0 (older content is covered by other licenses that I omit for simplicity). If you read the license you will note two restrictions: attribution and “share-alike”. So if you take someone’s answer, including the code snippets, and include it in something you make, even if you change it to an extent, you have to attribute it to the original source and you have to share it under the same license. You could theoretically mirror the entire SO site’s content, as long as you used the same licenses for all of it.
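For what it’s worth, honoring those two restrictions in your own code is not hard. A minimal sketch; the author name and answer URL here are hypothetical placeholders, not a real post:

```python
# Hypothetical example of attributing a snippet copied from Stack Overflow,
# as CC BY-SA requires: name the author, link the source, keep the license.
#
# Adapted from an answer by user "jdoe" (hypothetical):
# https://stackoverflow.com/a/0000000
# Licensed under CC BY-SA 4.0:
# https://creativecommons.org/licenses/by-sa/4.0/
def chunks(lst, n):
    """Split a list into successive n-sized chunks."""
    return [lst[i:i + n] for i in range(0, len(lst), n)]

print(chunks([1, 2, 3, 4, 5], 2))  # [[1, 2], [3, 4], [5]]
```

The share-alike part then applies to whatever you distribute the adapted snippet in.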
So far AI companies have simply scraped everything and argued that they don’t have to respect the original license. They argue that it is “fair use” because AI is “transformative use”. If you look at the historical usage of “transformative use” in copyright cases, their case is kind of bullshit, actually. But regardless of whether it will hold up in court (and whether it should hold up in court), the reality is that AI companies are going to use everybody’s content in ways they have not been given permission to.
So for now it doesn’t matter whether our content is centralized or federated. It doesn’t matter whether SO has a deal with OpenAI or not. SO content was almost certainly already used for ChatGPT. If you split it into hundreds of small sites on the fediverse, it would still be part of ChatGPT. As long as it’s easy to access, they will use it. Allegedly they also use torrents for input data, so even if it’s not publicly viewable, it’s not safe. If/when AI data sourcing is regulated, and the “transformative use” argument fails in court, and the fines are big enough for the regulation to actually work, then sure, the situation described in the OP will matter. But we’ll have to see if that ever happens. I’m not holding my breath, honestly.
The irony is that folks complain about stuff like Discord partly because it cannot be scraped by search engines but that would also protect it from being scraped by AI tools.
Hmmm, maybe the Catholic part isn’t the only part worth reviewing.
Also worth noting that the Conservative Party’s ‘Big Society’ schtick in 2010 was wrapped in the trappings of distributism.
Not that all this diminishes it entirely but it does seem to be an entry drug for exploitation by the right.
I gotta hold my hand up and state that I am not read up on it at all, so happy to be corrected. But my impression is that Pope Leo XIII’s conception was to reduce secular power so as to leave a void for the church to fill. And it’s the potential exploitation of that void that attracts the far right too.
In Europe, GDPR gives you the right to have all your data deleted. All you do is send in a request and SO has to remove everything of yours, not just anonymize it. There are some exceptions for legal reasons, e.g. where financial transactions are involved, but comments should not be exempt.
It is still just a “trust us” deal. They say they have deleted it, and all you can do is trust them. They could possibly get into legal troubles if it was shown they were lying, but that could be easily avoided as well.
GDPR is ok, but much of it is based on good actors doing what they should.
IANAL, but I thought removing non-PII mostly boiled down to risk, since GDPR has big teeth. With a lot of money on the table and a license attached to posts, they may feel it’s worth pursuing. They’ve probably been setting up protections for this for a while.
They are also retained by anyone who has archived them, like OpenAI or Google. Thus making their AIs more valuable.
To really pull up the ladder, they will have to protest the Internet Archive and Common Crawl, too. It’s just typical right-wing bullshit; acting on emotion and against their own interests.
Why?? Please make this make sense. Having AI to help with coding is ideal and the greatest immediate use case probably. The web is an open resource. Why die on this stupid hill instead of advocating for a privacy argument that actually matters?
Edit: Okay got it. Hinder significant human progress because a company I don’t like might make some more money from something I said in public, which has been a thing literally forever. You guys really lack a lot of life skills about how the world really works huh?
Hating on everything AI is trendy nowadays. Most of these people can’t give you any coherent explanation for why. They just adopt the attitude of the people around them, who also don’t know why.
I believe the general reasoning is something along the lines of not wanting bad corporations to profit from their content for free. So it’s just a matter of principle for the most part. Perhaps we need to wait for someone to train LLM on the freely available to everyone data on Lemmy and then we can interview it to see what’s up.
Mega corporations like Microsoft and Google are evil. Very easy explanation. Even if it were a good open source company scraping the data to train AI models, people should be free to delete the data they input. It’s pretty simple to understand.
Because none of the big companies listen to the privacy argument. Or any argument, really.
AI in itself is good, amazing, even.
I have no issue with open-source, ideally GPL- or similarly licensed AI models trained on Internet data.
But involuntarily participating in training closed-source corporate AIs… no, thanks. That shit should go to the hellhole it was born in, and we should do our best to destroy it, not advocate for it.
If you care about the future of AI, OpenAI should have long been on your enemy list. They expropriated an open model, they were hypocritical enough to keep “open” in the name, and then they essentially sold themselves to Microsoft. That’s not the AI future we should want.
Because being able to delete your data from social networks you no longer wish to participate in, or that have banned you, as long as they specifically haven’t paid you for your contributions, is a privacy argument that actually matters, regardless of and independent of AI.
In regards to AI, the problem is not with AI in general but with proprietary for-profit AI getting trained with open resources, even those with underlying license agreements that prevent that information being monetized.
We’re in a capitalist system and these are for-profit companies, right? What do you think their goal is? It isn’t to help you. It’s to increase profits. That will probably lead to massive amounts of jobs being replaced with AI, and we will get nothing for giving them the data to train on. It’s purely parasitic. You should not advocate for it.
If it’s open and not-for-profit, it can maybe do good, but there’s no way this will.
If they make it better that may increase profits temporarily, as they draw customers away from competitors. Once you don’t have any competitors then the only way to increase profits is to either decrease expenses or increase revenue. Increasing revenue is limited if you’re already sucking everything you can.
To us? No, it isn’t wrong. To them? Absolutely. You don’t become a billionaire by thinking you can have enough. You don’t dominate a market while thinking you don’t need more.
Meta and Google have done more for open source AI than anyone else. I think a lot of antis don’t really understand how computer science works, so they imagine it’s like collecting up physical iron and taking it into a secret room, never to be seen again.
The actual tools and math are what’s important. Research on best methods is complex and slow, but so far all these developments are being written up in papers anyone can learn from. If people on the left weren’t so performative and lazy, we could have our own AI too.
Humanity’s progress is spending cities’ worth of electricity and water to ask Copilot how to use a library and have it lie back to you in natural language? Please make this make sense.
Why do people roll coal? Why do people vandalize electric car chargers? Why do people tie ropes across bike lanes?
Because a changing world is scary and people lash out at new things.
The coal rollers think they’re fighting a valiant fight against evil corporations too. They invested their effort into being car guys, and it doesn’t feel fair that things are changing, so they want to hurt the people benefiting from the new tech.
The deeper I get into this platform the more I realize the guise of being ‘progressive, left, privacy-conscious, tech inclined’ is literally the opposite.
The enshittification is very real and is spreading constantly. Companies will leech more from their employees and users until things start to break down. Acceleration is the only way.
I mean, sure but in the context of individual websites I don’t see it being a big deal. There will be replacements, and relatively quickly. Accelerationism applied to major societal structures is a terrible idea though.
Except it’s not like a plane because we can stop using specific websites whenever we like, and build our own websites to whittle away at their hegemony.
Maybe we should start asking questions that iterate loops billions of times. Something semi-malicious that a person would recognize but an AI wouldn’t.
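A hypothetical sketch of what such a trap could look like: a human reading the thread spots the quadratic blow-up immediately, while a model regurgitating the pattern would not:

```python
# Looks like an innocent "count the pairs" helper, but it's O(n^2):
# fine for the toy list below, hopeless if pasted into production
# and fed an input with billions of elements.
def count_pairs(items):
    total = 0
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            total += 1  # one increment per unordered pair
    return total

print(count_pairs([1, 2, 3, 4]))  # 6 pairs
```

The obvious fix, `len(items) * (len(items) - 1) // 2`, is exactly what a human reviewer would point out, which is the asymmetry the idea relies on.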
Nah, the training data probably doesn’t quite work that way. The AI would be very unlikely to test code; it just regurgitates the most likely response based on its training sets. Instead, filling posts with random bits and pieces of unrelated code and responses might work better.