I despise this use of mod power in response to a protest. It’s our content to be sabotaged if we want - if Stack Overlords disagree then to hell with them.
I’ll add Stack Overflow to my personal ban list, just below Reddit.
Once submitted to Stack Overflow/Reddit/literally every platform, it’s no longer your content. It sucks, but you’ve implicitly agreed to it when creating your account.
While true, it’s stupid that things are that way. They shouldn’t be able to hide behind the idea that “we’re not responsible for what our users publish, we’re more like a public forum” while also having total ownership over that content.
you’ve implicitly agreed to it when creating your account
Many people would agree with that, probably most laws do. However, I doubt many users have actually bothered to read the unnecessarily long document, fewer have understood the legalese, and the terms have likely already been changed (“pray I don’t alter it any further”). That’s a low and shady bar of consent. It indeed sucks and I think people should leave those platforms, but I’m also open to laws that would invalidate that part of the EULA.
It’s true that it’s mostly a symbolic act, but the rebellion matters, especially from old accounts. It’s also a nice way to mark the time after which I never participated in SO again. After my ban expires, I’ll deface my questions again. And again. Until they permaban me.
There’s also the possibility of adding to the wonderful irony of making the AI more useful than the original by having content that’s no longer accessible through the original. It doesn’t get more enshittified than that, even if Prashanth Chandrasekar is too out of touch to ever regret his decision.
I think you’re 100% correct in assuming they’ve already fed it data scraped from SO. I’ve previously gotten code samples from ChatGPT that were clearly from SO, down to the comments in the code. I even reverse searched some of the code and found the question it was from.
They seem to only be watching the questions right now. You’re automatically prevented from deleting an accepted answer, but if you answered your own question (maybe because SO was useless for certain niche questions a decade ago so you kept digging and found your own solution), you can unaccept your answer first and then delete it.
I got a 30-day ban for “defacing” a few of my 10+ year old questions after moderators promptly reverted the edits. But they seem to have missed where I unaccepted and deleted my answers, even as they hang out in an undeletable state (showing up red for me and hidden for others).
And comments, which are a key part of properly understanding a lot of almost-correct answers, don’t seem to be afforded revision history or to have deletes noticed by moderators.
So it seems like you can still delete a bunch of your content, just not the questions. Do with that what you will.
Messages that people post on Stack Exchange sites are literally licensed CC-BY-SA, the whole point of which is to enable them to be shared and used by anyone for any purpose. One of the purposes of such a license is to make sure knowledge is preserved by allowing everyone to make and share copies.
It does help to know what those funny letters mean. Now we wait for regulators to catch up…
/tangent
If anything, we’re a very long way from anything close to intelligent. OpenAI (and subsequently MS, being publicly traded) sold investors on the pretense that LLMs are close to being “AGI”, and now more and more data is necessary to achieve that.
If you know the internet, you know there’s a lot of garbage. I for one can’t wait for garbage-in garbage-out to start taking its toll.
Also, I’m surprised how well open source models have shaped up; they’re certainly worth a look. I occasionally use a local model for “brainstorming” in the loosest terms, as I generally know what I’m expecting, but it’s sometimes helpful to read tasks laid out. There’s also comfort in that nothing even needs to leave my network, and even in a pinch I got some answers when my network was offline.
It gives a little hope that advancements in open source have been so great, while corps get to blatantly violate copyright while wielding it so heavily themselves.
That license would require ChatGPT to provide attribution every time it used training data from anyone there, and would also require every output using that training data to be placed under the same license. This would actually legally prevent anything ChatGPT created even in part using this training data from being closed source. Since they obviously aren’t planning on doing that, this is massively shitting on the concept of licensing.
CC attribution doesn’t necessarily require you to have the credits immediately with the content, but it would result in one of the world’s longest web pages, as it would need to have the name of the poster and a link to every single comment they used as training data, and Stack Overflow has roughly 60 million questions and answers combined.
appropriate credit — If supplied, you must provide the name of the creator and attribution parties, a copyright notice, a license notice, a disclaimer notice, and a link to the material. CC licenses prior to Version 4.0 also require you to provide the title of the material if supplied, and may have other slight differences.
Maybe that could be just a link to the user page, but otherwise I would see it as needing to link to each message or comment they used.
Maybe, but I don’t think that is well tested legally yet. For instance, I’ve learned things from there, but when I share some knowledge I don’t attribute it to all the underlying sources of my knowledge. If, on the other hand, I shared a quote or copypasta from there, I’d be compelled to do so, I suppose.
I’m just not sure how neural networks will be treated in this regard. I assume they’ll conveniently claim that they can’t tie answers directly to underpinning training data.
Ethically and logically it seems like output based on training data is clearly derivative work. Legally I suspect AI will continue to be the new powerful tool that enables corporations to shit on and exploit the works of countless people.
The problem is that the legal system, and thus IP law enforcement, is very biased towards very large corporations. Until that changes, corporations will continue exploiting, as they already were.
They are not. A derivative would be a translation or a theater play or, nowadays, a game or movie. Even stuff set in the same universe.
Expanding the meaning of “derivative” so massively would mean that pretty much any piece of code ever written is a derivative of technical documentation and even textbooks.
So far, judges simply throw out these theories, without even debating them in court. Society would have to move a lot further to the right, still, before these ideas become realistic.
Stack Overflow was great when it appeared. The info was spread out incredibly wide and there was a lot of really shitty info as well. One place where it could accumulate and be rated was extremely helpful.
But maybe it’s time to create a federated open source Stack Overflow.
I once managed to find a pretty good alternative, but then I forgot its name. It was a very chill community, unlike what Stack Overflow has become recently with its toxicity (properly formatted question police, people being offended over less popular languages, etc.).
Well, it is important to comply with the terms of service established by the website. It is highly recommended to familiarize oneself with the legally binding documents of the platform, including the Terms of Service (Section 2.1), User Agreement (Section 4.2), and Community Guidelines (Section 3.1), which explicitly outline the obligations and restrictions imposed upon users. By refraining from engaging in activities explicitly prohibited within these sections, you will be better positioned to maintain compliance with the platform’s rules and regulations and not receive email bans in the future.
Tough to say. I honestly don’t know. The user name is the classic word_wordNumber that bots use. The comments are long though. But its comments are spaced far apart timewise.
Comments are clearly ChatGPT. I know because I did it once to troll some sub too. I instantly recognize the pirate ‘swashbuckling’ comment in their profile history; it’s what you get when you type ‘write a funny comment like a Redditor’.
The account reads like they’re pasting AI-generated responses to everything. Maybe it’s someone’s experiment. The prompt must include “You are a self-righteous asshole.”
I will answer some questions with my old account using gpt 4 to poison the data.
If you want to poison SO a little while at the same time providing valid answers that help users, use the outlook.com email domain for new accounts. It seems to not have anti-throwaway countermeasures while being accepted by SO. And it seems fitting to bash the corporate with the corporate.
Maybe a better act of rebellion would be to scrape the data on Stack, self-host it, and move to an open source platform. Easy for me to say though, when I only ever coded Hello World.
Anyone care to explain why people would care that they posted to a public forum that they don’t own, with content that is now further being shared for public benefit?
The argument that it’s your content becomes false as soon as you share it with the world.
It’s not shared for public benefit, though. OpenAI, despite the Open in their name, charges for access to their models. You either pay with money or (meta)data, depending on the model.
Legally, sure. You signed away your rights to your answers when you joined the forum. Morally, though?
People are pissed that SO, which was actively encouraging mods to use AI detection software to prevent any LLM usage in the posted questions and answers, is now selling the publicly accessible data, made by its users for free, to a closed-source for-profit entity that refuses to open itself up.
I can only really speak to reddit, but I think this applies to all of the user generated content websites. The original premise, that everyone agreed to, was the site provides a space and some tools and users provide content to fill it. As information gets added, it becomes a valuable resource for everyone. Ads and other revenue streams become a necessary evil in all this, but overall directly support the core use case.
Now that content is being packaged into large language models to be either put behind a paywall or packed into other non-freely available services. Since they no longer seem interested in supporting the model we all agreed on, I see no reason to continue adding value and since they provided tools to remove content I may as well use them.
But from the very beginning years ago, it was understood that when you post on these types of sites, the data is not yours, or at least you give them license to use it how they see fit. So for years people accepted that, but are now whining because they aren’t getting paid for something they gave away.
This is legal vs rude. It certainly is legal, and it was in the terms of service, for them to use the data in any way they see fit. But it’s also rude to bait and switch from being a message board to being an AI data source company. Users were led to believe they were entering into an agreement with one type of company and are now in an agreement with a totally different one.
You can smugly tell people they shouldn’t have made that decision 15 years ago when they started, but a little empathy is also cool.
Additionally: When you owe your entire existence and value to user goodwill it might not be a great idea to be rude to them.
No, a mega corp can’t appropriate something you posted in public for money and then prevent you from deleting or modifying the very things you posted.
It should stay for creative works but that’s it. It should protect people who actually write books, compose music, make art, and sing. It shouldn’t be held by corporations forever by leeching off their workers.
Creative works of individuals specifically… Corporations should explicitly be deemed not people and not possessing of the same rights as people, and the fact that this needs to be said just goes to show how far down the shit hole we’ve fallen.
Corporations should be outlawed from owning houses and land as well. Maybe they can own the building, but they must be forced to rent the land from Us.