LibreFish ,

Yes, because 1:1 duplication of copyrighted works violates copyright, but summarizing those works and relaying the facts stated in them is perfectly legal (whether done by an AI or not).

unexpectedteapot ,

If you mean by “perfectly legal” a fair use claim, then could you please explain how a commercial for-profit company using the works, sometimes echoing verbatim results, is infringing on the copyrights in a fair use manner?

LibreFish ,

I do not mean a fair use claim. To quote the Copyright Office: “Copyright does not protect facts, ideas, systems, or methods of operation, although it may protect the way these things are expressed.” (source)

Facts and ideas cannot be copyrighted, so what I was specifically referring to is that if I or an AI read a paper about jellyfish being ocean creatures, then later talk about jellyfish being ocean creatures, there are no restrictions on that whatsoever as long as we don’t reproduce the paper word for word.

Now, most of the time AI summarizes things or collects facts, and since those themselves cannot be protected by copyright it’s perfectly legal. On the occasion when AI spits out copyrighted work, that’s a gray area, and liability, if any, will probably be decided in the courts.

rivermonster ,

Kind of a strawman, I’d like everything to be FOSS, and if we keep Capitalism (which we shouldn’t), it should be HEAVILY regulated not the laissez-faire corporatocracy / oligarchy we have now.

I don’t want any for-profit capitalists to have any control of AI. It should all be owned by the public and all productive gains from it taxed at 100%. But open source AI models, right on.

And team SciHub–FUCK YEAH!

Mango ,

What’s scihub?

sndrtj ,

A website where you can download paywalled scientific literature. Most scientific literature is paywalled by publishers, and costs a significant amount to read (like $30-50 per article if you don’t have a subscription).

Scihub basically just pirates it. And has been shut down several times. But as most scientific studies are already paid for with public money, scihub isn’t that unethical at all.

andros_rex ,

Lots of scientists will just send you their article if you email them. They don’t get the money when you pay to read it - often they pay to submit. Reviewing journal articles is a privilege and doesn’t get you paid. The prestige of a scientific article is from the number of times people have cited it. The only “harm” done is that the publisher doesn’t get to make 100% profit for doing nothing.

Journal publishing is mostly a way to extract money from universities. Elsevier and its ilk name whatever price they think a research university can afford.

Gargantu8 ,

Very true. Also, a new federal policy is now in place and requires any research funded even in part by federal money be open access. As a result we should see much more high quality research becoming open access (already has begun). Only downside is research labs like mine have to use more money to publish to these journals because open access costs more for the authors. Hopefully this system gets reformed during my lifetime.

But yes, please just email the authors! Works most of the time and I think it’s fun.

Tathas ,

Time to make OpenASci?

/rimshot

BleatingZombie ,

More people need to think like you. Why isn’t “Total War: Warhammer” just called “Total Warhammer”? These are the questions that keep me up at night

Tum ,

I agree with you, but also Total War is the trademark brand and they’re also gonna use it.

Carighan ,

Total War: Hammer!

thejodie ,

Stop, Hammer time!

“Go with the flow”, it is said

Jknaraa ,

And people wonder why there’s so much push back against everything corps/gov does these days. They do not act in a manner which encourages trust.

Tillman ,

Weird, why would OpenAI be illegal? Bizarre comp.

Poem_for_your_sprog ,

They steal data from everything including paywalled sources and proprietary data.

UnderpantsWeevil ,

Consider who sits on OpenAI’s board and owns all their equity.

SciHub’s big mistake was to fail to get someone like Sundar Pichai or Jamie Iannone with a billion-dollar stake in the company.

reverendsteveii ,

this is because the technocrats are allowed to steal from you, but when you steal from them what they’ve stolen from actual researchers that’s a problem

blazeknave ,

There are no technocrats. Just oligarchs who are titans of newer industries. Same as the old boss. Don’t give them more credit than that. It’s evil capitalism. Lump them with bankers, not UX designers imho

mamotromico ,

In what context would a UI/UX designer be considered a technocrat? I’m very confused

blazeknave ,

You’re not confused, you’re getting the point. Musk has more in common with Jamie Dimon than with the tech workers with whom he’s lumped by industry.

It’s not a tech people/company problem. They’re just like accountants: they don’t own the enterprise.

CrayonRosary ,

Lemmy users: Copyright law is broken and stupid.

Also Lemmy users: A.I. violates copyright law!

creation7758 ,

Where’s the contradiction though

reverendsteveii ,

yes. there are myriad ways that copyright law is broken and stupid, but protecting the creations of independent artists isn’t one of them

take this bullshit back to reddit

Liz ,

I mean, consistency is better than inconsistency, even if we don’t agree with the rules.

UnderpantsWeevil ,

A.I. doesn’t violate copyright laws. It is the data-mining done to train A.I. and the regurgitation of said data in the responses that ultimately violate these laws. A model trained on privately owned, properly licensed, or exclusively public works wouldn’t be a problem.

Even then, I would argue that lack of attribution is a bigger problem than merely violating copyright. A big part of the LLM mystique is in how it can spit out a few lines of Shakespeare without accreditation and convince its users that it’s some kind of master poet.

Copyright law is stupid and broken. But plagiarism is a problem in its own right, as it seeks to effectively sell people their own creative commons at an absurd markup.

trafficnab , (edited )

A model trained on privately owned, properly licensed, or exclusively public works wouldn’t be a problem.

This is how we end up with only corpo owned AIs being allowed to exist imo, places like stock photo sites are the only ones with large enough repositories of images to train AI that they have all the legal rights to

The way I see it, either generative AI is legal, free for everyone to run locally, and the created works are public domain, OR, everyone pays $20/mo to massive faceless corpos for the rest of their lives to have the privilege of access to it because they’re the only ones who own all (or have enough money to license) the IP needed to train them

UnderpantsWeevil ,

This is how we end up with only corpo owned AIs being allowed to exist imo

It’s how you end up with sixteen different streaming services that only vend a sliver of the total available content, sure. But the underlying technology of AI grows independent of what it’s trained on.

The way I see it, either generative AI is legal, free for everyone to run locally, and the created works are public domain, OR, everyone pays $20/mo to massive faceless corpos for the rest of their lives to have the privilege of access to it

There are other alternatives. These sites can be restricted to data within the public domain. And we can increase our investment in public media. NYT articles being digested and regurgitated as ChatGPT info-vomit isn’t a problem if the NYT is a publicly owned and operated enterprise. Then it’s not struggling to profit off journalism, but treating this information as a loss-leading public service open to all, with ChatGPT simply operating as a tool to store, process, and present the data.

Similarly, if you limit generative AI to the old Mickey Mouse and Winnie-the-Pooh films from the 1930s, you leave plenty of room for original artists to create new works without fear that their livelihoods get chewed up and fed back into the system. If you invest in public art exhibitions then these artists can get paid to pursue their craft, the art becomes public domain immediately, and digital tools that want to riff on the original are free to do so without undermining the artists themselves.

Maggoty , (edited )

Oh OpenAI is just as illegal as SciHub. More so because they’re making money off of stolen IP. It’s just that the Oligarchs get to pick and choose. So of course they choose the arrangement that gives them more control over knowledge.

Lemminary ,

They’re not serving you the exact content they scraped, and that makes all the difference.

localhost443 ,

Well if you believe that you should look at the Times lawsuit.

Word for word on hundreds/thousands of pages of stolen content, it’s damning

Lemminary ,

Why do you assume that I haven’t? The case hasn’t been resolved and it’s not clear how The NY Times did what they claim, which may as well be manipulation. It’s a fair rebuttal by OpenAI. The Times hasn’t provided the steps they used to achieve that.

So unless that’s cleared up, it’s not damning in the slightest. Not yet, anyway. And that still doesn’t invalidate my statement above, because it’s still under very specific circumstances when that happens.

Emy ,

Also, intention is pretty important when determining guilt for many crimes. OpenAI doesn’t intentionally spit back an author’s exact words; their intention is to summarize and create unique content.

pm_me_your_titties ,

Ah, yes. The defense of “I didn’t mean to do it.” Always a classic.

whofearsthenight ,

I mean, I’m not sure why this conversation even needs to get this far. If I write an article about the history of Disney movies, and make it very clear the way I got all of those movies was to pirate them, this conversation is over pretty quick. OpenAI and most of the LLMs aren’t doing anything different. The Times isn’t Wikipedia, most of their stuff is behind a paywall with pretty clear terms of service and nothing entitles OpenAI to that content. OpenAI’s argument is “well, we’re pirating everything so it’s okay.” The output honestly seems irrelevant to me, they never should have had the content to begin with.

Lemminary ,

That’s not the claim that they’re making. They’re arguing that OpenAI retains their work they made publicly available, which OpenAI claims is fair use because it’s wholly transformative in the form of nodes, weights and biases, and that they don’t store those articles in a database for reuse. But their other argument is that they created a system that threatens their business which is just ludicrous.

Lemminary ,

No, the real defense is “that’s not how LLMs work” but you are all hinging on the wrong idea. If you think that an LLM is capable of doing what you claim, I’d love to hear the mechanism in detail and the steps to replicate it.

UNWILLING_PARTICIPANT ,

So it’s content laundering

Lemminary ,

What a colorful mischaracterization. It sounds clever at face value but it’s really naive. If anything about this is deceptive, it’s the lengths that people go to to slander what they dislike.

jacksilver ,

Actually content laundering is the best term I’ve heard to describe the process. Just like money laundering, you no longer know the source and know it’s technically legal to use and distribute.

I mean, if the copyrighted content wasn’t so critical, they would train models without it. They’re essentially derivative works, but no one wants to acknowledge it because it would either require changing our copyright laws or make this potentially lucrative and important work illegal.

Lemminary ,

Content laundering is not a good way to describe it because it’s misleading as it oversimplifies and mischaracterizes what a language model actually does. It’s a fundamental misunderstanding of how it works. Training language models is typically a transparent and well-documented process as described by the mountains of research over the past decades. The real value comes from the weights of the nodes in the neural network and not the source that it spits out in its entirety when it was trained. The source material is evaluated and wholly transformed into new data in the form of nodes and weights. The original content does not exist as it was within the network because there’s no way to encode it that way. It’s a statistical system that compounds information.

And while LLMs do have the capacity to create derivative works in other ways, it’s not all that they do, or what they always do. It’s only one of the many functions that it has. What you say would probably be true if it was only trained on a single source, but that’s not even feasible. But when you train it on millions of sources, what remains are the overall patterns of language within those works. It’s much more sophisticated and flexible than what you describe.

So no, if it was cut and dry there would be grounds for a legitimate lawsuit. The problem is that people are arguing points that do not apply but sound reasonable when they haven’t seen a neural network work under the hood. If anything, new laws need to be created to address what LLMs do if you’re so concerned about proper compensation.

jacksilver ,

I am familiar with how LLMs work and are trained. I’ve been using transformers for years.

The core question I’d ask is, if the copyrighted material isn’t essential to the model, why don’t they just train the models without that data? If it is core to the model, then can you really say they aren’t derivative of that content?

I’m not saying that the models don’t do something more, just that the more is built upon copyrighted material. In any other commercial situation, you’d have to license/get approval for the underlying content if you were packaging it up. When sampling music, for example, the output will differ greatly from the original song, but because you are building off someone else’s work you must compensate them.

It’s why content laundering is a great term. The models intermix so much data that it’s hard to know if the content originated from copyrighted materials. Just like how money laundering is trying to make it difficult to determine if the money comes from illicit sources.

Jilanico ,

I feel most people critical of AI don’t know how a neural network works…

Lemminary ,

That is exactly what’s going on here. Or they hate it enough that they don’t mind making stuff up or mischaracterizing what it does. Seems to be a common thread on the Fediverse. It’s not the first time this week I’ve seen it.

Cethin ,

It’s great how most of us are taught that just changing the order of words is still plagiarism. Yet these models frequently end up using the exact same words as other things, and people still argue it somehow is intelligent and somehow not plagiarism.

Lemminary ,

“Changing the order of words” is what it does? That’s news to me. And do you have examples of it “using the exact same words as other things” without prompt manipulation?

asret ,

Why does the prompting matter? If I “prompt” a band to play copyrighted music does that mean they get a free pass?

Lemminary ,

That’s not a very good analogy because the band would be reproducing an entire work of art which an LLM does not and cannot. And by prompt manipulation I mean purposely making it seem like the LLM is doing something it wouldn’t do on its own. The operative word is seem, which is what I meant by manipulation. The prompting here is irrelevant, but how it’s done is. So unless The Times releases the steps they used to get ChatGPT to output what it did, you can’t really claim that that’s what it does.

In a blog post, OpenAI said the Times “is not telling the full story.” It took particular issue with claims that its ChatGPT AI tool reproduced Times stories verbatim, arguing that the Times had manipulated prompts to include regurgitated excerpts of articles. “Even when using such prompts, our models don’t typically behave the way The New York Times insinuates, which suggests they either instructed the model to regurgitate or cherry-picked their examples from many attempts,” OpenAI said.

stewsters ,

If you passed them a sheet of music I’d say that’s on you, it would be your responsibility to not sell recordings of them playing it.

Just like if I typed the first chapter of Harry Potter into Word it is not Microsoft’s intent to breach copyright, it would have been my intent to make it do it. It would be my responsibility not to sell that first chapter, and they should come after me if I did, even though MS is the corporation that supplied the tools.

afraid_of_zombies ,

What do you expect when people support 90 year copyrights after death?

LWD , (edited )

deleted_by_author

    Gutless2615 ,

    That’s a pretty strong accusation. You seem to like to wade through people’s post history but to my cursory glance nothing would indicate this poster is a troll.

    You understand AI posts frequently surface on this platform and people will engage with those posts even if they disagree with you?

    LWD , (edited )

    deleted_by_author

    Gutless2615 , (edited )

    Yeah you keep spamming that screen shot. Idk I’m not seeing it. I read the thread you’re posting and it seems like you’re just digging in and insisting that someone that disagrees with you must be a troll.

    For what it’s worth, you made the same accusation against me yesterday and after I think I pretty effectively (and unnecessarily I might add) defended myself you deleted those posts. Making spurious accusations like that (and, as I read it, this) is also trollish behavior that doesn’t further any discussion. I’ve looked at the thread you’re posting. You come out flying with accusations based on extremely flimsy evidence. I think OP’s responses seemed entirely warranted.

    LWD , (edited )

    deleted_by_author

    Gutless2615 ,

    No, see, it actually isn’t self evident. After being accused of being disingenuous because he only talked about open source in the context of AI — again almost the verbatim ridiculous accusation you lobbed at me before cowardly deleting it - he asked for a citation relevant to the issue and someone sent a CNN article about Duolingo laying off staff. That isn’t the gotcha you think it is. It doesn’t “destroy my reputation” lmao to point out that you are, in fact, acting like a troll. This is a pattern of yours. Be better.

    LWD , (edited )

    deleted_by_author

    Gutless2615 ,

    My client? What are you on about? And no, I didn’t delete an insult, I realized I responded to the wrong post, to yet another person you’re accusing of conspiratorially disagreeing with you. Thanks for giving me a reason to repost it: “To be clear, then, you’re back to claiming I’m, what, an astroturf plant? It seems like I wasted my time engaging with you after all. Continue trolling.” Troll.

    hglman ,

    They are also right.

    uriel238 ,

    The IP system, which goes to great lengths to block things like open-access scientific publications, is borked borked borked borked borked.

    If OpenAI and other generative AI projects are the means by which we finally break it so we can have culture and a public domain again, well, we had to nail Capone with tax evasion.

    Yes, industrialists want to use AI [exactly the way they want to use every other idea – plausible or not] to automate more of their industries so they can pay fewer people less money for more productivity. And this is a problem in which generative AI figures centrally, but it’s not really all that new, and eventually we’re going to have to force our society to recognize that it works for the public and not money. I don’t think AI is going to break the system and lead us to communist revolution ( The owning class will tremble…! ) But eventually it will be 1789 all over again. Or we’ll crush the fash and realize the only way we can get the fash to not come back is by restoring and extending FDR’s New Deal.

    I am skeptical the latter can happen without piles of elite heads and rivers of politician blood.

    JoeKrogan ,

    That’s actually not a bad idea: train a model with all the data in scihub and then release the model to the public

    worldsayshi ,

    Actually a good point.

    Chocrates ,

    That would likely be explicitly illegal if the NYT case succeeds (and it isn’t fraud as OpenAI alleges)

    Liz ,

    It would still be useless, the thing would just produce bullshit papers.

    Amir ,

    This is just the most inefficient zip file ever created

    Maggoty ,

    We need to ban the publishing business from academic stuff. Have the universities host a site that’s free access. They can also better run the peer review system, and the journals would no longer control what research sees the light of day even behind a paywall.

    Liz ,

    How would you publish if you’re not a part of a major research institution? Los Alamos National Lab could host its own papers just fine, but what about small-time labs? I know of at least one person who doesn’t even officially work in science but publishes original research they do in their free time.

    The journal system still provides a service, even if they over-charge for access. The peer review system has value. Imagine if there was zero barrier to publish. As a reader, you’d have to wade through piles of trash to find decent science.

    Where would you find it all? Currently we use journal aggregators, whose service also has value and costs money. Are you really going to go to every university’s website looking for research relevant to your area? We could do that again, but with everyone responsible for publishing their own work, well, who gets indexed with the aggregators?

    Maggoty ,

    You get published with a university instead of a for profit publishing system. And universities would get a good or bad reputation for their peer review, just like journals. The aggregator could easily be run by a coalition of universities with government grants to make the maintenance and upkeep free to the users and universities.

    We do not have to lock research behind paywalls.

    CallumWells ,

    Like doaj.org

    cecinestpasunbot ,

    The problem isn’t just publishing though, it’s academia as well. Scientists are incentivized to publish in “prestigious” closed access journals such as Nature. They are led to believe it’s better for their career than publishing in open access journals such as PLOS One. As such, groundbreaking papers often get paywalled. Universities then feel obligated to pay outrageous subscription fees to access them.

    Maggoty ,

    Yup. And this would change that, giving universities more influence.

    FlyingSquid ,

    Yeah, but did SciHub pay Nigerians a pittance to look at and read about child rape? Because- wait, I have no idea what I’m even arguing. Fuck OpenAI though.

    greywolf0x1 ,

    OpenAI did that subhuman training of ChatGPT in Kenya, not Nigeria. And since the Kenyan govt is a western lapdog these days, nothing would ever come out of that.

    FlyingSquid ,

    Oh, well that makes it okay then. My mistake.

    hottari ,

    This is different. AI as a transformative tech is going to usher the US economy into the next boom of prosperity. The AI revolution will change the world and allow people to decide if they want to work for money or not (read UBI). In case you haven’t caught on, am being sarcastic.

    All this despite ChatGPT being a total complete joke.

    TurtleJoe ,

    This was a case where you needed the sarcasm tag. Up to then, it was a totally “reasonable” comment from an AI bro.

    BTW, plug “crypto” in to your comment for AI, and it’s a totally normal statement from 2020/21. It’s such a similar VC grift.

    douglasg14b ,

    Honestly couldn’t tell if you were being sarcastic or not, because of Poe’s law, until I saw your note.

    If all the wealth created by these sorts of things didn’t funnel up to the 0.01% then yeah. It could usher in economic changes that help bring about greater prosperity in the same way mechanical automation should have.

    Unfortunately it’s just going to be another vector for more wealth to be removed from your average American and transferred to a corporation

    SparrowRanjitScaur ,

    Can you elaborate on the specific ways that chatgpt is a joke?

    Wiz ,

    Ask Bing!

    yuki2501 ,

    youtu.be/ro130m-f_yk

    Adam explains it. Enjoy.

    SparrowRanjitScaur ,

    Ah yes, of course. I remember this video. Not all of the specific points, but I do remember Adam Conover really chewing into large language models. Interestingly, that same Adam Conover must have believed AI isn’t actually that useless seeing as he became a leading member of the 2023 Hollywood writers strike, in which AI was a central focus:

    Writers also wanted artificial intelligence, such as ChatGPT, to be used only as a tool that can help with research or facilitate script ideas and not as a tool to replace them.

    en.wikipedia.org/…/2023_Writers_Guild_of_America_…

    That said, I’m not going to rewatch a 25 minute video for a discussion on lemmy. Any specific points you want to make against chat gpt?

    wikibot Bot ,

    Here’s the summary for the wikipedia article you mentioned in your comment:

    From May 2 to September 27, 2023, the Writers Guild of America (WGA)—representing 11,500 screenwriters—went on strike over a labor dispute with the Alliance of Motion Picture and Television Producers (AMPTP). With a duration of 148 days, the strike is tied with the 1960 strike as the second longest labor stoppage that the WGA has performed, only behind the 1988 strike (153 days). Alongside the 2023 SAG-AFTRA strike, which continued until November, it was part of a series of broader Hollywood labor disputes. Both strikes contributed to the biggest interruption to the American film and television industries since the COVID-19 pandemic. The lack of ongoing film and television productions resulted in some studios having to close doors or reduce staff. The strike also jeopardized long-term contracts created during the media streaming boom: big studios could terminate production deals with writers through force majeure clauses after 90 days, saving them millions of dollars. In addition, numerous other areas within the global entertainment ecosystem were impacted by the strike action, including the VFX industry and prop making studios. Following a tentative agreement, union leadership voted to end the strike on September 27, 2023. On October 9, the WGA membership officially ratified the contract with 99% of WGA members voting in favor of it. Its combined impact with the 2023 SAG-AFTRA strike resulted in the loss of 45,000 jobs, and “an estimated $6.5 billion” loss to the economy of Southern California.


    joe_cool ,

    So, I feel taking an .epub and putting it in a .zip is pretty transformative.

    Also you can make ChatGPT (or Copilot) print out quotes with a bit of effort, now that it has Internet.

    UnderpantsWeevil ,

    In case you haven’t caught on, am being sarcastic.

    It sounds like a completely sincere Marc Andreessen post to me.

    KingThrillgore ,

    man this cyberpunk present fucking sucks

    danielbln ,

    Cyberpunk would always suck, it’s dystopia. Always has been.

    skulblaka ,

    Yeah but we got all the dys without any of the topia. I was promised high quality prosthetics, neon blinkenlights, and the right to bear arms. We’ve got like 15% of the appropriate level of any of those.

    adrian783 ,

    no, you get the dys, rich people get the topia. you don’t think you’re the protagonist do you?

    BURN ,

    Cyberpunk 2077 had a whole giant plot point that the old net was overtaken by rogue AIs and the AI wars were a thing.

    I’m not sure they’re that far off base
