And what about the authors whose works were ingested without compensation? What should we do for them? I don’t think that these commercial AI models should get to infringe on their copyrights for nothing. If I pay for a ChatGPT subscription and ask it to tell me about the war in the Middle East, and it basically regurgitates and plagiarizes information it learned from a journalist, then ChatGPT has essentially stolen the copyrighted work from that journalist, along with the revenue that my click would have earned them.
I don’t see a problem with using publicly posted copyrighted data to train local language models for non-commercial use, but I don’t think it’s fair to allow copyright infringement for commercial use.