I guess maybe people were shocked it was really “all of Bibliotik” because they couldn’t believe someone could actually keep a decent enough share ratio on that fucking site to avoid getting kicked off, especially while downloading the whole corpus. /s (I don’t know this from personal experience or anything.)
In all seriousness, however, it’s been well known for a while now that these models were being trained on copyrighted books, and the companies trying to hide their faces over it are a joke.
It’s just like always: copyright is used to punish regular ass people, but when corporations trash copyright, it’s all “whoopsie doodles, can’t you just give us a cost-of-doing-business fine and let us continue raping the public consciousness for a quick buck?” Corporations steal copyrighted material all the time, but regular ass people don’t have the money to fight it. Hiding behind Fair Use while they are using that material to make a profit isn’t just a joke but a travesty and the ultimate in twisting language to corporate ends.
They may have bitten off more than they can chew here, though, possibly paving the way for a class-action lawsuit from writers and publishers.