Flaky (@Flaky@iusearchlinux.fyi)

FWIW, Common Crawl - a free, openly available dataset of crawled web pages - was used by OpenAI for GPT-3 and, via the Pile, by EleutherAI for GPT-NeoX (GPT-2 was trained on OpenAI’s own WebText scrape instead). Maybe for GPT-3.5/ChatGPT as well, but they’ve been hush-hush about that.
