There have been multiple accounts created with the sole purpose of posting advertisement posts or replies containing unsolicited advertising.

Accounts which solely post advertisements, or persistently post them may be terminated.

theterrasque ,

It’s less the calculations and more about memory bandwidth. To generate a token you need to go through all the model data, and that’s usually many many gigabytes. So the time it takes to read through in memory is usually longer than the compute time. GPUs have gb’s of RAM that’s many times faster than the CPU’s ram, which is the main reason it’s faster for llm’s.

Most tpu’s don’t have much ram, and especially cheap ones.

  • All
  • Subscribed
  • Moderated
  • Favorites
  • [email protected]
  • random
  • lifeLocal
  • goranko
  • All magazines