linearchaos (@linearchaos@lemmy.world)

I tried to ingest a four-terabyte EPUB library once. Even getting the data ingested with the author and title in the right spot was almost impossible. If the duplicates were exact copies it would be a different story, but they are often misspellings or alternate spellings, so they are only slightly off from each other.

Realistically the best thing you can do is get an output of file name, title, author and hand dedupe, but even then you’re going to have to be careful about quality and language and all kinds of other strange issues you run into with large libraries.
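For what it's worth, here is a rough sketch of how that file name / title / author listing could be produced with nothing but the Python standard library. It assumes the files are ordinary EPUBs, where `META-INF/container.xml` points at an OPF package file holding `dc:title` and `dc:creator`. The function names (`read_epub_metadata`, `dedupe_key`, `write_inventory`) and the extra `key` column are my own inventions for illustration. The key only folds case, accents, and punctuation so that near-identical entries sort next to each other; it will not catch real misspellings, so the final pass is still by hand.

```python
import csv
import sys
import unicodedata
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path

# Namespaces used by the EPUB container file and Dublin Core metadata.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
DC = "{http://purl.org/dc/elements/1.1/}"


def read_epub_metadata(path):
    """Return (title, author) from an EPUB, or ("", "") if it cannot be read."""
    try:
        with zipfile.ZipFile(path) as zf:
            container = ET.fromstring(zf.read("META-INF/container.xml"))
            rootfile = container.find(".//c:rootfile", CONTAINER_NS)
            opf = ET.fromstring(zf.read(rootfile.get("full-path")))
    except (zipfile.BadZipFile, KeyError, ET.ParseError, AttributeError, OSError):
        # Broken or non-standard files get empty metadata and are reviewed by hand.
        return "", ""
    title = opf.findtext(f".//{DC}title", default="")
    author = opf.findtext(f".//{DC}creator", default="")
    return title.strip(), author.strip()


def dedupe_key(title, author):
    """Hypothetical sort key: fold case, drop accents, spaces, and punctuation."""
    text = unicodedata.normalize("NFKD", f"{author} {title}").casefold()
    return "".join(ch for ch in text if ch.isalnum())


def write_inventory(root, out):
    """Write a CSV of key, file, title, author sorted by key; return the row count."""
    rows = []
    for path in Path(root).rglob("*.epub"):
        title, author = read_epub_metadata(path)
        rows.append((dedupe_key(title, author), str(path), title, author))
    rows.sort()  # likely duplicates end up on adjacent lines
    writer = csv.writer(out)
    writer.writerow(["key", "file", "title", "author"])
    writer.writerows(rows)
    return len(rows)


if __name__ == "__main__":
    write_inventory(sys.argv[1], sys.stdout)
```

Saved under any name, say `inventory.py`, it would be run as `python inventory.py /path/to/library > inventory.csv`, and the CSV can then be opened in a spreadsheet. Files with empty metadata sort to the top, which is a quick way to see how much of the library has unusable tags.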

In the end I gave up and kept only what I really wanted and would realistically ever need, which was small enough to cull by hand.
