There have been multiple accounts created with the sole purpose of posting advertisement posts or replies containing unsolicited advertising.

Accounts which solely post advertisements, or which persistently post them, may be terminated.

hedgehog


hedgehog ,

They aren’t. From a comment on https://www.reddit.com/r/ublock/comments/32mos6/ublock_vs_ublock_origin/ by u/tehdang:

For people who have stumbled into this thread while googling “ublock vs origin”. Take a look at this link:

tuxdiary.com/2015/06/14/ublock-origin/

"Chris AlJoudi [current owner of uBlock] is under fire on Reddit due to several actions in recent past:

  • In a Wikipedia edit for uBlock, Chris removed all credits to Raymond [Hill, original author and owner of uBlock Origin] and added his name without any mention of the original author’s contribution.
  • Chris pledged a donation with overblown details on expenses like $25 per week for web hosting.
  • The activities of Chris since he took over the project are more business and advertisement oriented than development driven."

So I would recommend that you go with uBlock Origin and not uBlock. I hope this helps!

Edit: Also got this bit of information from here:

https://www.reddit.com/r/chrome/comments/32ory7/ublock_is_back_under_a_new_name/

TL;DR:

  • gorhill [Raymond Hill] got tired of dozens of “my facebook isnt working plz help” issues.
  • he handed the repository to chrismatic [Chris Aljioudi] while maintaining control of the extension in the Chrome webstore (by forking chrismatic’s version back to himself).
  • chrismatic promptly added donate buttons and a “made with love by Chris” note.
  • gorhill took exception to this and asked chrismatic to change the name so people didn’t confuse uBlock (the original, now called uBlock Origin) and uBlock (chrismatic’s version).
  • Google took down gorhill’s extension. Apparently this was because of the naming issue (since technically chrismatic has control of the repo).
  • gorhill renamed and rebranded his version of ublock to uBlock Origin.
hedgehog ,

It sounds like you want these files to be encrypted.

Someone already suggested encrypting them with GPG, but maybe you want the files themselves to also be isolated, even while their data is encrypted. In that case, consider an encrypted volume. I assume you’re familiar with LUKS - you can encrypt a partition with a different password and disable auto-mount pretty easily. But if you’d rather use a file-based volume, then check out VeraCrypt - it’s a FOSS-ish [1], cross-platform tool that provides this capability. The official documentation is very Windows-focused - the ArchLinux wiki article is a pretty useful Linux focused alternative.

Normal operation is that you use a file to store the volume, which can be “dynamic” with a max size or can be statically sized (you can also directly encrypt a disk partition, but you could do that with LUKS, too). Then, before you can access the files - read or write - you have to enter the password, supply the encryption key, etc., in order to unlock it.

Someone without the password but with permission to modify the file will be capable of corrupting it (which would prevent you from accessing every protected file), but unless they somehow got access to the password they wouldn’t be able to view or modify the protected files.

The big advantage over LUKS is ease of creating/mounting file-based volumes and portability. If you’re concerned about another user deleting your encrypted volume, then you can easily back it up without decrypting it. You can easily load and access it on other systems, too - there are official, stable apps on Windows and Mac, though you’ll need admin access to run them. On Android and iOS options are a bit more slim - EDS on Android and Disk Decipher on iOS. If you’re copying a volume to a Linux system without VeraCrypt installed, you’ll likely still be able to mount it, as dm-crypt has support for VeraCrypt volumes.

  • 1 - It’s based on TrueCrypt, which has some less-free restrictions, e.g., clause c: “Phrase ‘Based on TrueCrypt, freely available at http://www.truecrypt.org/’ must be displayed by Your Product (if technically feasible) and contained in its documentation.”
hedgehog ,

Is it possible to force a corruption if a disk clone is attempted?

Anything that corrupts a single file would work. You could certainly change your own disk cloning binaries to include such functionality, but if someone were accessing your data directly via their own OS, that wouldn’t be effective. I don’t know of a way to circumvent that last part other than ensuring that the data isn’t left on disk when you’re done. For example, you could use a ramdisk instead of non-volatile storage. You could delete or intentionally corrupt the volume when you unmount it. You could split the file, storing half on your USB flash drive and keeping the other half on your PC. You could XOR the file with contents of another file (e.g., one on your USB flash drive instead of on your PC) and then XOR it again when you need to access it.
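
To illustrate that last XOR idea, here’s a minimal sketch in Python (the file names are hypothetical, and the key file needs to be at least as large as the data):

```python
# Sketch of the XOR idea. XORing the data with a key file masks it;
# XORing the result with the same key restores the original.
# The key should be random (e.g., generated via os.urandom).
def xor_files(in_path: str, key_path: str, out_path: str) -> None:
    with open(in_path, "rb") as f:
        data = f.read()
    with open(key_path, "rb") as f:
        key = f.read(len(data))
    if len(key) < len(data):
        raise ValueError("key file must be at least as large as the data")
    with open(out_path, "wb") as f:
        f.write(bytes(d ^ k for d, k in zip(data, key)))

# xor_files("volume", "usb/keyfile.bin", "volume.masked")  # mask
# xor_files("volume.masked", "usb/keyfile.bin", "volume")  # restore
```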

What sort of attack are you trying to protect from here?

If the goal is plausible deniability, then it’s worth noting that VeraCrypt volumes aren’t identifiable as distinct from random data. So if you have a valid reason for having a big block of random data on disk, you could say that’s what the file was. Random files are useful because they aren’t compressible. For example, you could use them to test network or storage media performance, or compression, hashing, backup/restore, or encrypt/decrypt functions. You could also use them to have a repeatable set of random values to use in a program (like using a seed, but without necessarily being limited to using a PRNG to generate the sequence).
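
Generating such a file is trivial, too - something like this sketch (the size and file name are arbitrary):

```python
import os

# Write 1 GiB of CSPRNG output - incompressible, so it's plausible
# as benchmark or test data.
with open("random.bin", "wb") as f:
    for _ in range(1024):
        f.write(os.urandom(1024 * 1024))  # 1 MiB per chunk
```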

If that’s not sufficient, you should look into hidden volumes. The idea is that you take a regular encrypted volume, whose free space on disk looks just like random data, and store your hidden volume within that free space. The hidden volume gets its own password. Then, you can mount the volume using the first password and get visibility into a “decoy” set of files, or use the second password to view your “hidden” files. Note that when mounting it to view the decoy files, any write operations will have a chance of corrupting the hidden files. However, you can supply both passwords to mount it in a protected mode, allowing you to change the decoy files and avoid corrupting the hidden ones.

hedgehog ,

Say I go to a furniture store and buy a table. It has a 5 year warranty. 2 years later, it breaks, so I call Ubersoft and ask them to honor the warranty and fix it. If they don’t, then I can file a suit against them, i.e., for breach of contract. I may not even have to file a suit, as there may be government agencies who receive and act on these complaints, like my local consumer protection division.

I’m talking about real things here. Your example is a situation where the US government agrees that a company shouldn’t be permitted to take my money and then renege on their promises. And that’s generally true of most governments.

Supposing an absence of regulations protecting consumers like me, as you suggest in your example, it would be reasonable to assume an absence of laws and regulations protecting the corporation from consumers like me. Absent such laws, a consumer would be free to take matters into their own hands. They could go back to Ubersoft and take a replacement table without their agreement - it wouldn’t be “stealing” because it wouldn’t be illegal. If Ubersoft were closed, the consumer could break in. If Ubersoft security tried to stop them, the consumer could retaliate - damaging Ubersoft’s property, physically attacking the owner / management / employees, etc… Ubersoft could retaliate as well, of course - nothing’s stopping them. And as a corporation, they certainly have more power than a random consumer - but at that point they would need to employ their own security forces rather than relying on the government for them.

Even if we kept laws prohibiting physical violence, the consumer is still regulated by things like copyright and IP protections, e.g., the anti-circumvention portion of the DMCA. Absent such regulations, a consumer whose software was rendered unusable or changed in a way they didn’t like could reverse engineer it, bypass DRM, host their own servers, etc… Given that you didn’t speak against those regulations, I can only infer that you are not opposed to them.

Why do you think we don’t need regulations protecting consumers but that we do need regulations restricting them?

hedgehog ,

“Supposed to” according to what?

If you’re in the US, Federal labor laws explicitly allow “meal periods” to not be paid, though short breaks must be paid. Neither is required to be offered to employees, though.

Source: www.dol.gov/general/topic/workhours/breaks

State laws differ, of course, and many states - e.g., California - have much more employee-friendly laws. However, even in CA, a meal period must be offered but isn’t required to be paid (unless it’s an on-duty meal break).

hedgehog ,

it’s still not a profitable venture

Source? My understanding is that Google doesn’t publish Youtube’s expenses directly but that Youtube has been responsible for 10% of Google’s revenue for the past few years (on the order of $31.5 Billion in 2023) and that it’s more likely than not profitable when looked at in isolation.

hedgehog ,

The fact remains, that Steam is preventing games from being listed for less on Epic.

For that fact to “remain,” it would need to have been established in the first place. At best it’s been alleged.

hedgehog ,

Tons of laptops with top notch specs for 1/2 the price of a M1/2 out there

The 13” M1 MacBook Air is $700 new from Best Buy. Better specced versions are available for $600 used (buy-it-now) on eBay, but the base specced version is available for $500 or less.

What $300 or less used / $350 new laptops are you recommending here?

Americans, how do you feel about being stored in a database by government agencies like the NSA?

Every search you make, email you send, text message, voice chat, location, and most likely the conversations you have in your own home are monitored and stored in a database for who knows how long (probably forever). When I hear land of the free, I immediately think bullshit. We are slowly losing our freedoms, what can we do...

hedgehog ,

theoretically they can

Is this a purely theoretical capability or is there actually evidence they have this capability?

it’s already been proven that they can tap into anyone’s phone

Listening into a conversation that you’re intentionally relaying across public infrastructure and gaining access to the phone itself are two very different things.

The use of proprietary software in literally everything

  1. Speak for yourself. And let’s be real, if you’re on Lemmy you’re 10 times more likely to be running Linux.
  2. Proprietary != closed source
  3. Do you really think that just because something is closed source means that it can’t be analyzed?

the amount of exploits the NSA has on hand

How many zero-day exploits does the NSA have? How many can be deployed remotely and without a nontrivial action by a user?

what’s stopping the NSA from spying this much?

Scale, capacity, cost, number of employees

---

I’m not saying we shouldn’t oppose government surveillance. We absolutely should. But like another commenter pointed out, I’m much more concerned with the amount of data that corporations collect and have.

hedgehog ,

The article also says

The first point is one we’ve heard repeated many times before, but there’s never been any proof on it. Which perhaps the Wolfire lawsuit and this may actually bring to light. An accusation doesn’t necessarily mean they’re right though. Something people get confused on often is Steam Keys, which are completely separate to Steam Store purchases.

Saying “Don’t sell Steam keys off-platform for more than X% less than the game is priced for on Steam” and “Don’t sell your game elsewhere for more than X% less than the game is priced for on Steam” are very different things. Steam openly does the former; I’ve never heard a reputable report of them doing the latter. The Wolfire lawsuit is explicitly about the former practice, for example.

The press release for this lawsuit reads like it’s about the latter, but I suspect that’s solely for optics. I reviewed the website dedicated to the lawsuit (steamyouoweus.co.uk) and thought they might have some more concrete evidence - nope, nothing. Under the first question in FAQs they have a link to their key documents, but the documents are “coming soon.”

Until they actually substantiate their claim, this lawsuit is just noise.

hedgehog ,

Thank you! That document is exactly the sort of thing I was looking for. Just realized (after writing most of this comment) that it’s for Wolfire and not Vicki Shotbolt’s case, but the commentary’s still relevant, I think.

There’s enough there that they may have a legitimate case, but there’s also a lot that is, as far as I know, completely acceptable for Valve to do. The specific items you listed, as well as a couple before / after them, are the most promising, IMO, but even so, there are a couple different counter-arguments that I could see Valve making.

The first counter-argument would be that the comments in 204-205 were in the context of publishers who had already received Steam keys for the games in question and did not apply to games where the publisher had not received Steam keys.

The second counter-argument would be that Tom Giardino was not speaking to Valve’s actual policy and/or that he was making empty threats that he didn’t have the power to enforce. Tom’s still with Valve (according to www.valvesoftware.com/en/people) so they wouldn’t be able to show that he was fired for giving publishers incorrect information, but it would be feasible for them to have record of him having gotten disciplinary action or something along those lines. Without something like that, it’s a much less credible stance, but not unbelievable - they’d basically have to be admitting negligence, since this is a record of the actions of a representative of their company. My gut says they were at least complicit.

200 says Valve “insisted” a publisher change their price on the Discord Store but doesn’t indicate any enforcement action was taken. At first glance, 209 appeared to apply, but it, too, involves the sale of Steam keys. 230 goes into a bit more detail about 209.

I read through the filing and still don’t see any instances of a game being delisted because it was being sold for cheaper elsewhere, when Steam keys weren’t in play. A lack of enforcement action against publishers not using Steam keys who set a different price in another storefront would go a long way toward showing that Valve’s policy was only relevant when the publishers were using Steam keys.

In either case, Valve will need to make the argument that it is not anti-competitive to require publishers to agree to these terms when requesting free Steam keys.

The arguments regarding DLC exclusivity (172-184) are another area where Valve might be found to be anti-competitive. That said, I don’t think exclusive DLCs benefit consumers and I would expect Valve to argue that the intent and impact of requiring DLC be published on their platform is for consumers’ benefit. I think proving something here would be dependent on the pricing angle.

I still think Valve could argue that the intent and impact of their pricing decisions are to the benefit of consumers. The specific enforcement actions brought up were all in relation to the price of Steam keys on third-party storefronts, which I think will be held to a much lower standard than restricting the price of the game on other platforms. After all, the benefits of Steam keys aren’t intrinsic to Steam, and other platforms are free to offer a similar benefit to game publishers.

In 191, the plaintiff shows that a publisher could set the price on a rival platform at 20% less and make more profit than on Steam. However, there aren’t any examples of enforcement actions where the discount on a rival platform did not exceed a 20% difference. Ultimately, if they don’t have at least that - optimally for a game whose publisher didn’t ever receive free Steam keys - the singular statement of one of their representatives might be the only concrete evidence they have. And at that point, the argument that Tom was just making empty threats has a lot more weight.

hedgehog ,

ultimately the market is behaving as if the threats are sincere so whether or not Valve would follow through is irrelevant to whether the presence of a policy is an exhibition of monopolistic power

Courts have interpreted the anti-monopoly portion of the Sherman Act, which governs antitrust law in the US, to mean that a monopoly is only unlawful if the power is used in an unlawful way or if the monopoly was acquired through unlawful means.

The need to see an actual example of a game being delisted for violation of the policy is a weirdly high standard of evidence

As a smoking gun, I don’t think it’s unreasonable to ask for something like that.

If it’s a policy Valve denies and the only evidence of it existing is a single reply in a forum somewhere, then yes, I’m skeptical. And given that there are examples of companies that were willing to break explicit, defensible policies, why aren’t there examples of companies who broke these? Unless the plaintiffs bring in multiple witnesses to testify that this was the policy communicated to them or something along those lines, I can’t see the evidence that they did have this policy being more compelling than the fact that there’s a complete lack of evidence that they ever acted on it.

To be clear, I’m not saying Valve needs to have said that was the reason. But it certainly needs to look like that was the reason. If Valve can’t provide a valid reason for the termination, then that’s very compelling, and even if they can, it’ll come down to which is more believable.

ajsadauskas , to technology

It's time to call a spade a spade. ChatGPT isn't just hallucinating. It's a bullshit machine.

From TFA (thanks @mxtiffanyleigh for sharing):

"Bullshit is 'any utterance produced where a speaker has indifference towards the truth of the utterance'. That explanation, in turn, is divided into two "species": hard bullshit, which occurs when there is an agenda to mislead, or soft bullshit, which is uttered without agenda.

"ChatGPT is at minimum a soft bullshitter or a bullshit machine, because if it is not an agent then it can neither hold any attitudes towards truth nor towards deceiving hearers about its (or, perhaps more properly, its users') agenda."

https://futurism.com/the-byte/researchers-ai-chatgpt-hallucinations-terminology

@technology

hedgehog ,

reasonable expectations and uses for LLMs.

LLMs are only ever going to be a single component of an AI system. We’ve only had LLMs with their current capabilities for a very short time period, so the research and experimentation to find optimal system patterns, given the capabilities of LLMs, has necessarily been limited.

I personally believe it’s possible, but we need to get vendors and managers to stop trying to sprinkle “AI” in everything like some goddamn Good Idea Fairy.

That’s a separate problem. Unless it results in decreased research into improving the systems that leverage LLMs, e.g., by resulting in pervasive negative AI sentiment, it won’t have a negative impact on the progress of the research. Rather the opposite, in fact, as seeing which uses of AI are successful and which are not (success here being measured by customer acceptance and interest, not by the AI’s efficacy) is information that can help direct and inspire research avenues.

LLMs are good for providing answers to well defined problems which can be answered with existing documentation.

Clarification: LLMs are not reliable at this task, but we have patterns for systems that leverage LLMs that are much better at it, thanks to techniques like RAG, supervisor LLMs, etc…

When the problem is poorly defined and/or the answer isn’t as well documented or has a lot of nuance, they then do a spectacular job of generating bullshit.

TBH, so would a random person in such a situation (if they produced anything at all).

As an example: how often have you heard about a company’s marketing department over-hyping an upcoming product, resulting in unmet consumer expectations, a ton of extra work for the product’s developers and engineers, or both? This is because those marketers don’t really understand the product - either because they don’t have the information, because they didn’t read it, because they got conflicting information, or because the information they have is written for a different audience - i.e., a developer, not a marketer - and the nuance is lost in translation.

At the company level, you can structure a system that marketers work within that will result in them providing more correct information. That starts with them being given all of the correct information in the first place. However, even then, the marketer won’t be solving problems like a developer. But if you ask them to write some copy to describe the product, or write up a commercial script where the product is used, or something along those lines, they can do that.

And yet the marketer role here is still more complex than our existing AI systems, but those systems are already incorporating patterns very similar to those that a marketer uses day-to-day. And AI researchers - academic, corporate, and hobbyists - are looking into more ways that this can be done.

If we want an AI system to be able to solve problems more reliably, we have to, at minimum:

  • break down the problems into more consumable parts
  • ensure that components are asked to solve problems they’re well-suited for, which means that we won’t be using an LLM - or even necessarily an AI solution at all - for every problem type that the system solves
  • have a feedback loop / review process built into the system

In terms of what they can accept as input, LLMs have a huge amount of flexibility - much higher than what they appear to be good at and much, much higher than what they’re actually good at. They’re a compelling hammer. System designers need to not just be aware of which problems are nails and which are screws or unpainted wood or something else entirely, but also ensure that the systems can perform that identification on their own.

hedgehog ,

It’s more like paying the ticket without ever showing up in court. And at least where I live, I can do that.

hedgehog ,

The news sites can cover whatever they want. If their readers consume it, great - they’re writing to their audience. Doesn’t mean we can’t criticize it when it gets posted here.

hedgehog ,

If you’re talking about a stock Android OS on anything other than a Pixel, iOS wins in both regards. Stock on a Pixel, I don’t know that Apple is more secure, but if you’re installing apps via Google Play that use Google Play Services, iOS is certainly more private. Vs GrapheneOS on a Pixel, iOS is less private by far.

hedgehog ,

Better than bad is still “better.”

hedgehog ,

You think that Google Play Services is FOSS? Or that the version of Android on Samsung phones (as well as of most other Android phone manufacturers), including all baked in software, is FOSS?

hedgehog ,

And when you’re comparing two closed source options, there are techniques available to evaluate them. Based off the results of people who have published their results from using these techniques, Apple is not as private as they claim. This is most egregious when it comes to first party apps, which is concerning. However, when it comes to using any non-Apple app, they’re much better than Google is when using any non-Google app.

There’s enough overlap in skillset that pretty much anyone performing those evaluations will likely find it trivial to configure Android to be privacy-respecting - i.e., by using GrapheneOS on a Pixel or some other custom ROM - but most users are not going to do that.

And if someone is not going to do that, Android is worse for their privacy.

It doesn’t make sense to say “iPhones are worse at respecting user privacy than Android phones” when by default and in practice for most people, the opposite is true. What we should be saying is “iPhones are better at respecting privacy by default, but if privacy is important to you, the best option is to put in a bit of extra work and install GrapheneOS on a Pixel.”

hedgehog ,

Have you looked into configuring them directly from your NVR? Or third party options? I did a quick search and saw a list of several that as far as I can tell can display Reolink streams (though I haven’t confirmed any can configure the cameras):

And some proprietary options that have native Linux builds:

hedgehog ,

Apparently it’s still being actively developed! I’m impressed.

April 15, 2024 Lynx v2.9.1 release

hedgehog ,

The list of instances you shared was updated recently, but I tried the one URL in it (the rest are onion or i2p links, and are older versions of Libreddit to boot) and the page didn’t even load.

Libreddit was discontinued nearly a year ago after Reddit’s API changes broke it, though about a month ago they updated their repo to direct people to RedLib, which allegedly does work. That said, I tried the official instance and got an error. However, it’s being actively developed and looks easy to self-host. I don’t know if there’s a list of unofficial public instances.

hedgehog ,

The dice method is great. www.eff.org/dice

hedgehog ,

Being a bit pedantic here, but I doubt this is because they trained their model on the entire internet. More likely they added Reddit and many other sites to an index that can be referenced by the LLM and they don’t have enough safeguards in place. Look up “RAG” (Retrieval-augmented generation) if you want to learn more.
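
To illustrate the idea, here’s a toy sketch (all names hypothetical - real RAG systems use embeddings and a vector index rather than word overlap):

```python
# Toy sketch of RAG: retrieve documents relevant to the question,
# then hand them to the model as context, instead of relying only on
# what the model memorized during training.
def retrieve(question: str, corpus: list[str], top_k: int = 3) -> list[str]:
    # Toy relevance score: count shared words with the question.
    q = set(question.lower().split())
    return sorted(corpus, key=lambda d: -len(q & set(d.lower().split())))[:top_k]

def build_prompt(question: str, corpus: list[str]) -> str:
    context = "\n\n".join(retrieve(question, corpus))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```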

hedgehog ,

Sure, and that’s roughly the same amount of entropy as a 13 character randomly generated mixed case alphanumeric password. I’ve run into more password validation prohibiting a 13 character password for being too long than for being too short, and for end-user passwords I can’t recall an instance where 77.5 bits of entropy was insufficient.

But if you disagree - when do you think 77.5 bits of entropy is insufficient for an end-user? And what process for password generation can you name that has higher entropy and is still easily memorized by users?

hedgehog ,

Ah, fair enough. I was just giving people interested in that method a resource to learn more about it.

The problem is that your method doesn’t consistently generate memorable passwords with anywhere near 77 bits of entropy.

First, the example you gave ended up being 11 characters long. For a completely random password using alphanumeric characters + punctuation, that’s 66.5 bits of entropy. Your lower bound was 8 characters, which is even worse (48 bits of entropy). And when you consider that the process will result in some letters being much more probable, particularly in certain positions, that results in a more vulnerable process. I’m not sure how much that reduces the entropy, but it would have an impact. And that’s without exploiting the fact that you’re using quotes as part of your process.

The quote selection part is the real problem. If someone knows your quote and your process, game over, as the number of remaining possibilities at that point is quite low - maybe a thousand? That’s worse than just adding a word with the dice method. So quote selection is key.

But how many quotes is a user likely to select from? My guess is that most users would be picking from a set of fewer than 7,776 quotes, but your set and my set would be different. Even so, I doubt that the set an attacker would need to discern from is higher than 470 billion quotes (the equivalent of three dice method words), and it’s certainly not 2.8 quintillion quotes (the equivalent of 5 dice method words).

If your method were used for a one-off, you could use a poorly known quote and maybe have it not be in that 470 billion quote set, but that won’t remain true at scale. It certainly wouldn’t be feasible to have a set of 2.8 quintillion quotes, which means that even a 20 character password has less than 77.5 bits of entropy.

Realistically, since the user is choosing a memorable quote, we could probably find a lot of them in a very short list - on the order of thousands at best. Even with 1 million quotes to choose from, that’s at best 30 bits of entropy. And again, user choice is a problem, as user choice doesn’t result in fully random selections.

If you’re randomly selecting from a 60 million quote database, then that’s still only 36 bits of entropy. When the database has 470 billion quotes, that’ll get you to 49 bits of entropy - but good luck ensuring that all 470 billion quotes are memorable.

There are also things you can do, at an individual level, to make dice method passwords stronger or more suitable to a purpose. You can modify the word lists, for one. You can use the EFF’s other lists. When it comes to password length restrictions, you can use the EFF short list #2 and truncate words after the third character without losing entropy (every word on that list has a unique three-character prefix) - meaning your 8 word password only needs to be 31 characters long, or 24 characters if you omit word separators. You can randomly insert a symbol and a number and/or substitute them, sacrificing memorizability for a bit more entropy (mainly useful when there are short password length limits).

The dice method also has baked-in flexibility when it comes to the necessary level of entropy. If you need more than 82 bits of entropy, just add more words. If you’re okay with having less entropy, you can generate shorter passwords - 62 bits of entropy is achieved with a 6 short-word password (which can be reduced to 18 characters) and a 4 short-word password - minimum 12 characters - still has 41 bits of entropy.
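
For reference, all of these figures come from the same arithmetic - entropy is just length times log2 of the symbol-set size. A quick check in Python:

```python
import math

def charset_entropy(length: int, charset_size: int) -> float:
    # A uniformly random string has length * log2(charset size) bits of entropy
    return length * math.log2(charset_size)

def diceware_entropy(words: int, list_size: int = 7776) -> float:
    # Each word drawn uniformly from the list adds log2(list size) bits
    return words * math.log2(list_size)

print(diceware_entropy(6))        # ~77.5 - six words from the standard list
print(charset_entropy(13, 62))    # ~77.4 - 13 random mixed-case alphanumeric chars
print(diceware_entropy(6, 6**4))  # ~62.0 - six words from the 1296-word short list
print(diceware_entropy(4, 6**4))  # ~41.4 - four short-list words
```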

With your method, you could choose longer quotes for applications you want to be more secure or shorter quotes for ones where that’s less important, but that reduces entropy overall by reducing the set of quotes you can choose from. What you’d want to do is to have a larger set of quotes for your more critical passwords. But as we already showed, unless you have an impossibly huge quote database, you can’t generate high entropy passwords with this method anyway. You could select multiple unrelated quotes, sure - two quotes selected from a list of 10 billion gives you 76.4 bits of entropy - but that’s the starting point for the much easier to memorize, much easier to generate, dice method password. You’ve also ended up with a password that’s just as long - up to 40 characters - and much harder to type.

This problem is even worse with the method that the EFF proposes, as it’ll output passphrases with an average of 42 characters, all of them alphabetic.

Yes, but as pass phrases become more common, sites restricting password length become less common. My point wasn’t that this was a problem but that many site operators felt that it was fine to cap their users’ passwords’ max entropy at lower than 77.5 bits, and few applications require more than that much entropy. (Those applications, for what it’s worth, generally use randomly generated keys rather than relying on user-generated ones.)

And, as I outlined above, you can use the truncated short words #2 list method to generate short but memorable passwords when limited in this way. My general recommendation in this situation is to use a password manager for those passwords and to generate a high entropy, completely random password for them, rather than trying to memorize them. But if you’re opposed to password managers for some reason, the dice method is still a great option.

hedgehog ,

Just sharing this link to another comment I made replying to you, since it addresses your calculations regarding entropy: ttrpg.network/comment/7142027

hedgehog ,

I recommend Tidal over Spotify, personally

hedgehog ,

Why should shadow bans be illegal?

hedgehog ,

Because a good person would never need those. If you want to have shadowbans on your platform, you are not a good one.

This basically reads as “shadow bans are bad and have no redeeming factors,” but you haven’t explained why you think that.

If you’re a real user and you only have one account (or have multiple legitimate accounts) and you get shadow-banned, it’s a terrible experience. Shadow bans should never be used on “real” users even if they break the ToS, and IME, they generally aren’t. That’s because shadow bans solve a different problem.

In content moderation, if a user posts something that’s unacceptable on your platform, generally speaking, you want to remove it as soon as possible. Depending on how bad the content they posted was, or how frequently they post unacceptable content, you will want to take additional measures. For example, if someone posts child pornography, you will most likely ban them and then (as required by law) report all details you have on them and their problematic posts to the authorities.

Where this gets tricky, though, is with bots and multiple accounts.

If someone is making multiple accounts for your site - whether by hand or with bots - and using them to post unacceptable content, how do you stop that?

Your site has a lot of users, and bad actors aren’t limited to only having one account per real person. A single person - let’s call them a “Bot Overlord” - could run thousands of accounts - and it’s even easier for them to do this if those accounts can only be banned with manual intervention. You want to remove any content the Bot Overlord’s bots post and stop them from posting more as soon as you realize what they’re doing. Scaling up your human moderators isn’t reasonable, because the Bot Overlord can easily outscale you - you need an automated solution.

Suppose you build an algorithm that detects bots with incredible accuracy - 0% false positives and an estimated 1% false negatives. Great! Then, you set your system up to automatically ban detected bots.

A couple days later, your algorithm’s accuracy has dropped - from 1% false negatives to 10%. 10 times as many bots are making it past your algorithm. A few days after that, it gets even worse - first 20%, then 30%, then 50%, and eventually 90% of bots are bypassing your detection algorithm.

You can update your algorithm, but the same thing keeps happening. You’re stuck in an eternal game of cat and mouse - and you’re losing.

What gives? Well, you made a huge mistake when you set the system up to ban bots immediately. In your system, as soon as a bot gets banned, the bot creator knows. Since you’re banning every bot you detect as soon as you detect them, this gives the bot creator real-time data. They can basically reverse engineer your unpublished algorithm and then update their bots so as to avoid detection.

One solution to this is ban waves. Those work by detecting bots (or cheaters, in the context of online games) and then holding off on banning them until you can ban them all at once.

Great! Now the Bot Overlord will have much more trouble reverse-engineering your algorithm. They won’t know specifically when a bot was detected, just that it was detected within a certain window - between its creation and ban date.

But there’s still a problem. You need to minimize the damage the Bot Overlord’s accounts can do between when you detect them and when you ban them.

You could try shortening the time between ban waves. The problem with this approach is that the ban wave approach is more effective the longer that time period is. If you had an hourly ban wave, for example, the Bot Overlord could test a bunch of stuff out and get feedback every hour.

Shadow bans are one natural solution to this problem. That way, as soon as you detect it, you can prevent a bot from causing more damage. The Bot Overlord can’t quickly detect that their account was shadow-banned, so their bots will keep functioning, giving you more information about the Bot Overlord’s system and allowing you to refine your algorithm to be even more effective in the future, rather than the other way around.
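
Sketching the pattern in Python (every name here is hypothetical - this is just to illustrate the flow, with detect_bot standing in for the unpublished algorithm):

```python
# Toy sketch of the shadow-ban + ban-wave pattern described above.
shadow_banned: set[str] = set()

def detect_bot(account: str, post: str) -> bool:
    return False  # placeholder for the real, secret heuristic

def on_new_post(account: str, post: str) -> None:
    if account in shadow_banned:
        return  # silently drop: the bot still "sees" its post, nobody else does
    if detect_bot(account, post):
        shadow_banned.add(account)  # act immediately, but give no signal
        return
    print(f"published: {account}: {post}")  # stand-in for real publishing

def ban_wave() -> None:
    # Run on a long interval, so detection time can't be inferred from ban time.
    for account in shadow_banned:
        print(f"banned: {account}")  # stand-in for real enforcement
    shadow_banned.clear()
```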

I’m not aware of another way to effectively manage this issue. Do you have a counter-proposal?

Out of curiosity, do you have any experience working in content moderation for a major social media company? If so, how did that company balance respecting user privacy with effective content moderation without shadow bans, accounting for the factors I talked about above?

hedgehog ,

But major social media companies do exist. If your real point was that they shouldn’t, you should have said that upfront.

hedgehog ,

That’s a bit abstract, but saying what others “should” do is both stupid and rude.

Buddy, if anyone’s being stupid and rude in this exchange, it’s not me.

And any true statement is the same as all other true statements in an interconnected world.

It sounds like the interconnected world you’re referring to is entirely in your own head, with logic that you’re not able or willing to share with others.

Even if I accepted that you were right - and I don’t accept that, to be clear - your statements would still be nonsensical given that you’re making them without any effort to clarify why you think them. That makes me think you don’t understand why you think them - and if you don’t understand why you think something, how can you be so confident that you’re correct?

hedgehog ,

No, I don’t think anything you do has any bearing on reality, period.

Problems with creating my own instance

I am currently trying to create my own Lemmy instance and am following the join-lemmy.org docker guide. But unfortunately docker compose up doesn’t work with the default config and throws a yaml: line 32: found character that cannot start any token error. Is there something I can do to fix this?...

hedgehog ,

If you use that docker compose file, I recommend you comment out the build section and uncomment the image section in the lemmy service.
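
Assuming the compose file from the guide, that section would end up looking something like this (the image tag here is illustrative - match it to whatever version the guide specifies):

```yaml
services:
  lemmy:
    # build:
    #   context: ./lemmy
    image: dessalines/lemmy:0.19.3  # illustrative tag - use the guide's version
    # ...everything else in the service stays the same
```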

I also recommend you use a reverse proxy and Docker networks rather than exposing the postgres instance on port 5433, but if you aren’t familiar with Docker networks you can leave it as is for now. If you’re running locally and don’t open that port in your router’s firewall, it’s a non-issue unless there’s an attacker on your LAN, but given that you’re not gaining anything from exposing it (unless you need to connect to the DB directly regularly - as a one off you could temporarily add the port mapping), it doesn’t make sense to increase your attack surface for no benefit.

hedgehog ,

There’s an idea that just won’t die that Linux is extremely difficult to use/maintain/troubleshoot. It’s certainly often a lot easier than windows, so it just gets to me to see that idea propagated.

Pretending it’s all sunshine and rainbows isn’t realistic, either. That said, I had a completely different takeaway - that the issues are mostly kinda random and obscure or nitpicky, and the sorts of things you would encounter in any mature OS.

The issue about PopOS not having a Paint application is actually the most mainstream of them - and it feels very similar to the complaints about iPadOS not including a Calculator app by default. But nobody is concluding that iPads aren’t usable as a result.

Teams having issues is believable and relevant to many users. It doesn’t matter whose fault an issue is if the user is impacted. TBH, I didn’t even know that Teams was available on Linux.

That said, the only people who should care about Teams issues on Linux are the ones who need to use them, and anyone who’s used Microsoft products understands that they’re buggy regardless of the platform. Teams has issues on MacOS, too. OneDrive has issues on MacOS. On Windows 10, you can’t even use a local account with Office 365.

hedgehog ,

Theoretically, they would start by looking at a guide like this one: rentry.org/Piracy-BG

hedgehog ,

Definitely not, I do the same.

I installed 64 GB of RAM in my Windows laptop 4 years ago and had been using 64 GB of RAM in the laptop that it replaced - which was from 2013 (I think I bought it in 2014-2015). I was using 32 GB of RAM prior (on Linux and Windows laptops), all the way back to 2007 or so.

My work MacBook Pros generally have 32-64 GB of RAM, but my personal MacBook Air (the 15” M2) has 16 GB, simply because the upgrade wasn’t a cost effective one (and the M1 before it had performed great with 16) and because I’d only planned on using it for casual development. But since I’ve been using it as my main personal development machine and for self-hosted AI, and have run into its limits, when I replace it I’ll likely opt for 64 GB or more.

My Windows gaming desktop only has 32 GB of RAM, though - that’s because RAM with tight timings was prohibitively expensive in larger capacities - particularly as 4 sticks - when I built it, and then when the cost wasn’t a concern and I tried to upgrade, I learned that my third and fourth RAM slots weren’t functional. I could upgrade to 64 GB in two slots but it wouldn’t really be worth it, since I only use it for gaming.

My Linux desktop / server has 128 GB of ECC RAM, though, because that’s as much as the motherboard supported.

hedgehog ,

I’m not the person you responded to, but I can say that it’s a perfectly fine take. My personal experience and the commonly voiced opinions about both browsers support this take.

Unless you’re using 5 tabs max at a time, my personal experience is that Firefox is more than an order of magnitude more memory efficient than Chrome when dealing with long-lived sessions with the same number of tabs (dozens up to thousands).

I keep hundreds of tabs open in Firefox on my personal machine (with 16 GB of RAM) and it’s almost never consuming the most memory on my system.

Policy prohibits me running Firefox on my work computer, so I have to use Chrome. Even with much more memory (both on 32 GB and 64 GB machines) and far fewer tabs (20-30 at most vs 200-300), Chrome often ends up taking up far too much memory and suffering a substantial performance drop, and I have to go through and prune the tabs I don’t need right now, bookmark things that can be done later, etc…

Also, see techspot.com/…/102871-zero-regrets-firefox-power-… - I’ve never seen anything similar for Chrome and wasn’t able to find anything.

hedgehog ,

It first showed up on Netflix in mid-2023, in the middle of the Writers Guild strike (meaning there was a dearth of new content). So basically the Netflix effect. It had been on other streaming platforms before - Prime Video and Hulu - but Netflix is still a juggernaut compared to them - it has 5 times as many subscribers as Hulu, for example, and many of the subscribers to Prime Video are incidental and don’t stream as much on average as Netflix users.

I assume Netflix funded off-platform advertising, but the on-platform advertising has a big effect, too. And given that Suits broke a record in the first week it was on Netflix and they have a spinoff coming, it makes sense that they would keep advertising.

hedgehog ,

The funny thing about Lemmy is that the entire Fediverse is basically running a massive copyright violation ring with current copyright law.

Is it, though?

When someone posts a comment to Lemmy, they do so willingly, with the intent for it to be posted and federated. If they change their mind, they can delete it. If they delete it and it remains up somewhere, they can submit a DMCA request; likewise if someone else posts their copyrighted content.

Copyright infringement is the use of works protected by copyright without permission for their use. When you submit a post or a comment, your permission to display it and for it to be federated is implied, because that is how Lemmy works. A license also conveys permission, but that’s not the only way permission can be conveyed.

hedgehog ,

The idea that someone does this willingly implies that the user knows the implications of their choice, which most of the Fediverse doesn’t seem to do

The terms of service for lemmy.world, which you must agree to upon sign-up, make reference to federating. If you don’t know what that means, it’s your responsibility to look it up and understand it. I assume other instances have similar sign-up processes. The source code to Lemmy is also available, meaning that a full understanding is available to anyone willing to take the time to read through the code, unlike with most social media companies.

What sorts of implications of the choice to post to Lemmy do you think that people don’t understand, that people who post to Facebook do understand?

If the implied license was enough, Facebook and all the other companies wouldn’t put these disclaimers in their terms of service.

It’s not an implied license. It’s implied permission. And if you post content to a website that’s hosting and displaying such content, it’s obvious what’s about to happen with it. Please try telling a judge that you didn’t understand what you were doing, sued without first trying to delete or file a DMCA notice, and see if that judge sides with you.

Many companies have lengthy terms of service with a ton of CYA legalese that does nothing. Even so, an explicit license to your content in the terms of service does do something - but that doesn’t mean that you’re infringing copyright without it. If my artist friend asks me to take her art piece to a copy shop and to get a hundred prints made for her, I’m not infringing copyright then, either, nor is the copy shop. If I did that without permission, on the other hand, I would be. If her lawyer got wind of this and filed a suit against me without checking with her and I showed the judge the text saying “Hey hedgehog, could you do me a favor and…,” what do you think he’d say?

Besides, Facebook does things that Lemmy instances don’t do. Facebook’s codebase isn’t open, and they’d like to reserve the ability to do different things with the content you submit. Facebook wants to be able to do non-obvious things with your content. Facebook is incorporated in California and has a value in the hundreds of billions, but Lemmy instances are located all over the world and I doubt any have a value even in the millions.

hedgehog ,

They don’t call them “mp3 players” anymore - that may be why you can’t find what you need. Look for a “DAP” instead - digital audio player - and you’ll probably have more luck.

For example, the Fiio M7 is $200 and is pretty full-featured. I have the M6 and I think I paid around $100, but I don’t think it’s being sold anymore.

hedgehog ,

You can use YaCy, which can be run as an independent self-hosted index (in “Local” mode), where it will index sites visited as part of web crawls that you initiate, or you can run it as part of a decentralized peer-to-peer network of indexes.

YaCy has its own search UI but you can also set up SearXNG to use it.
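
If you go the SearXNG route, the engine entry in its settings.yml looks roughly like this - a sketch, so double-check the option names and your YaCy URL against the SearXNG docs:

```yaml
engines:
  - name: yacy
    engine: yacy
    shortcut: ya
    base_url: http://localhost:8090  # wherever your YaCy instance listens
    disabled: false
```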

hedgehog ,

there is not a ‘Searx Index’ which is what this is about.

There’s YaCy, which includes a search index (which can be independent or can join a P2P network of indexes), web crawler, and web ui for searching. It can also be added as a SearXNG engine.

hedgehog ,

Unless you’re using a random 10+ alphanumeric passcode and are fine entering it every time you log into your phone, with a short auto-lock period, you’re much better off enabling biometrics (assuming it’s implemented competently) in combination with a longer passcode and understanding how to disable it when appropriate.

I recently replied with this comment to a Gizmodo article recommending the same thing you did for similar reasons, if you’d like to better understand my rationale: ttrpg.network/comment/6620188

hedgehog ,

I can’t speak to Android as a whole, but here’s how often Samsung Face Unlock will require you to re-auth with your phone’s passcode:

  • after 4 hours of not using the phone
  • after restarting
  • at least once every 24 hours

iPhones do something similar, but it’s after 48 hours of non-use (instead of 4) and at least weekly instead of daily. Having to enter your password daily should help most people keep it memorized pretty well, but weekly - maybe not. So you definitely have a good point there.

One thing that can make it easier to remember - and just as secure - is to use a longer pass phrase instead of random characters.

If you’re using the diceware approach (“correct horse battery staple”), then 5 words gives you 5 bits (32 times) more entropy than a 10 character mixed-case alphanumeric password: 64 vs 59 bits. 4 word passphrases aren’t random enough to be recommended - at 51 bits of entropy, they have less than even a 9 character mixed-case alphanumeric password (53 bits); notably, 10 same-case alphanumeric characters also come in at only 51 bits.

The EFF has a word list that’s been improved for usability. They also have a short list, comprised of words with at most 5 characters each, where you roll 4 dice instead of 5. With 6 words from that list you get 62 bits of entropy, which is good enough to be able to recommend.

hedgehog ,

I haven’t used it and only heard about it while writing this post, but Open WebUI looks really promising. I’m going to check it out the next time I mess with my home server’s AI apps. If you want more options, read on.

Disclaimer: I’ve looked into most of the options below enough to feel comfortable recommending them, but I’ve only personally self hosted the Automatic 1111 webui, the Oobabooga webui, and Kobold.cpp.

If you want just an LLM and an image generator, then:

For the image generator, something that leverages Stable Diffusion models:

And then find models that you like at Civitai.

For the LLM, the best option depends on your hardware. Not knowing anything about your hardware, I recommend a llama.cpp based solution. Check out one of these:

Alternatively, vLLM is allegedly the fastest for multi-user CPU-based inference, though as far as I can tell it doesn’t have its own webui (but it does expose OpenAI-compatible API endpoints).
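
As a sketch of what using those endpoints looks like with the official openai Python client - the base_url (vLLM serves on port 8000 by default) and model name here are assumptions to adjust for your setup:

```python
# Talking to a local OpenAI-compatible endpoint with the openai client.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="local-model",  # hypothetical; list real names via client.models.list()
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```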

And then find a model you like at Huggingface. I recommend finding a model quantized by TheBloke.

There are a couple communities not on Lemmy that discuss local LLMs - r/LocalLLaMA and r/LocalLLM for example - so if you’re trying to figure out which model to try, that’s a good place to check.

If you want a multimodal AI, you can use llama.cpp with a model like LLaVA. The options below also have multimodal support.

If you want an AI assistant with expanded capabilities - like searching your documents or the web (RAG), etc. - then I don’t have a ton of experience there, but these seem to do that job:

If you want to use your local model as more than just a chat bot - integrating it into your IDE or a browser extension - then there are options there, and as far as I know every LLM above can be configured to expose an API allowing it to be used by your other tools. Some, like Open WebUI, expose OpenAI compatible APIs and so can be used with tools built to be used with OpenAI. I don’t know of many tools like this, though - I was surprisingly not able to find a browser extension that could use your own API, for example. Here are a couple examples:

Also, I found this Medium article listed some of the things I described above as well as several others that I’d never heard of.
