You start out by bemoaning the onboarding experience, then move on to portability, and then push the idea that servers should just be relays and browsers should be the new world order.
Yes, onboarding definitely needs to be improved.
Yes, portability can be improved. Lemmy falls short of Mastodon and not even Mastodon is perfect.
But what Mastodon does do excellently is foster the idea that social media is a tool and that users shouldn’t be overly attached to it. Also, perhaps if we learn to value servers, rather than treating them as mere relays, we’ll be able to teach value and independence.
The problem is that too many people keep asking: how can we make the Fediverse relevant in the modern world? The better question is: how can we redefine the modern world? How can we normalize the idea of cooperative servers, whether for friends, towns, cities, etc.? How can we make it so the people running the servers that host our communities are committed and engaged and not running them at a deficit? I would even go as far as to say that there should be government schemes to repurpose old computers into mini servers, and that governments should give everyone a domain like NAME.TOWN.CITY so everyone can run a personal server, get used to it, and grow from there.
i really disagree with most of your points. a “server” is some machine working for the client. your proposal isn’t getting rid of servers, you’re just making every user responsible for being their own server.
this mostly feels like “im annoyed my instance is filtering content and lacks replies”. have you tried fedilab? it allows fetching directly from source, bypassing your instance and fetching all replies. i think that’s kind of anti-privacy but you may like it
if you’re interested here’s a wall of text with more arguments for my points (sorry, i wanted to be concise but really failed; i may make this into a reply blog post soon:tm:)
Federation is not the natural unit of social organization
you argue that onboarding is hard, as if picking a server were signing a contract. new users can go to mastosoc and then migrate from there; AP has a great migration system. also federation is somewhat the natural unit: you will never speak to all 8B people, but you will discuss with your local peers and your ideas may spread from there. somewhat fair points, but kind of overblown
Servers are expensive to operate
you really can’t get around this, even if you make every user handle their own stuff, every user will have their database and message queue. every user will receive each post in their message queue, process it and cache it in their db. that’s such a wasteful design: you’re replicating every post once for every member of the network
We should not need to emulate the fragmentation of closed social networks
absolutely true! this should get handled by software implementers; AP already allows interoperability, we don’t need a different system, just better fedi software
The server is the wrong place for application logic
this is really wrong imo, and it’s the crux of my critique. most of your complaints boil down to caching: you only see posts cached on a profile and in a conversation. this can’t really be otherwise, so how could we solve it?
you mention a global search: how do we do that? a central silo which holds all posts ever made, indexed to search? who would run such a monster, and if it existed, why wouldn’t everyone just connect there to have the best experience? that’s centralization
again on global search: should all servers ask all other servers? who keeps a list of all servers? again centralized, and also such a waste of resources: for every query you’re asking every fedi server to answer?
even worse, you mention keeping everything on the client, but how do you do that? my fedi instance db is around 30G, and im a single user instance which only sees posts from my follows, definitely not a global db. is every user supposed to store hundreds of GBs to have their “local global db” to search on? why not keep our “local global dbs” shared in one location so that we deduplicate posts and can contribute to archiving? something like a common server for me and my friends?
also, if the client is responsible for keeping all its data, how do you sync across devices? in some other reply you mention couchdb and pouchdb, but that sounds silly for fedi: if we are 10 users, should we all host our pouchdb on a server, each with the same 10 posts? wouldn’t it be better to keep the posts once and serve them on demand? you save storage on both the server and all clients and get the exact same result
having local dbs for each client also wouldn’t solve broken threads or profiles: each client still needs to see each reply or old post. imagine if every fedi user fetched every old post every time they followed someone; that would be a constant DoS. by having one big server shared across multiple people, you’re increasing your chance of finding replies already cached, rather than having to go fetch them
lastly, security: you are assuming a well-intentioned fedi, but there are bad actors. i don’t want my end device to connect to every instance under the sun. i made a server which only holds fedi stuff, and which at worst will crash or leak private posts. my phone/pc holds my mails and payment methods; am i supposed to just go fetching far and wide from my personal device as soon as someone delivers me an activity? no fucking way! the server is a layer of defense
networks are smarter at the edges
the C2S AP api is really just a way to post, not much different from using the mastodon api. as said before, content discovery on every client is madness, but timeline/filter management is absolutely possible. is it really desirable? the megalodon app lets you manage local filters for your timeline, but that’s kind of annoying because you end up with out-of-sync filters between multiple devices. same for timelines: i like my lists synced, honestly, but to each their own; filters/timelines on the client should already be possible.
you mention cheaper servers, but only because you’re delegating costs to each client, and the “no storage” idea is in conflict with the couchdb thing you mentioned somewhere else. servers should cache; caching is more efficient on a server than on every client.
a social web browser, built into the browser
i’m not sure what you’re pitching here. how are AP documents served to other instances from your browser? does your browser need to deliver activities to other instances? is your whole post history just stored in localstorage, deleted if you clear site data? are you still supposed to buy a domain (AP wants domains as identities), and where are you going to point it?
I have not once said that we need to get rid of servers, but I am saying that they could (should?) be used only as a proxy for the outbox/inbox. I’ve said this already elsewhere, but it may make it easier to understand: the “ideal” model I have in mind is something like movim.eu, but with messages based around the ActivityStreams vocabulary.
you really can’t get around this, even if you make every user handle their own stuff, every user will have their database and message queue.
Hard disagree here. Tell me one system where I can take my domain and just swap the URLs of the inbox/outbox. Mastodon lets you migrate your follower list and signals the redirect to your followers about your new actor ID, but you cannot bring your data. Most importantly, the identity itself is not portable.
silo which holds all posts ever made, indexed to search? (…) that’s centralization
i don’t want my end device to connect to every instance under the sun.
Not every instance, but you’d be connecting to the outboxes of the people you follow. How is that different from, e.g., subscribing to an RSS feed?
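To make the comparison concrete, here is roughly what “polling someone’s outbox” looks like from a client. A minimal sketch, assuming the remote server answers unauthenticated ActivityPub fetches (many require signed requests); the actor URL is made up:

```python
# Minimal sketch: treat an ActivityPub outbox like a feed you poll.
# Assumes the remote server serves ActivityPub JSON without a signed request.
import requests

ACCEPT = {"Accept": "application/activity+json"}

def fetch_outbox_items(actor_url: str) -> list[dict]:
    # The actor document advertises its outbox URL.
    actor = requests.get(actor_url, headers=ACCEPT).json()
    outbox = requests.get(actor["outbox"], headers=ACCEPT).json()
    # Outboxes are OrderedCollections; items usually live on the first page.
    first = outbox.get("first", outbox)
    page = first if isinstance(first, dict) else requests.get(first, headers=ACCEPT).json()
    return page.get("orderedItems", [])

# e.g. fetch_outbox_items("https://example.social/users/alice")  # hypothetical actor URL
```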
my fedi instance db is around 30G, and im a single user instance which only sees posts from my follows
First: How the hell did you get this much data? :) I have an instance running for 4 years, with a bunch of relays, serving ~10 users and the DB has less than 4GB.
But to answer your question: if you are running a single-user instance, then you are already running a client; the only difference is that it runs on a remote machine which proxies everything for you. And how you deal with data wouldn’t change: just like you can delete old/stale data in Mastodon, you’d be able to delete or offload messages that are older than X days.
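For reference, this is the kind of pruning Mastodon admins already run today; in a client-centric model the same knobs would simply live on your own machine. The commands below are from memory for a source install, so double-check `tootctl help` and adjust paths for Docker setups:

```sh
# Prune cached remote data older than N days (run from the Mastodon directory as the mastodon user)
RAILS_ENV=production bin/tootctl media remove --days 14
RAILS_ENV=production bin/tootctl statuses remove --days 90
RAILS_ENV=production bin/tootctl preview_cards remove --days 90
```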
Despite the privacy concerns, Microsoft says that the Recall index remains local and private on-device, encrypted in a way that is linked to a particular user’s account.
Just like how Microsoft domain-bound emails were stored locally on machines running Outlook, right? Or how purchasing and downloading music, movies, and video games meant that we owned them, right?
I don’t believe for a fucking second that this “feature” will remain locally encrypted forever. Fuck Microsoft, fuck the AI bubble.
“Don’t be evil!
…
wait, you say you’ll pay me to be evil? Well fuck, that changes everything!”
I had a similar idea: Could search engines be broken up and distributed instead of being just a couple of monoliths?
Reading the HN thread, the short answer is: NO.
Still, it’s fun to imagine what it might look like if only…
I think the OP is looking for an answer to the problem of Google having a monopoly so entrenched that it can’t realistically be challenged. The cost to replicate their search service is just so astronomical that it’s basically impossible to replace them. Would the OP be satisfied if we could make cheaper components that all fit together into a competing but decentralized search service? Breaking down the technical problems is just the first step; the basic concepts for me are:
Crawling -> Indexing -> Storing/hosting the index -> Ranking
All of them are expensive because the internet is massive! If each of these were isolated but still interoperable, we’d get some interesting possibilities: basically, you could have many smaller specialized companies, each focusing on, say, better ranking algorithms.
What if crawling was done by the owners of each website, who then submit the results to an index database of their choice? This flips the model around, so things like robots.txt might become less relevant. Bad actors and spammers, however, would no longer need any SEO tricks to flood a database or mislead as to their actual content; they could just submit whatever they like! These concerns feed into the next step:
What if there were standard indexing functions, similar to how you have many standard hash functions? How a site is indexed plays an important role in how ranking will work (or not) later. You could have a handful of popular general-purpose index algorithms that most sites would produce and submit (e.g. keywords, images, podcasts, etc.), combined with many more domain-specific indexing algorithms (e.g. product listings, travel data, mapping, research). Also, if the functions were open standards, a browser could run the index function on the current page and compare the result to the submitted index listing. It could warn users that the page they are viewing is probably either spam or misconfigured in some way that makes the index not match what was submitted.
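As a toy illustration of that last point (the function, keywords, and threshold here are made up, and a real standard would be far more involved), the same open index function could run on the site owner’s side to produce the listing and in the visitor’s browser to verify it:

```python
# Toy "open index function": the site owner runs keyword_index() to produce the
# submitted listing, and a browser can rerun it on the live page and compare.
import re
from collections import Counter

def keyword_index(page_text: str, top_n: int = 20) -> set[str]:
    # Naive keyword extraction: the N most common words of 3+ letters.
    words = re.findall(r"[a-z]{3,}", page_text.lower())
    return {word for word, _ in Counter(words).most_common(top_n)}

def listing_matches_page(submitted: set[str], page_text: str, threshold: float = 0.5) -> bool:
    # If the submitted keywords barely overlap with the live page,
    # the listing is probably spam or badly misconfigured.
    computed = keyword_index(page_text)
    overlap = len(submitted & computed) / max(len(submitted), 1)
    return overlap >= threshold
```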
What if the stored indexes were hosted in a distributed way similar to DNS? Sharing the database would lower individual costs. Companies with bigger budgets could replicate the database to provide their users with a faster service. Companies with fewer resources would be able to use the publicly available indexes yet still be competitive.
Enabling more competition between different ranking methods will hopefully reduce the effectiveness of SEO gaming (or maybe make it worse, as the same content gets repackaged for each and every index/rank combination). Ranking could even happen locally (probably not efficient at all, but the fact that it might even be possible is quite a novel thought).
Well stated and explained. I’m not an AI researcher but I develop with LLMs quite a lot right now.
Hallucination is a huge problem we face when we’re trying to use LLMs for non-fiction. It’s a little like having a friend who can lie straight-faced and convincingly: you cannot tell whether they’re giving you the truth or a lie until you’ve already relied on the output.
I think one of the nearest-term solutions may be the addition of extra layers, observer engines that are very deterministic and trained only on extremely reputable sources, perhaps only peer-reviewed trade journals, or sources we otherwise deem trustworthy. Unfortunately this could only serve to improve our confidence in the facts, not remove hallucination entirely.
It’s even feasible that we could have multiple observers with different domains of expertise (i.e. different training sources) and voting capability, to fact-check and subjectively rate the trustworthiness of the LLM’s output.
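Mechanically, the voting layer itself is simple; the hard, unsolved part is building observers worth trusting. A rough sketch, with the observers as stand-in callables (the names in the usage comment are hypothetical):

```python
# Sketch of a "panel of observers": each observer scores a claim against its own
# trusted corpus, and the claim is only accepted if enough of them agree.
from typing import Callable

Observer = Callable[[str], float]  # returns support for a claim in [0, 1]

def panel_verdict(claim: str, observers: list[Observer], quorum: float = 0.7) -> bool:
    votes = [observer(claim) for observer in observers]
    supporting = sum(1 for v in votes if v >= 0.5)
    return supporting / len(votes) >= quorum

# e.g. panel_verdict("Drug X is approved for use in children",
#                    [journal_observer, regulatory_db_observer, general_observer])  # hypothetical observers
```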
But all this will accomplish in the short term is perhaps rolling the dice in our favor a bit more often.
The results as perceived by end users, however, may significantly improve. Consider some human examples: sometimes people disagree with their doctor, so they go see another doctor, and another, until they get the answer they want. Sometimes two very experienced lawyers both look at the facts and disagree.
What prevents me from knowingly stating something as true, despite not knowing and without any ability to back up my claims, is my reputation and my personal values and ethics. LLMs can only pretend to have those traits when we tell them to.
I’m not a lawyer. But isn’t the reason they had to go to Reddit for permission that users hand over ownership to Reddit the moment they post? And since there’s no such clause on Lemmy, they’d have to ask the actual authors of the comments for permission instead?
Mind you, I understand there’s no technical limitation preventing bots from harvesting the data; I’m talking about the legality. After all, public does not equate to public domain.
I thought I was going to use Authentik for this purpose but it just seems to redirect to an otherwise Internet accessible page. I’m looking for a way to remotely access my home network at a site like remote.mywebsite.com. I have Nginx proxy forwarding with SSL working appropriately, so I need an internal service that receives...
I use traefik as my reverse proxy. I have externally accessible domains, and then extra-secure internal-only domains that require a wireguard connection first as an extra layer of security.
Authentik can be used as a forward auth proxy and doesn’t care if it’s an internal or external domain.
Apps that don’t have good login or user management just get the Authentik proxy for single sign-on (sonarr, radarr, etc.).
Apps that have OAuth integration get that for single sign-on (seafile, immich, etc.).
To make it all work, the video talks about adding both the internal and external domains to the local DNS, so access works whether you come from outside, over wireguard, or from inside the LAN.
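For anyone wanting to replicate the forward-auth part, it’s just a Traefik middleware attached to each router. A rough sketch with placeholder names and a placeholder auth endpoint (Authentik’s actual outpost URL differs, check its docs):

```yaml
# docker-compose labels, illustrative only: the middleware name, port and auth
# endpoint below are placeholders, not a working Authentik config.
labels:
  - "traefik.http.middlewares.sso.forwardauth.address=http://auth.internal:9000/auth/check"
  - "traefik.http.middlewares.sso.forwardauth.trustForwardHeader=true"
  - "traefik.http.middlewares.sso.forwardauth.authResponseHeaders=X-Forwarded-User"
  # attach the middleware to any app that has no decent login of its own
  - "traefik.http.routers.sonarr.middlewares=sso@docker"
```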
I just set up a Vouch-Proxy for this yesterday. It uses the nginx auth_request directive to authenticate users with an SSO server, and then stores the token in a domain-wide cookie, so you’re logged in across all subdomains. Works pretty well so far, you don’t even notice it when you’re logged in to your SSO provider.
But you do have to tell the proxy where you want to redirect a request somehow, either by subdomain (illegal.yourdomain.com) or port (yourdomain.com:8787) or path (yourdomain.com/illegal). I’m not sure if it works with raw IPs as hosts, but you can add additional restrictions like only allowing local client IPs.
In my particular case I’m using the local Synology SSO server, and I have to spin up an additional nginx server because the built-in one doesn’t support auth_request.
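For anyone curious, the nginx side looks roughly like this (hostnames, ports and paths are placeholders from memory, not copy-paste ready):

```nginx
# Every protected vhost delegates authentication to the local vouch-proxy
# instance via auth_request; unauthenticated requests get bounced to the login flow.
server {
    server_name app.yourdomain.com;

    # Internal subrequest: ask vouch-proxy whether this request carries a valid cookie.
    location = /validate {
        internal;
        proxy_pass http://127.0.0.1:9090/validate;
        proxy_set_header Host $http_host;
        proxy_pass_request_body off;
        proxy_set_header Content-Length "";
    }

    location / {
        auth_request /validate;
        error_page 401 = @login;
        proxy_pass http://127.0.0.1:8080;  # the actual app
    }

    location @login {
        return 302 https://vouch.yourdomain.com/login?url=$scheme://$http_host$request_uri;
    }
}
```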
Infrastructure used to maintain and distribute the Linux operating system kernel was infected for two years, starting in 2009, by sophisticated malware that managed to get a hold of one of the developers’ most closely guarded resources: the /etc/shadow files that stored encrypted password data for more than 550 system users, researchers said Tuesday.
The unknown attackers behind the compromise infected at least four servers inside kernel.org, the Internet domain underpinning the sprawling Linux development and distribution network, the researchers from security firm ESET said.
After obtaining the cryptographic hashes for 551 user accounts on the network, the attackers were able to convert half into plaintext passwords, likely through password-cracking techniques and the use of an advanced credential-stealing feature built into the malware.
Besides revealing the number of compromised user accounts, representatives of the Linux Kernel Organization provided no details other than saying that the infection:
A 47-page report summarizing Ebury’s 15-year history said that the infection hitting the kernel.org network began in 2009, two years earlier than the domain was previously thought to have been compromised.
Representatives of the Linux Kernel Organization didn’t respond to emails asking if they were aware of the ESET report or if its claims were accurate.
The original article contains 493 words, the summary contains 201 words. Saved 59%. I’m a bot and I’m open source!
In the GitHub issues the dev is aware of this, but he’s not completely enraged, just mildly infuriated that the design is too similar, and he’s politely asking them to use a different design.
From the history in the Wayback Machine I don’t see any “parking” page between the switch, so my guess is that the dev was approached with an offer like “we like that domain, we’d like to buy it for $$$”, unaware that they would copy the design like that to achieve maximum deception of users.
It would be interesting, though, to use an LLM to spot AI/SEO crap and add whole domains to a search blacklist. In that case we wouldn’t need AI to do the actual search, and this could easily just be a database the search engine maintains for end users (kind of like explicit content filters).
I’d call that option “Bullspam filter” and leave it on “moderate” by default.
This is possibly something you could implement in a meta search engine like SearXNG, though there are some privacy concerns.
Maybe it could locally store which domains you personally tend to click (and stay) on, then automatically rank those domains higher when it sees them in the output of the underlying engines. This isn’t perfect because you wouldn’t get data from other users, but I think it could do a lot to improve search results.
I might actually clone the repo and see if I can get somewhere soon
I don’t think that’s possible with searxng (I’m not 100% sure, but I can’t seem to find that feature).
I know there are browser extensions which can filter out domains in search results for different search engines like Google and DuckDuckGo.
But the pinning/lowering/raising is a bit trickier to implement as an extension, because what kagi does is basically:
Load 3 pages of search results in the backend
Show a result as the first entry if it matches a rule for pinning
Influence the search ranking algorithm with the lower/raise rules of the user
Filter out blocked domains
It would be possible, but not as “streamlined” as Kagi makes it.
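The effect of the rules is easy enough to sketch, though; what’s hard to reproduce as an extension is applying them inside the ranking instead of after it. A rough illustration (domains and weights are arbitrary):

```python
# Rough sketch of pin/raise/lower/block rules applied to a result list.
# Not how Kagi actually implements it, just the effect it has on results.
from urllib.parse import urlparse

RULES = {  # hypothetical user preferences
    "docs.python.org": "pin",
    "en.wikipedia.org": "raise",
    "w3schools.com": "lower",
    "pinterest.com": "block",
}

def apply_rules(results: list[dict]) -> list[dict]:
    # results look like {"url": ..., "score": ...} coming from the upstream engines
    kept = []
    for result in results:
        domain = urlparse(result["url"]).hostname or ""
        rule = RULES.get(domain)
        if rule == "block":
            continue                          # filter out blocked domains entirely
        if rule == "raise":
            result["score"] *= 1.5            # nudge the ranking up
        elif rule == "lower":
            result["score"] *= 0.5            # nudge it down
        result["_pinned"] = (rule == "pin")   # pinned entries jump to the top
        kept.append(result)
    return sorted(kept, key=lambda r: (not r["_pinned"], -r["score"]))
```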
Don’t get me wrong, Kagi definitely has its rough edges and the search ranking is sometimes very unpredictable, but it provides good enough results for me to be worth the $10 per month for unlimited searches.
It’s really hard to give a review of an email service. It sends mail, it receives mail, and the user interface is pretty enough for what it needs to do.
It’s a bit cheaper than a comparable service from Google Workspace, and it encrypts emails when it’s possible to do so, which is nice.
An extra plus is that it lets you use your own domain as an alternate email address, both for receiving and sending. So if you don’t like @proton.me, you can replace it with a domain that you own.
A minus is that you need to install a bridge to use standard IMAP/SMTP clients like Thunderbird.
All in all, I’m happy, and I’ll continue to use it.
I’d assume one needs to verify the email by clicking a link, so spamming [email protected], [email protected] would mean you need access to those inboxes. That means you either have to go through the effort of actually creating those email addresses on whatever freemail service you choose, or you have to host the email server yourself and have all mail land in a catch-all inbox.
Hosting your own email server is definitely not “basically no extra effort”, even for a lot of tech-savvy people. Paying for a hosted email service with your own domain is easier, but it also seems like a poor investment just to spam a petition website.
The [email protected] functionality, however, is a pretty well-known trick, even among non-tech-savvy people. Even some people I know whom I consider basically tech-illiterate have known about it for years; they told me when they found out about it and asked if I was aware of it.
The first approach requires preparation, setting up email accounts or an email server; the second is basically already set up and ready to go for most email users. The latter is therefore definitely a lot less effort to pull off.
Spamming user+1@gmail, user+2@gmail takes absolutely no technical knowledge whatsoever; anyone can do it with one Gmail account.
Spamming user1@domain, user2@domain, etc. requires one of two things:
you sign up for multiple email accounts using a third-party service. You’re going to run into trouble with Gmail or other big providers if you start creating accounts en masse.
you create your own email server. This requires someone with self-hosting knowledge and some basic coding (or rather server config) experience.
Your observation touches on a transformative shift in our perception and interaction with content in the digital age. The proliferation of advanced AI technologies, especially those capable of generating text, images, videos, and even audio, has indeed sparked a new form of skepticism among many users.
This change can have both positive and negative implications:
Positive Aspects
Critical Thinking: [on and on]
I understand your concern. The increasing presence of AI-generated content online has understandably led to some wariness and skepticism, as people want to be able to distinguish between human-created and AI-generated information. However, I would caution against overgeneralizing or assuming that anything slightly unusual is necessarily non-human.
Creativity, unique perspectives, and novel ideas are not the sole domain of humans. As AI systems become more advanced, they are able to generate content that can be just as original, thought-provoking, and compelling as human-created work. [on and on]
The robots have weighed in, and they get it - but don’t be too hasty or ignore the positives!
I think OP might’ve been written by a creative writer. Wonder if we’ll still care ten years from now. Or care more…
I’ll give you my thoughts as someone who has sat in on the tech interview portion. I don’t want what’s below to discourage you, as you’re seeking an internship, which will be great to add and will really fill the resume out; an internship should grant you a lot of leniency during the interviews. But I’ll give you my thoughts ignoring that fact, more like what you’d expect once you’re done with the internship and your resume was sent to me for consideration.
What is it you want to do in IT? Generally, I’d say you want to customize the resume to highlight what they might be looking for. For example, in your Windows domain section you list Active Directory and PowerShell. Did you add anything? Create anything? Because if your PowerShell is just ‘Get-ADUser’, that’s not all that impressive; writing a script that manages something in AD is. See if you can rework lines like “deployed multiple users” and “updated a GPO”, because both of those, as worded, are BAU. It would be weird if you listed “deployed multiple user accounts in all the wrong OUs”. You say “in their appropriate OUs”: how was that determined? What was special about the website, what did it do, what was its purpose? You list it, but there are no details, so I’d be left to assume you wrote a basic “Hello World” page with <html><body><h1>Hello World</h1>… and CSS changing the background to a solid color. What exactly did the bash script do? Why a script and not just a shell command?
You will absolutely need to be ready to fully explain any of these. How did an ArrayList determine the difficulty (for example)? How did you organize the OUs and GPOs? How did you manage the AD server? What was your DR plan and how did you test it? How did you set up the firewall, and what authentication was in place for SSH? Were your Linux machines on the domain, or just the Windows machines?
I also see nothing about git or any other version control system, and that would immediately concern me. You list a few tools (ufw, cron, certbot) but nothing like VSCode or IntelliJ, GitLab or GitHub, or whatever tools you used. But again, add/remove things from your resume to best fit the role you’re applying to. If it’s more programming, stress that more and the administration less (but don’t completely remove it, just shift the emphasis). Was any of this a capstone project? Did you participate in any competitions (like Business Professionals of America… but Canadian), or really any competitions at all? What sets you apart from your classmates for a position at any company (internship or full employment)? Also, this might be hit or miss with people, but it might be worth dropping the fast food altogether and adding an objective about what you’re seeking from an internship. If asked about the lack of experience, you can just say you have no prior relevant work experience in IT and that’s exactly what you’re looking to gain from the internship. Most managers I’ve worked with would look right past the fast food entry since it has none of the keywords they’re looking for, and I, as the technical person, simply don’t care.
Right now you have no real-world experience with this, just a bit of homework, so highlight what the purpose of each of these sections was. Looping back: what was the purpose of the website? What did the SSH setup have to do with the website you set up HTTPS on? What were you backing up and why? Speaking just for the internship, you may want to list your certs first rather than the irrelevant fast food experience.
Anywho, hope this helps. Getting that first foot in the door can be difficult, but once you’re in and you start networking (not the technical kind, the interpersonal kind) it gets a lot easier. And as a tip, when you first get into a place, that’s the single best time to shine, as that’ll propel your career. I don’t mean do a bunch of free work. But I’ve seen people lose their jobs because they would party every night and then give low effort (or sleep) on the job, play video games at work, take an absolutely excessive number of smoke breaks (where in any given hour they’re at their desk for 15-ish minutes), or do the absolute bare minimum. Express interest in projects (if you actually are interested), be honest, and deliver a product you’d want if you were the customer.
I will answer some questions with my old account using gpt 4 to poison the data.
If you want to poison SO a little while still providing valid answers that help users, use the outlook.com email domain for new accounts. It seems to have no anti-throwaway countermeasures while still being accepted by SO. And it seems fitting to bash the corporate with the corporate.
I have several domain names, some have websites that I haven’t touched in years, some have email that I equally haven’t touched in years, and some are just ideas that I never followed through on.
I blog about network stuff, because computer networking is a passion of mine, and I get some okay views on it. Some of the posts are about pretty obscure stuff.
I’ve been meaning to create a website about good network design geared at home users, and how to make their home network suck less. Strategies on how to run ethernet, whether easy mode or hard mode and the benefits of each, even a subsection on options for renters to do it. Explanations on why your wifi probably sucks and how to improve the situation, etc. Wiki style.
It’s a lot of writing, and I know a lot about the subject matter, so each article would have a companion deep-dive article so you can go deeper into why this or that affects your wifi or home network the way it does, etc., while the main article keeps the practical tips for avoiding problems. It’s an entire concept that I feel the internet needs, but I have yet to find anyone publishing that kind of content.
Sure, you can get a bunch of technical detail on Wikipedia, but it doesn’t really explain the real world consequences of things, just how they work. That’s fine if you have an understanding of how things interact, but if you’re new, it’s just a confusing mess.
I just want to educate people into building better home networks, spending more (and less) money and why you would want to do that, where and when… How to use planning software, etc. Basically, I want to take the entirety of my wireless knowledge and put it out there for anyone to browse and learn from.
Yeah, they are all double $$, and I recreated the container several times. The only thing that changed in the compose file was the hash string (user:password).
I read in an older post that sometimes you need to clear all cookies in the browser. Did that too. Didn’t work.
I did, however, do the DNS challenge again, as I fucked up the older working config. The cert is different now but points to the same domain and subdomain. Could it be that the browser or traefik is still “remembering” the old cert, with the old credentials?
Just grasping at straws here. I’m at a loss.
I can’t find anything online, on the traefik website, or on YouTube about changing the password; only about creating one on a new install.
Should be a walk in the park to change a password. Wtf.
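For reference, if the dashboard credentials live in a docker-compose label (which the double $$ suggests), changing the password is just regenerating the hash and re-escaping it. A rough sketch with placeholder names:

```sh
# Generate a new bcrypt hash for the dashboard user (the name "admin" is a placeholder):
htpasswd -nB admin
# -> admin:$2y$05$...   (apr1/md5 via plain `htpasswd -n admin` also works with traefik)

# In docker-compose, every $ in the hash must be doubled to $$, e.g.:
#   - "traefik.http.middlewares.dashboard-auth.basicauth.users=admin:$$2y$$05$$..."
# Then recreate the container. If the users are kept in a file instead
# (basicauth.usersfile), no escaping is needed; just overwrite the file.
```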
A Plan for Social Media - Rethinking Federation (raphael.lullis.net)
New Windows AI feature records everything you’ve done on your PC (arstechnica.com)
Ask HN: Can we create a new internet where search engines are irrelevant? (news.ycombinator.com)
We have to stop ignoring AI’s hallucination problem (www.theverge.com)
OpenAI strikes Reddit deal to train its AI on your posts (www.theverge.com)
Secure portal between Internet and internal services
I thought I was going to use Authentik for this purpose but it just seems to redirect to an otherwise Internet accessible page. I’m looking for a way to remotely access my home network at a site like remote.mywebsite.com. I have Nginx proxy forwarding with SSL working appropriately, so I need an internal service that receives...
Linux maintainers were infected for 2 years by SSH-dwelling backdoor with huge reach (arstechnica.com)
Someone purchased the old domain of a FOSS app and is now using it to deceive users into downloading adware (feddit.it)
One of those two sites is distributing adware. Which of them?...
What it's like to be a developer in 2024 (sopuli.xyz)
Source
Google is redesigning its search engine — and it’s AI all the way down (www.theverge.com)
For security reasons (lemmy.world)
https://lemmy.world/pictrs/image/cd29e707-8f43-4511-afc6-0a778fe36a61.jpeg...
Zuckerberg meme (programming.dev)
deleted_by_author
Moving to a Linux distro for dev
I’m a long time Windows user who has experience with WSL. Last year, I needed a laptop for university, and out of laziness, opted for a Macbook since, although they’re expensive as hell, are reasonably reliable....
Stack Overflow bans users en masse for rebelling against OpenAI partnership — users banned for deleting answers to prevent them being used to train ChatGPT (www.tomshardware.com)
What is your little slice of the internet that you own/maintain?
youtube.com/playlist?list=PLopDcjdhYBYkHE_nqV7QNz…...
How do I change the Traefik (v2.11) dashboard password?
I have forgotten the password to the dashboard and want to change it....
Reverse proxy
I have an openwrt router at home which also acts as my home server. It’s running a bunch of services using docker (Jellyfin, Nextcloud, etc.)...