Distributed hosting?

Hey all. Not sure if this is the right place to post this, please point me in the right direction if not:

So I only came here because of the exodus from reddit, but I'm pumped to see this community and all this technology people have been making. It's like a return to the old-school, user-operated internet instead of the big awful silos that have been dominating the landscape since the early 2000s. I'm in.

So quick question, are there plans or projects in the works for distributed hosting (making it easier for the users to take up the load of storing and hosting content so the instance operators aren't stuck with the hosting costs)?

I ask because I'd like to work on a project to implement this, as I feel it'd be a massive further step forward. I'm not sure though if there's anything existing I should be trying to get up to speed on or if I should be thinking in terms of starting my own project if I want to be working on it.

17 comments
  • Lemmy is federated. You don't distribute hosting; you have the federated servers communicate with each other.

    The best thing you can do is spin up your own instance and convince your friends to use it. That way big communities like https://lemmy.ml/c/asklemmy only have to send your server one update per post for all your users to view, rather than sending that update to 20 browsers themselves.

    So your lemmy.mo_ztt.com instance could serve the one copy of its content to your dozen or so users, which takes load off of the "main" instance.

    "Instance operators" as you termed it... could be literally anyone. You can host it on a Raspberry Pi for a handful of users easily. This would lighten the load on the major "instance operators".

    • What's the right term for what I'm calling an "instance operator"? I realize that anyone could be one, I just need some language to use to distinguish the people who are from the people who aren't.

      • Oh, I wasn't chastising your choice of words there. I was just using your term to make it clear that I'm talking about the same thing you were. I'm also relatively new here, and I'm not sure what the proper lingo is. I would presume "instance admins", but I can see how that could be vague or also include people who might not be paying for the actual hosting itself.

  • Instances are the way to distribute the load; they basically act as a read replica for everything a user on that instance views. Yes, this may be "inefficient" in terms of the storage an instance needs, but it is highly efficient at offloading the burden of a popular post to hundreds of instances instead of tens of thousands of users. Further, this makes the system resilient, as every instance has a largely real-time copy of the things its users care about, even if the "origin" instance goes offline.

    • Further, this makes the system resilient as every instance has a largely real-time copy of the things their users care about, even if the “origin” instance goes offline.

      This is also a great point. Lemmy.ml is getting hit hard and having sporadic outages. My instance can continue to serve the items it's received to my users just fine. Effectively no downtime...

  • I don't know if ActivityPub has anything to distribute the load further beyond instances, but what you're talking about reminds me of IPFS and some crypto-backed stuff like Filecoin.

    • Yah, I'm asking because I have a specific (handwave-y) solution in mind using, among other things, IPFS. I'm not too much up to speed on Lemmy's internals so my solution probably needs big adjustment before it'd be realistic. I planned to make a separate post where I talk at more length on why I think this is needed and some of the ideas I had about solutions; this post was just to get some idea of how the community looks at the issue.

      • I've had similar musings as yours I think. I think the way to make a decentralized community as user friendly as a centralized one would be making the decentralization transparent somehow. One way would require a way for hosters to volunteer computing resources in a way that's more like adding cattle to a herd rather than pets to a family like in fediverse/matrix/email. More ephemeral and happening in the background. I think the downside is that this is getting closer to peer-to-peer which has a lot of overhead and scaling issues (factorial growth). Federation lies between p2p and client-server but maybe there is room to push it closer to p2p to unlock transparent distribution of resources.

  • I have been thinking about this a bit. Right now there is not really a way to spread the load out like you mentioned. Anyone can make another instance, but it doesn't really alleviate any of the stress from another instance. I think it might even add to it, although not as much as adding a bunch of new users would. It would be beneficial to be able to contribute compute power to an instance, but I don't think that is a realistic goal with the way Lemmy is set up.

    • Anyone can make another instance, but it doesn’t really alleviate any of the stress from another instance.

      This is inaccurate. If you run your own instance... and have 20 users, that's 20 users that aren't hitting the main instance. One copy of the content is transmitted from the primary instance to your instance... Those 20 users are then hitting your instance. So instead of the main instance serving 20 people, it's serving one copy of the content. That is a 20-fold savings in bandwidth, CPU, and RAM. The only thing that isn't saved is disk capacity... since the origin server needs to serve all the content on demand.

      Now the 1-2 user instances, yes there's not much savings there. But once you get to 5-10 it's already a better deal.

    • Right, any way you slice it, if you have a Reddit-scale operation where the content is served entirely by the instances, then the people who run the instances are paying a Reddit-scale hosting bill in aggregate. I saw one estimate that Reddit paid about half a million dollars in hosting bills per month. You hit the nail on the head -- adding a hobbyist who's running their own instance for themselves and maybe a handful of people does nothing to reduce the load on the big instances. How many of those big instances are there going to be if Lemmy grows to Reddit size? Enough to break that half-million dollar aggregate hosting bill into manageable pieces? Probably not. At that point you can't do it just with hobbyists with their home machines on static IPs anymore.

      Or, actually, you can, if you architect the system to make proper use of the hobbyists' hardware. Obviously there are solutions; what I'm envisioning is a browser plugin that enables someone browsing Lemmy to pull content from the hobbyists even when talking to the big instances (basically decouple "I run an instance" from "I have to pay all the hosting costs for every byte that's served to someone browsing on that instance" and shift some of the load onto the people who are more in a hobbyist role and aren't paying for any kind of official hosting but can still send bytes). I have a lot more thoughts on the topic and more full ideas about how it might be solved, I was just trying to get a sense of what the community's thoughts on it are also.

    • I fleshed out one proposal for a solution which I'm planning to start working on.
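
For anyone following the bandwidth argument in the thread (one federated copy per instance instead of one copy per reader), here is a rough back-of-the-envelope sketch in Python. It is a simplified model, not a description of how Lemmy actually delivers posts: it assumes the origin sends exactly one copy of a post to each remote instance that has at least one reader, and it ignores images, votes, comments, and retries. The function names are made up for illustration.

```python
def origin_deliveries(users_per_instance):
    """Count copies of one post the origin instance has to send.

    users_per_instance: list with the number of readers on each
    remote instance (hypothetical numbers, simplified model).

    Returns (direct, federated):
      direct    - copies sent if every reader hit the origin directly
      federated - copies sent if each remote instance gets one copy
                  and serves its own readers from that copy
    """
    direct = sum(users_per_instance)
    federated = sum(1 for n in users_per_instance if n > 0)
    return direct, federated


def savings_factor(users_per_instance):
    """Ratio of direct deliveries to federated deliveries (0.0 if no readers)."""
    direct, federated = origin_deliveries(users_per_instance)
    return direct / federated if federated else 0.0


if __name__ == "__main__":
    # One remote instance with 20 readers, as in the example from the thread.
    print(savings_factor([20]))  # 20.0
```

Under this model a single-user instance gives a factor of 1.0 (no savings), which matches the point made above that the benefit only shows up once an instance has several users.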
