Ansible everything and automate as you go. It is slower, but if it's not your first time setting something up it's not too bad. Right now I literally couldn't care less if the SD card on one of my Raspberry Pis dies, or if my monitoring backend needs to be reinstalled.
IMO Ansible is overkill for my homelab. All of my Docker containers live on two servers, one remote and one at home. Both are built with docker compose and are backed up along with their data weekly to both servers and a third-party cloud backup. In the event one of them fails I have two copies of the data and could have everything back up and running in under 30 minutes.
I also don’t like that Ansible is owned by RedHat. They’ve shown recently they have zero care for their users.
Converting my environment to be mostly containerized was a bit of a slow process that taught me a lot, but now I can try out new applications and configurations at such an accelerated rate it's crazy. Once I got the hang of Docker (and Ansible) it became so easy to try new things, tear them down and try again. Moving services around, backing up or restoring data is way easier.
I can't overstate how impactful containerization has been to my self hosting workflow.
I'm mostly Docker. I want to selfhost Lemmy but there's no one-click Docker Compose / Portainer installer yet (for Swag / Nginx Proxy Manager) so I won't until it's ready.
Same for me. I've known about Docker for many years now but never understood why I would want to use it when I can just as easily install things directly and just never touch them. Then I ran into dependency problems where two pieces of software required different versions of the same library. Docker just made this problem completely trivial.
Same, but I've never once touched Docker and am doing everything old skool on top of Proxmox. Others may or may not like this approach, but it has many of the benefits in terms of productivity (ease of experimentation, migration, upgrade etc)
I wouldn't change anything, I like fixing things as I go. Doing things right the first time is only nice when I know exactly what I'm doing!
That being said, in my current environment, I made a mistake when I discovered docker compose. I saw how wonderfully simple it made deployment and how it helped with version control, and decided to dump every single service into one singular docker-compose.yaml. Next time I would separate services into at least their relevant categories for ease of making changes later.
Better yet I would automate deployment with Ansible... But that's my next step in learning and I can fix both mistakes while I go next time!
If you keep related containers (e.g. the *arr stack) in their own compose file, then you can bring all of them up with a single docker compose command (or pull fresh versions etc.). If everything is in a single file then you either have to manually pull/start/stop each container or do it to everything at once.
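A rough sketch of what the per-category split buys you, in Python. The stacks/<name>/docker-compose.yaml layout and the stack name "arr" are made up for illustration; the only real things here are the docker compose -f, pull and up -d invocations. It just builds the commands rather than running them.

```python
from pathlib import PurePosixPath


def compose_commands(stacks_root, stack, actions=("pull", "up")):
    """Build the docker compose argv lists for one stack's compose file.

    Assumes a hypothetical layout of <stacks_root>/<stack>/docker-compose.yaml.
    """
    compose_file = PurePosixPath(stacks_root) / stack / "docker-compose.yaml"
    commands = []
    for action in actions:
        argv = ["docker", "compose", "-f", str(compose_file), action]
        if action == "up":
            argv.append("-d")  # start the stack in the background
        commands.append(argv)
    return commands


if __name__ == "__main__":
    # Touches only the *arr stack; every other stack is left alone.
    for argv in compose_commands("stacks", "arr"):
        print(" ".join(argv))
```

Hand each list to subprocess.run if you actually want to execute it. With one giant compose file the same commands would hit every service unless you name them one by one.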
Go with used & refurb business PCs right out of the gate instead of fucking around with SBCs like the Pi.
Go with "1-liter" aka Ultra Small Form Factor right away instead of starting with SFF. (I don't have a permanent residence at the moment so this makes sense for me)
Docker compose helped me get started with containers but I kept having to push out new config files and manually cycle services. Now I have Ansible roles that can configure and deploy apps from scratch without me even needing to back up config files at all.
Most of my documentation has gone away entirely, I don't need to remember things when they are defined in code.
Uniform folder layout for everything (my first couple of servers were a bit wild-westy)
Choosing and utilizing some reasonable method of assigning ports to things. I do not even want to explain what I need to do when I forget what port something in this setup is using.
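One possible "reasonable method" for the ports, sketched in Python: give each category of service its own block of ports and hand them out in order, so the port is at least guessable from what the thing is. The categories, base ports, block size and service names below are all invented for the example, not anything from my actual setup.

```python
# Hypothetical scheme: every category owns a block of 100 ports.
CATEGORY_BASE = {"media": 8100, "monitoring": 8200, "home": 8300}
BLOCK_SIZE = 100


class PortRegistry:
    """Hands out ports per category and remembers who got what."""

    def __init__(self):
        self._ports = {}  # service name -> assigned port

    def assign(self, category, service):
        # Asking again for a known service returns the same port.
        if service in self._ports:
            return self._ports[service]
        base = CATEGORY_BASE[category]
        used = set(self._ports.values())
        for port in range(base, base + BLOCK_SIZE):
            if port not in used:
                self._ports[service] = port
                return port
        raise RuntimeError(f"no free ports left in the {category!r} block")

    def lookup(self, service):
        return self._ports[service]


if __name__ == "__main__":
    registry = PortRegistry()
    print(registry.assign("media", "jellyfin"))      # 8100
    print(registry.assign("media", "sonarr"))        # 8101
    print(registry.assign("monitoring", "grafana"))  # 8200
```

The point is less the code than having a single place that answers "what port is X on", ideally the same place your deployment reads from.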
I already did a few months ago. My setup was a mess, everything tacked on the host OS, some stuff installed directly, others as docker, firewall was just a bunch of hand-written iptables rules...
I got a newer motherboard and CPU to replace my ageing i5-2500K, so I decided to start from scratch.
First order of business: Something to manage VMs and containers. Second: a decent firewall. Third: One app, one container.
I ended up with:
Proxmox as VM and container manager
OPNsense as firewall. The server has 3 network cards (1 built-in, 2 in PCIe slots). The 2 add-ons are passed through to OPNsense; the built-in one is for managing Proxmox and for the containers.
A whole bunch of LXC containers running all sorts of stuff.
Things look a lot more professional and clean, and it's all much easier to manage.
Can't say anything about CUDA because I don't have Nvidia cards nor do I work with AI stuff, but I was able to pass the built-in GPU on my Ryzen 2600G to the Jellyfin container so it could do hardware transcoding of videos.
You need the drivers for the GPU installed on the host OS, then link the devices on /dev to the container. For AMD this is easy, bc the drivers are open source and included in the distro (Proxmox is Debian based), for Nvidia you'd have to deal with the proprietary stuff both on the host and on the containers.
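For reference, "linking the devices" usually ends up as a couple of lines in the container's LXC config. Here's a small Python sketch that prints what I'd expect those lines to look like. Assumptions: the GPU shows up on the host as /dev/dri/card0 and /dev/dri/renderD128 (minors 0 and 128 under the DRM major number 226), and the container is allowed to use them; check ls -l /dev/dri on your own host, and unprivileged containers typically need extra permission work that isn't shown.

```python
DRM_MAJOR = 226  # Linux character-device major number for /dev/dri/* nodes


def lxc_gpu_lines(minors=(0, 128)):
    """Return LXC config lines exposing the host's /dev/dri to a container.

    minors are assumed values: 0 for card0, 128 for renderD128.
    """
    lines = [
        f"lxc.cgroup2.devices.allow: c {DRM_MAJOR}:{minor} rwm"
        for minor in minors
    ]
    # Bind-mount the host's /dev/dri directory into the container.
    lines.append(
        "lxc.mount.entry: /dev/dri dev/dri none bind,optional,create=dir"
    )
    return lines


if __name__ == "__main__":
    print("\n".join(lxc_gpu_lines()))
```

Treat the output as a starting point to compare against the Proxmox docs, not something to paste in blind.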
Yes, you can pass through any GPU to containers pretty easily, and if you are starting with a new VM you can also pass through easily there, but if you are trying to use an existing VM you can run into problems.
Yes, but you'll be wishing you had 8 bays when you fill the 6 :) At some point, you have to replace disks to really increase space, so don't make your RAID volumes consist of more disks than you can reasonably afford to replace at one time. Second lesson: if you have spare drive bays, use them as part of your upgrade strategy, not as additional storage. I started this last iteration with 6x3TB drives in a raidz2 vdev, then opted to add another 6x3TB vdev instead of biting the bullet and upgrading. To add more storage I need to replace 6 drives. Instead I built a second NAS to back up the primary and am pulling all 12 disks and dropping back to 6. If/when I increase storage, I'll drop 6 new ones in and MOVE the data instead of adding capacity.
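Back-of-envelope numbers for the layout above, as a quick Python sketch. It ignores ZFS overhead and TB vs TiB, and the 8 TB replacement size is purely a hypothetical, so read the results as ballpark only.

```python
def raidz2_usable_tb(disks, disk_tb):
    """Rough usable space of one raidz2 vdev: two disks' worth goes to parity."""
    if disks <= 2:
        raise ValueError("raidz2 needs more disks than its two parity disks")
    return (disks - 2) * disk_tb


one_vdev = raidz2_usable_tb(6, 3)   # 6x3TB raidz2 -> about 12 TB usable
pool_now = 2 * one_vdev             # two such vdevs -> about 24 TB usable

# A raidz vdev only grows once every disk in it has been replaced with a
# bigger one, so the minimum purchase to grow either vdev is all 6 disks.
disks_to_buy = 6
pool_after = one_vdev + raidz2_usable_tb(6, 8)  # hypothetical 8 TB disks

if __name__ == "__main__":
    print(one_vdev, pool_now, disks_to_buy, pool_after)
```

That 6-disk minimum purchase is the "more disks than you can afford to replace at one time" trap.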
Actually plan things and research. Too many of my decisions come back to bite me because I don't plan out stuff like networking, resources, hard drive layouts..
My current homelab is running on a single Dell R720xd with 12x6TB SAS HDDs. I have ESXi as the hypervisor with a pfSense gateway and a TrueNAS Core VM. It's compact, has lots of redundancy, can run everything I want and more, has IPMI, and ECC RAM. Great, right?
Well, it sucks back about 300 W at idle, sounds like a jet engine all the time, and having everything on one machine is fragile as hell.
Not to mention the Aruba Networks switch and Eaton UPS that are also loud.
I had to beg my dad to let it live at his house because no matter what I did: custom fan curves, better c-state management, a custom enclosure with sound isolation and ducting, I could not dump heat fast enough to make it quiet and it was driving me mad.
I'm in the process of doing it better. I'm going to build a small NAS using consumer hardware and big, quiet fans. I have a fanless N6005 box as a gateway, and I'm going to convert my old gaming machine to a hypervisor using Proxmox, with each VM managed with either docker-compose, Ansible, or NixOS.
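To put the idle draw mentioned above in perspective, here's the arithmetic as a tiny Python sketch. The 300 W figure is from the post; the electricity price is an assumption, so plug in your own tariff.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years


def annual_kwh(watts):
    """Energy used per year by a constant load, in kWh."""
    return watts * HOURS_PER_YEAR / 1000


def annual_cost(watts, price_per_kwh):
    """Yearly cost of that load at a flat price per kWh."""
    return annual_kwh(watts) * price_per_kwh


if __name__ == "__main__":
    idle_watts = 300            # idle draw of the server
    assumed_price = 0.30        # assumed price per kWh, not from the post
    print(annual_kwh(idle_watts))                            # 2628.0 kWh
    print(round(annual_cost(idle_watts, assumed_price), 2))  # 788.4
```

So roughly 2.6 MWh a year before the switch and UPS are even counted.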
I’ve had an R710 at the foot of my bed for the past 4 years and only decommissioned it a couple of months ago. I haven’t configured anything but I don’t really notice the noise. I can tell that it’s there but only when I listen for it. Different people are bothered by different sounds maybe?
WireGuard is super quick and easy to set up and use, and I'd highly recommend doing that now. I don't understand the recent obsession with Tailscale apart from bypassing CGNAT.
Tailscale is an abstraction layer built on top of WireGuard. It handles things like assigning IP addresses, sharing public keys, and building a mesh network without you having to do any manual work. People like easy solutions, which is why it's popular.
To manually build a mesh with Wireguard, every node needs to have every other node listed as a peer in their config. I've done this manually before, or you could automate it (eg using Ansible or a tool specifically for Wireguard meshes). With Tailscale, you just log in using one of their client apps, and everything just works automatically.
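To show what "every node lists every other node as a peer" means in practice, here's a minimal Python sketch that generates the [Peer] blocks for one node. The node names, addresses, endpoints and keys are placeholders I made up; the [Peer], PublicKey, AllowedIPs and Endpoint keys are the standard WireGuard config fields. It is not how Tailscale does it, just the manual version automated.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    address: str     # tunnel IP of this node inside the mesh
    endpoint: str    # publicly reachable host:port
    public_key: str  # placeholder, a real key comes from `wg pubkey`


def peer_sections(nodes, me):
    """Build the [Peer] blocks for node `me`: one per *other* node."""
    sections = []
    for node in nodes:
        if node.name == me:
            continue  # a node never lists itself as a peer
        sections.append("\n".join([
            "[Peer]",
            f"# {node.name}",
            f"PublicKey = {node.public_key}",
            f"AllowedIPs = {node.address}/32",
            f"Endpoint = {node.endpoint}",
        ]))
    return "\n\n".join(sections)


if __name__ == "__main__":
    mesh = [
        Node("alpha", "10.10.0.1", "alpha.example.org:51820", "<alpha-public-key>"),
        Node("bravo", "10.10.0.2", "bravo.example.org:51820", "<bravo-public-key>"),
        Node("charlie", "10.10.0.3", "charlie.example.org:51820", "<charlie-public-key>"),
    ]
    print(peer_sections(mesh, "alpha"))
```

With n nodes that's n x (n - 1) peer entries to keep in sync (6 for the three nodes here), which is why doing it by hand gets old fast and why tooling helps.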
I don't think there are any significant downsides. I suppose you are dependent on their infrastructure and uptime. If they ever go down, or for any reason stop offering their services, then you're out of luck. But yeah that's not significant.
The reason I want to do this is it gives me more control over the setup in case I ever wanted to customize it or the wireguard config, and also teaches me more in general, which will enable me to better debug.
I'd use Terraform and Ansible from the start. I'm slowly migrating my current setup to these tools, but that's obviously harder than starting from scratch. At least I did document everything in some way. That documentation plus state on the server is definitely enough to do this transition.
Not accidentally buy a server that takes 2.5 inch hard drives. Currently I'm using some of the ones it came with and 2 WD Red drives that I just have sitting on top of the server with SATA extension cables going down to the server.
I already have to do it every now and then, because I insisted on buying bare metal servers (at Scaleway) rather than VMs. These things die very abruptly, and I learnt the hard way how important backups and config management systems are.
If I had to redo EVERYTHING, I would use terraform to provision servers, and go with a "backup, automate and deploy" approach. Documentation would be a plus, but with the config management I feel like I don't need it anymore.
What's the point on a rented VPS? The provider can just dump the decryption key from RAM.
bare metal servers (at Scaleway) rather than VMs. These things die very abruptly
Had this happen to me with two Dedibox (Scaleway) servers over a few months (I had backups, no big deal but annoying). wtf do they do with their machines to burn through them at this rate??
I don't know if they can "just" dump the key from RAM on a bare metal server. Nevertheless, it covers my ass when they retire the server after I used it.
And yeah I've had quite a few servers die on me (usually the hard drive). At this point I'm wondering if it isn't planned obsolescence to force you into buying their new hardware every now and then. Regardless, I'm slowly moving off Scaleway as their support is now mediocre in these cases, and their cheapest servers don't support console access anymore, which means you're bound to using their distro.
Terraform is the only missing brick in my case, but that's also because I still rent real hardware :)
I'm not fond of my backup system tho, it works, but it's not included in the automated configuration of each service, which is not ideal IMO.
That's a pretty good question: Since I am new-ish to the self-hosting realm, I don't think I would have replaced my consumer router with the Dell OptiPlex 7050 that I decided on. Of course this does make things very secure considering my router is powered by OpenBSD. Originally, I was just participating in DN42 which is one giant VPN semi-mesh network. Out of that hatched the idea to yank stuff out of the cloud. Instead, I would have put the money towards building a dedicated server instead of using my desktop as a server. At the time I didn't realize how cheap older Xeon processors are. I could have cobbled together a powerhouse multi-core, multi-threaded Proxmox or xcp-ng server for maybe around 500-600 bucks. Oh well, lesson learned.
I would’ve gone with a less powerful NAS and got a separate unit for compute. I got a Synology NAS with a decent amount of compute so I could run all my stuff on the NAS, and the proprietary locked-down OS drives me a bit nuts. It causes all sorts of issues. If I had a separate compute box I could just be running some flavor of Linux, probably Ubuntu, and have things behave much more nicely.
I'd plan out what machines do what according to their drive sizes rather than finding out the hard way that one of them only has a few GB spare that I used as a mail server. Certainly document what I have going, if my machine Francesco explodes one day it'll take months to remember what was actually running on it.
I'd also not risk years of data on a single SSD that just stopped functioning for my "NAS" (it's not really a true NAS, just a shitty drive with a terabyte) and have a better backup plan.
I have ended up with 6x 2TB disks, so if I was starting again I'd go 2x10TB and use an IT mode HBA and software RAID 1. I'd also replace my 2x Netgear Switches and 1x basic smart TP-Link switch and go full TP-Link Omada for switching with POE ports on 2 of them - I have an Omada WAP and it's very good. Otherwise I'm pretty happy.
I have things scattered around different machines (a hangover from my previous network configuration that was running off two separate routers) so I’d probably look to have everything on one machine.
Also I kind of rushed setting up my Dell server and I never really paid any attention to how it was set up for RAID. I also currently have everything running on separate VMs rather than in containers.
I may at some point copy the important stuff off my server and set it up from scratch.
I may also move from using a load balancer to manage incoming connections to doing it via Cloudflare Tunnels.
The thing is there’s always something to tinker with and I’ve learnt a lot building my little home lab. There’s always something new to play around with and learn.
Is my setup optimal? Hell no. Does it work? Yep. 🙂
To be honest, nothing.
Running my home server on a NUC with Proxmox and an 8-bay Synology NAS (though I'm glad that I went with 8 bays back then!).
As a router I have OPNsense running on a low-powered mini PC.
All in all I couldn't wish for more (low power, high performance, easy to maintain) for my use case, but I'll soon need a storage and RAM upgrade on the Proxmox server.
I recently did this for the second time. Started on FreeNAS, switched to TrueNAS Scale when it released and just switched to Debian. Scale was too reliant on TrueCharts which would break and require a fresh install every couple of months. I should've just started with Debian in the first place.
Getting a better rack. My 60cm deep rack with a bunch of rack shelves and no cable management is not very pretty and moving servers around is pretty hard.
Hardwarewise I'm mostly fine with it, although I would use a platform with IPMI instead of AM4 for my hypervisor.
The only real pain point I have is my hard drive layout. I've got a bunch of different drive sizes that are hard to expand on without wasting space or spending a ton.
I would go smaller with lower power hardware. I currently have Proxmox running on an R530 for my VMs, plus an external NAS for all my storage. I feel like I could run a few 7050 Micros together with Proxmox and downsize my NAS to use fewer but higher-density disks.
Also, having a 42U rack makes me want to fill it up with UPSes and lots of backup options that could be simplified if I took the time to not frankenstein my solutions in there. But, here we are...
I built a compact NAS. While it's enough for the drives I need, even for upgrades, I only have 1 PCIe x4 slot, which is becoming a bit limiting. I didn't think I'd have a need for either a tape drive or a graphics card, and I have some things I want to do that require both. Well, I can only do one unless I get a different motherboard and case. Which means I'm basically doing a new build, and I don't want to do either of the projects I had in mind enough to bother with that.
Use actual NAS drives. Do not use shucked external drives; they are cheaper for a reason and not meant for 24/7 use. Though I guess they did get me through a couple of years, and hard drive prices seem to keep falling.