For this new year, I’d like to learn the skills necessary to self host. Specifically, I would like to eventually be able to self host Nextcloud, Jellyfin and possibly my email server too.
I have a basic understanding of Python and Kotlin. Now I’m in the process of learning Linux through a virtual machine because I know Linux is better suited for self hosting.
Should I stick with Python? Or is JavaScript (or maybe Ruby) better suited for that purpose? I’m more than happy to learn a new language, but I’m unsure on which is better suited.
And if you could start again in your self hosting journey, what would you do differently? :)
EDIT: I wasn’t expecting all these wonderful replies. You’re all very kind people to share so much with me :)
The consensus seems to be that hosting your own email server might be a lot, so I might leave that as a future project. But for Nextcloud and Jellyfin I saw a lot of great tips! I forgot to mention that ideally I would like to have Nextcloud available for multiple users (i.e. family members), so indeed learning some basic networking/firewalling seems the bare minimum.
I also promise that I will carefully read the manuals!
Patience, most of all.
Also, backups and notes. The solution you use to host might take care of the backups. For example, I use Unraid, so if any drive fails the system can simulate the data on that drive until I can get it shut down to replace it, and then recreate the data on the new drive.
As for notes, those are important so that you can always know what you’ve done, and what you need to do. That way, if you ever have to do it again, say if you’re setting up another server or replacing one that failed, you know the steps you took to get it set up exactly how you like. It’s also handy because you’ll be doing things like assigning services to ports, and you’ll probably at some point want to know what services are on what ports without going through and checking each one. Things like that are handy things to stick in notes.
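To make the port-notes idea concrete, here is a minimal sketch in Python, assuming you keep the notes as a plain service-to-port mapping. The service names and port numbers are made-up examples (8096 happens to be Jellyfin's default HTTP port), and the helper names are hypothetical.

```python
from collections import defaultdict

# Hypothetical notes: service name -> host port. Example values only.
SERVICE_PORTS = {
    "nextcloud": 8080,
    "jellyfin": 8096,
    "pihole-admin": 8081,
}


def find_port_conflicts(service_ports: dict[str, int]) -> dict[int, list[str]]:
    """Return every port that is claimed by more than one service."""
    by_port: dict[int, list[str]] = defaultdict(list)
    for service, port in service_ports.items():
        by_port[port].append(service)
    return {port: sorted(names) for port, names in by_port.items() if len(names) > 1}


def format_port_table(service_ports: dict[str, int]) -> str:
    """Render the notes as plain text, sorted by port number."""
    rows = sorted(service_ports.items(), key=lambda item: item[1])
    return "\n".join(f"{port:>5}  {service}" for service, port in rows)
```

Calling `print(format_port_table(SERVICE_PORTS))` gives a quick overview, and `find_port_conflicts(SERVICE_PORTS)` returns an empty dict when nothing clashes. A text file or wiki page does the same job; the point is only that the list lives in one place.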
Other than that, you don’t need a lot of skills to set something like a home server up. You just need to read the documentation for each service you’re planning to use, and get familiar with how it works.
Lots of people have been talking about products and tools: Docker, Tailscale, Cloudflare, Proxmox, etc. These are important, but will likely come and go on a long enough timescale.
In terms of actual skills, there are two that will dramatically decrease your headaches: documentation and backup planning. The problem with developing those skills is, to my knowledge, they’ve only ever been obtained through suffering. Trying to remember how to rebuild something when you built it 6 months ago is futile. Trying to recover borked data is brutal. There’s no fail-safe that you haven’t created, and there’s no history that you haven’t written. Fortunately, these are also the most transferable skills.
My advice is, jump in. Don’t hesitate. The chops in docker/linux/networking will come with use and familiarity. If it looks cool, do it. Make mistakes. You will rapidly realise what the problems with your set up are. You will gain knowledge in leaps and bounds from breaking a thing vs learning by rote or lesson. Reframe the headaches as a feature, not a bug - they’re highlighting holes in your understanding. They signpost the way to being a better tech, and a more stable production environment.
The greatest bit about self hosting for me is planning the next great leap forward, making it better, cleaner, more robust. Growing the confidence in your abilities to create a system you can trust. Honing your skills and toolset is the entirety of the exercise, so jump in, and don’t focus on any one thing to master or practice beforehand!
Networking is way more important than pretty much anything else. TCP/IP and http are going to stay for quite a while.
You don’t need to be a programmer to selfhost.
The most important “skills” to have if you want to selfhost imo are:
- Basic Networking knowledge
- Basic Linux knowledge
- Basic docker/docker compose knowledge
But I’d say to not get lost in the papers and just jump right in. Imo, the best way to learn how to selfhost is to just… Do it. Most everything is free and fairly well documented
Learning Linux is a great start.
Learning any coding language will help you understand a bit more about how the programs work; however, there isn’t much need to actually learn a specific language unless you plan to add custom programs or scripts.
The general advice for email is don’t. It’s very risky to host and it’s a big target for spam. Plus there are challenges getting the big companies to trust your domain.
However hosting things behind a VPN (or locally on your home network) can let you learn a lot about networking and firewalls without exposing yourself to much risk.
I have no direct experience with Nextcloud, but I understand it can be hosted on Linux: you can buy a Synology NAS and run it on that, or use something like TrueNAS.
Personally my setup is on one physical server so I use Proxmox which lets me run 2 different Linux servers and trueNAS on one single computer through virtual machines. I like it because it lets me tinker with different stuff like home assistant and it won’t affect say my adblocker/VPN/reverse proxy. I also use Docker to run multiple services on one virtual machine without compatibility issues. If I started again, I’d probably have gotten bigger drives or invested in SSDs. My NAS is hard drives because of cost but it’s definitely hitting a limit when I need to pull a bunch of files. Super happy with wireguard-easy for VPN. I started with a proprietary version of openVPN on Oracle Linux and that was a mistake.
Docker really. If something goes bad, trash the container and start again without losing your actual data.
Why are you wanting to use python for self hosting? Python is a programming and scripting language.
There are two big things I would focus on. The first thing is networking, the OSI model and http basics. The other thing I would look into is Linux containers. If you can get both of these you are golden. Learn how to use and write docker compose files and then look into building your own containers with Dockerfiles. You don’t really need to build your own containers but it is good for learning.
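For a feel of what "http basics" means in practice, here is a small sketch in Python that starts a throwaway web server on localhost and then talks to it over a raw TCP socket, so the request and response text are visible. It uses only the standard library; `HelloHandler`, `raw_http_get` and `demo` are names made up for this example, and real services add TLS, redirects and more on top.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Tiny stand-in for a self-hosted service."""

    def do_GET(self):
        body = b"hello from a self-hosted service\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def raw_http_get(host: str, port: int, path: str = "/") -> tuple[str, bytes]:
    """Send a hand-written GET request and return (status line, body)."""
    request = f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(request.encode("ascii"))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    response = b"".join(chunks)
    # Headers and body are separated by one blank line.
    head, _, body = response.partition(b"\r\n\r\n")
    status_line = head.split(b"\r\n", 1)[0].decode("ascii")
    return status_line, body


def demo() -> tuple[str, bytes]:
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: OS picks a free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        return raw_http_get("127.0.0.1", server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()
```

`demo()` returns the status line and the body. The takeaway is that HTTP is just structured text over a TCP connection to an IP address and port, which is why ports, firewalls and reverse proxies keep coming up in this thread.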
If you have a VM, there is no need for docker. Start by installing ssh. Enable public key auth. Disable password authentication. Set up fail2ban with ssh. Set up ufw. Set up nextcloud. Avoid hosting your own mail, that’s another level of complexity. If you really need it, try mailcow.
If you have all that and didn’t touch a GUI on your way, you’re good to go.
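As a rough illustration of the "enable public key auth, disable password authentication" step, here is a Python sketch that checks the text of an `sshd_config` file for those two settings. `PubkeyAuthentication` and `PasswordAuthentication` are real sshd keywords; the parser is deliberately simplified and is an assumption of this example (it ignores `Match` blocks and `Include` files), so treat it as a sanity check, not an audit tool.

```python
# Settings recommended in the comment above, stored lowercase because
# sshd_config keywords are case-insensitive.
EXPECTED = {
    "pubkeyauthentication": "yes",
    "passwordauthentication": "no",
}


def parse_sshd_config(text: str) -> dict[str, str]:
    """Parse 'Keyword value' lines; sshd uses the first value it sees for a keyword."""
    settings: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        key, value = parts[0].lower(), parts[1].strip().lower()
        settings.setdefault(key, value)  # keep the first occurrence only
    return settings


def audit(text: str) -> list[str]:
    """Return a message for every expected setting that is not set explicitly as wanted."""
    settings = parse_sshd_config(text)
    problems = []
    for key, wanted in EXPECTED.items():
        actual = settings.get(key)
        if actual != wanted:
            problems.append(f"{key}: expected {wanted!r}, found {actual!r}")
    return problems
```

You would feed it the contents of `/etc/ssh/sshd_config`. A setting reported as `None` just means it is not written out explicitly, so the built-in default applies. On a real machine, `sshd -T` prints the effective configuration and is the more reliable check.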
I would not run anything outside of docker honestly. Docker is so much easier to setup and maintain.
Documentation has been mentioned already, what I’d add to that is planning.
Start with a list of high-level objectives, as in “Need a way to save notes, ideas, documents, between multiple systems, including mobile devices”.
Then break that down to high-level requirements such as “Implement Joplin, and a sync solution”.
Those high-level requirements then spawn system requirements, such as Joplin needs X disk space, user accounts, etc.
Each of those branches out to technical requirements, which are single-line, single-task descriptions (you can skip this, it’s a nice-to-have):
“Create folder Joplin on server A”
“Set folder permissions XYZ on Joplin folder”
Think of it all as a tree, starting from your objectives. If you document it like this first, you won’t find yourself doing something mid-build without remembering why, or making decisions on the fly that conflict with other objectives.
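If it helps to see the tree written down, here is one possible way to sketch it in Python, reusing the Joplin example from above. The `Node` class and the "Objective/Requirement/System/Task" prefixes are just one way to label the levels, not a standard.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    title: str
    children: list["Node"] = field(default_factory=list)


def render(node: Node, depth: int = 0) -> list[str]:
    """Flatten the tree into indented lines, one per node."""
    lines = ["  " * depth + "- " + node.title]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


plan = Node(
    "Objective: save notes, ideas, documents between multiple systems, including mobile devices",
    [
        Node(
            "Requirement: implement Joplin, and a sync solution",
            [
                Node(
                    "System: Joplin needs X disk space, user accounts, etc.",
                    [
                        Node("Task: create folder Joplin on server A"),
                        Node("Task: set folder permissions XYZ on Joplin folder"),
                    ],
                )
            ],
        )
    ],
)
```

`print("\n".join(render(plan)))` prints the plan as a nested list you can paste into your notes. A plain outline in a text file works just as well; the structure matters more than the tool.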
This is really smart actually
Enough focus to read documentation.
That’s really it. If your purpose is just self hosting, learning bash could also be helpful. And yeah, Linux would be a great choice.
But mostly, if you want to self host an instance of Nextcloud correctly and without having to deal with too many unexpected things, you have to read the documentation and not rush. Most self hosted stuff isn’t “install and use”, because you’ll be your own server manager, and everything requires attention to be managed.
Docker or not docker you will have to deal with configuration, settings, requirements and updates.
So understanding how to read the docs/search and open github issues and taking time to read everything would be the most important skill for me.
Also writing down what you are doing would indeed be helpful too, in order not to lose track of what you’re doing on your server. (Check out Ansible).
Most apps out there simply need you to know about permissions, systemctl services and package managers.
Try to always find a specific package for your distro for everything you install (e.g. .deb for Debian), and have strategies for when this is not possible (e.g. using a Python venv when installing Python programs).
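For the venv strategy, the usual route is `python3 -m venv <dir>` on the command line. The same thing can be sketched from Python with the standard library's `venv` module, as below. `create_app_env` is a hypothetical helper name, and where you put the environment is up to you.

```python
import os
import venv
from pathlib import Path


def create_app_env(env_dir: Path, with_pip: bool = True) -> Path:
    """Create an isolated environment and return the path to its interpreter."""
    venv.create(env_dir, with_pip=with_pip)
    # The interpreter lives in a different subfolder on Windows.
    if os.name == "nt":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"
```

Running `<returned path> -m pip install <package>` then installs into that environment only, leaving the distro's own Python packages untouched. On Debian and Ubuntu, `with_pip=True` generally needs the `python3-venv` package installed first.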
Ansible will be really hard without Linux knowledge
Maybe do that later
If you want to program something, the closest you’re gonna get to programming is Ansible and Bash scripts.
You might want to get self hosting hardware like Synology or the like if you’re not ready to dig.
Otherwise here’s some things you need to know:
- Docker
  - Easy, consistent deployment of services in their own environments. Think a VM but with almost no overhead.
- Docker Compose
  - Run docker containers with consistent configuration in files.
  - Connect various containers to each other on the same or different networks.
  - Get multiple containers to start together and talk to each other.
- Systemd
  - Manage any service on Linux. If anything needs to start on boot, restart when crashed, start on timer, you want Systemd.
  - You can manage your docker compose containers lifecycle via Systemd.
- NGINX/Apache/Caddy
  - A web server for reverse proxy. You’d probably need one at some point, especially if you want HTTPS. Your services get hidden behind it.
- ZFS
  - Reliable redundant storage. You’ll need storage. Use ZFS with 2-disk redundancy.
  - Supports automatic snapshots for recovering from oopsies. E.g. deleted something or some software shat on your data.
  - Can use recertified disks from serverpartsdeals.
  - Can use USB disks or USB box with multiple disks. If you end up going the USB route, ask me for tested hardware.
- Backup system
  - Something to do backup. There are many options.
- Ansible
  - If you want to write code that describes your services and make them happen, you want Ansible. You write code (well YAML) and Ansible installs things, writes config files, sets up Systemd services, restarts things. It can be convenient especially if you have a lot of stuff and you want to be able to see all of your infrastructure in code in one place and be able to version it.
- Prometheus
  - Monitoring your stuff. Is my backup service running? If not send me an email.
Oh and use Debian or Ubuntu LTS.
Great summary!
Why Debian or Ubuntu? (I have my own thoughts, but it would be useful to show even high-level reasons why they’re preferred).
Re: Backup - Backblaze has a great writeup on backup approach today. I’m a fan of cloud being part of the mix (I use a combo of local replication and cloud, to mitigate different risks). Getting people to include backup from the start will help them long-term, so great you included it!
Predictable cadence, stable operation, timely updates, huge community and therefore documentation. You can get up to 5 years from an LTS release of Debian or Ubuntu. With Ubuntu LTS and Ubuntu Pro (free) you could theoretically run a machine without upgrading for 10 years. If you run workloads in containers, it doesn’t matter how old the host OS is. As long as it’s getting security patches, you can keep on trucking.
Damn, 5 years from LTS? That’s impressive
- Docker
Learn how to properly back up your data in case you nuke something you shouldn’t
And regularly check them. I just found out the hard way this last week that my backups haven’t been running for a few weeks …
Yep.
I have friends in the SMB space, one thing they do is a regular backup verification (quarterly). At that frequency, restoring even a few files (especially to a new VM), is very indicative, especially if it’s a large dataset (e.g. Quickbooks).
In Enterprise, we do all sorts of validation, depending on the system. Some is performed as part of Data Center operations, some is by IT (those are separate things), some by Business Unit management and their IT counterparts.
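A restore test like the one described can be partly automated. Here is a minimal sketch in Python, assuming you have restored a sample of files into a scratch directory and still have the originals to compare against. The function names are made up for this example, and files that legitimately changed since the backup ran will show up as differences.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_restore(original_dir: Path, restored_dir: Path) -> list[str]:
    """Return relative paths that are missing or differ in the restored copy."""
    problems = []
    for original in sorted(p for p in original_dir.rglob("*") if p.is_file()):
        relative = original.relative_to(original_dir)
        restored = restored_dir / relative
        if not restored.is_file():
            problems.append(f"missing: {relative.as_posix()}")
        elif sha256_of(original) != sha256_of(restored):
            problems.append(f"differs: {relative.as_posix()}")
    return problems
```

An empty list from `compare_restore(...)` means every original file was found in the restore with identical contents. It does not prove the application can use the data, so opening a few restored files by hand is still worthwhile.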
Unfortunately, that wouldn’t have done anything. Because I did that in December and they stopped running like 2 weeks after my verification. I would have caught it on my next scheduled validation, but that doesn’t help me now 😕
I mean, it still helps right? It limits your losses to X weeks instead of X months or, I hate to say it, X years.
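To shrink that window further, a small scheduled check can flag backups that silently stopped between manual verifications. Below is a sketch in Python that looks at the newest file in a backup directory. The 36-hour threshold and the function names are assumptions for illustration, and how you get alerted (email, a monitoring tool) is left out.

```python
import time
from pathlib import Path
from typing import Optional


def newest_backup_age_hours(backup_dir: Path, now: Optional[float] = None) -> Optional[float]:
    """Age in hours of the most recently modified file, or None if there are no files."""
    mtimes = [p.stat().st_mtime for p in backup_dir.rglob("*") if p.is_file()]
    if not mtimes:
        return None
    current = time.time() if now is None else now
    return (current - max(mtimes)) / 3600


def backups_are_stale(
    backup_dir: Path, max_age_hours: float = 36.0, now: Optional[float] = None
) -> bool:
    """True if there are no backup files at all or the newest one is too old."""
    age = newest_backup_age_hours(backup_dir, now)
    return age is None or age > max_age_hours
```

Run from cron or a systemd timer, `backups_are_stale(Path("/path/to/backups"))` returning `True` is the signal to go look. This only proves that files are still being written, not that they restore correctly, so it complements a periodic restore test and does not replace it.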
No special knowledge needed except the very basic ability to understand and run commands from documentation.
Setting up jellyfin, I used docker on debian, and an old Quadro card. What could possibly go wrong?
Turns out that week the Nvidia drivers got a faulty update pushed to debian stable and caused an error with getting the GPU to work in any container. I could either wait a week or pull the simple fix from testing. So impatiently I pulled it from testing.
Why didn’t you do a rollback?