Showing posts for tag "linux"

Homelab Rework: Phase 3 - TrueNAS Core to Scale

Sat Mar 02 14:00:08 EST 2024

  1. Planning a Homelab Rework
  2. Homelab Rework: Phase 1
  3. Homelab Rework: Phase 2
  4. Homelab Rework: Phase 3 - TrueNAS Core to Scale

When I last talked about the ragtag fleet of computers I now generously call a "homelab", I had converted my gaming/VM machine back from Proxmox to Windows, where it remains (successfully) to this day.

For a while, though, I've been eyeing converting my NAS from TrueNAS Core to Scale. While I really like FreeBSD technically and philosophically, running Linux was very appealing for a number of reasons. Still, it was a high-risk operation, even though the actual process of migration looked almost impossibly easy. For some reason, I decided to take the plunge this week.

The Setup

Before going into the actual process, I'll describe the setup a bit. The machine in question is a Mac Pro 1,1: two Xeon 5150s, four traditional HDDs for storage, and a handful of M.2 drives. The machine itself is far, far too old to have NVMe on the motherboard, but it does have PCIe, so I got a couple adapter cards. The boot volume is a SATA M.2 disk on one of them, while I have some actual NVMe ones serving as cache/log devices in the ZFS pool. Also, though everything says that the maximum RAM capacity is 32 GB, I actually have 64 in there and it's worked perfectly.

It's a bit of a weird beast this way, but those old Mac Pros were built to last, and it's holding up.

Also, if you're not familiar with TrueNAS and its different variants, it's worth a bit of explanation. TrueNAS Core (née FreeNAS) is a FreeBSD-based NAS-focused OS. You primarily interact with it via a web-based GUI and its various features heavily revolve around the use of ZFS, while its app system uses FreeBSD jails and its VM system uses bhyve. TrueNAS Scale is a related system, but based on Debian Linux instead of FreeBSD. It still uses ZFS, and its GUI is similar to Core, but it implements its apps and VMs differently (more on this in a bit). For NAS/file-share uses, there's actually less of a difference than you might think based on their different underlying OSes, but the distinctions come into play once you go beyond the basics.

The Conversion

If anything, the above-linked documentation overstates the complexity of the operation. I didn't even need to go the "manual update" route: I went to the Update panel, switched from the TrueNAS Core train to the current non-beta TrueNAS Scale one, hit Update, and let it go. It took a long time, presumably due to the age of the machine, but it did its job and came back up on its own.

Well, mostly: for some reason, the actual data ZFS pool was sort of half-detached. The OS knew it was supposed to have a pool by its name, but didn't match it up to the existing disks. To fix this, I deleted the configuration for the pool (but did not delete the connected service configuration) and then went to Import Pool, where the real one existed. Once it was imported, everything lined back up without further issue.

Scale being basically a completely different OS, there are a number of features that Core supports but it doesn't. Of that list, the only one I was using was the plugin/jail system, but I had whittled my use down to just Postgres (containing only discardable dev data) and Plex. These are both readily available in Scale's app system, and it was quick enough to get Plex set back up with the same library data.

Apps

As I mentioned, TrueNAS Core uses a custom-built "plugin" system sitting on top of the venerable FreeBSD jail capabilities. Those jails are similar in concept to things like Docker containers, and work very similarly in practice to the Linux Containers system I experienced with Proxmox.

TrueNAS Scale, for its part, uses Kubernetes, specifically by way of K3s, and provides its own convenient UI on top of it. Good thing it does provide this UI, too, since Kubernetes is a whole freaking thing, and I've up until this point stayed away from learning it. I guess my time has come, though. Kubernetes is distinct from Docker - while older versions used Docker as a runtime of sorts, this was always an implementation detail, and the system in use in current TrueNAS Scale is containerd.

Setting aside the conceptual complexity of Kubernetes, this distinction from Core is handy: while not being Docker, Kubernetes can consume Docker-compatible images and run them, and that ecosystem is huge. Additionally, while TrueNAS ships with a set of common app "charts" (Plex included), there's a community project named TrueCharts that adds definitions for tons and tons more.

Domino

That brings me to our beloved Domino. I had actually kind of gotten Domino running in a jail on TrueNAS Core, but it was much more an exercise in seeing if I could do it than anything useful: the installer didn't run, so I had to copy an installation from elsewhere, and the JVM wouldn't even load up without crashing. Neat to see, but I didn't keep it around.

The prospect on Scale is better, though. For one, it's actually Linux and thus doesn't need a binary-compatibility shim like FreeBSD has, and the container runtime meant I could presumably just use the normal image-building process. I could also run it in a VM, since the Linux hypervisor works on this machine while bhyve did not, but I figured I'd give the container path a shot.

Before I go any further, I'll give a huge caveat: while this works better than running it on FreeBSD, I wouldn't recommend actually doing what I've done for production. It'll presumably do what I want it to do here (be a local replica of all of my DBs without requiring a distinct VM), but it's not ideal. For one, Domino plus Kubernetes is a weird mix: Kubernetes is all about building up and tearing down a swarm of containers dynamically, while Domino is much more of a single-server sort of thing. It works, certainly, but Kubernetes is always there to tempt you into doing things weirdly. Also, I know almost nothing about Kubernetes anyway, so don't take anything I say here as advice. It's good fun, though.

That said, on to the specifics!

Deploying the Container

The way the TrueNAS app UI works, you can go to "Custom App" and configure your container by referencing a Docker image from a repository. I don't normally actually host a Docker registry, instead manually loading the image into the runtime. It might be possible to do that here, but I took the opportunity to set up a quick local-network-only one on my other machine, both because I figured it'd be neat to learn to do that and because I forgot about the Harbor-hosted option on that link.
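
If you want to do the same, the stock registry image is enough for a quick LAN-only setup. A minimal sketch, assuming Docker on the other machine, with the host name, port, and image names as placeholders:

# run a bare-bones registry on port 5000, storing its data in a local directory
docker run -d --restart unless-stopped --name registry -p 5000:5000 -v /srv/registry:/var/lib/registry registry:2
# re-tag the locally-built image to point at that registry, then push it
docker tag domino:12.0.2 othermachine:5000/domino:12.0.2
docker push othermachine:5000/domino:12.0.2

Since it's plain HTTP, the machine doing the pushing also needs the registry listed under "insecure-registries" in its Docker daemon settings.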

Since the local registry used HTTP and there's nowhere in the TrueNAS UI to tell it to not use HTTPS, I followed this suggestion to configure K3s to explicitly map it. With that in place, I was able to start pulling images from my registry.
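
For reference, on stock K3s that mapping lives in /etc/rancher/k3s/registries.yaml and looks something like this; the host name and port are placeholders for my registry, and I'm not sure how well hand-edits to this file survive TrueNAS updates, so treat it as a sketch:

mirrors:
  "othermachine:5000":
    endpoint:
      - "http://othermachine:5000"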

The Domino Version

One quirk I quickly ran into was that I can't use Domino 14 on here. The reason for this isn't an OS problem, but rather a hardware limitation: the new glibc that Domino 14 uses requires the "x86-64-v2" microarchitecture level and the Xeon 5150 just doesn't have that by dint of pre-dating it by two years.
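
If you're curious whether your own hardware clears that bar, a couple of quick checks will tell you; the loader path here is the usual Debian one, and x86-64-v2 roughly corresponds to the SSE4.2/POPCNT era of CPU features:

# list the v2-relevant flags the CPU reports; sparse output means no dice
grep -m1 '^flags' /proc/cpuinfo | tr ' ' '\n' | grep -E '^(ssse3|sse4_1|sse4_2|popcnt|cx16)$'
# on a new-enough glibc (2.33+), the dynamic loader reports which levels it supports
/lib64/ld-linux-x86-64.so.2 --help | grep 'x86-64-v'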

That's fine, though: I really just want this to house data, not app development, and 12.0.2 will do that with aplomb.

Volume Configuration

The way I usually set up a Domino container when using e.g. Docker Compose is that I define a handful of volumes to go with it: for the normal data dir, for DAOS, for the Transaction Log, and so forth. This is a bit of an affectation, I suppose, since I could also just define one volume for everything and it's not like I actually host these volumes elsewhere, but... I don't know, I like it. It keeps things disciplined.
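
Concretely, the Compose-side version of that habit looks something like this; the image name and container paths are just illustrative of the shape, not a canonical layout:

services:
  domino:
    image: domino:12.0.2
    volumes:
      - domino_data:/local/notesdata
      - domino_daos:/local/daos
      - domino_translog:/local/translog

volumes:
  domino_data:
  domino_daos:
  domino_translog: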

Anyway, I originally set this up equivalently in the Custom App UI in TrueNAS, creating a "Volume" entry for each of these. However, I found that, for some reason, Domino didn't have write access to the newly-created volumes. Maybe this is due to the uid the container is built to use or something, but I worked around it by using Host Path Volumes instead. The net effect is the same, since they're in the same ZFS pool, and this actually makes it easier to peek at the data anyway, since it can be in the SMB share.
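
Granting the container write access to those host paths just meant a chown from a shell on the TrueNAS host; the dataset path here is an example, and 1000 is the UID the container runs as, which comes up again below:

# give the container's UID/GID ownership of the host-path datasets
chown -R 1000:1000 /mnt/tank/apps/domino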

Once I did that and made sure the container user could modify the data, all was well. Mostly, anyway.

Transaction Logs, ZFS, and Sector Size

Once Domino got going, I noticed a minor problem: it would crash quickly, every time. Specifically, it crashed when it started preparing the transaction log directory. I eventually remembered running into the same problem on Proxmox at one point, and it brought me back to this blog post by Ted Hardenburgh. Long story short, my ZFS pool uses 4K sectors and Domino's transaction logs can't deal with that, at least in 12.0.2 and below.
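
If you want to check whether your own pool has the same problem, the ashift property records the sector size used at creation time; the pool name here is a placeholder, and if it reports 0 ("auto-detected"), zdb can dig out the actual per-vdev value:

# ashift=12 means 4K sectors, ashift=9 means 512-byte ones
zpool get ashift tank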

This put me in a bit of a sticky spot, since the way to change this is to re-create the entire pool and I really didn't want to do that.

I came up with a workaround, though, in the form of making a little disk image and formatting it ext4. You can use a loop device to mount a file like a disk, so the process looks like this:

dd if=/dev/zero of=tlog.img bs=1G count=1
sudo /sbin/losetup --find --show tlog.img
sudo mkfs.ext4 /dev/loop0
sudo mount /dev/loop0 /mnt/tlog

That makes a 1GB disk image, formats it ext4, and mounts it as "/mnt/tlog". This process defaults to 512-byte sectors, so I made a directory within it writable by the container user (more on this shortly), configured the Domino container to map the transaction log directory to that path, and all was well.

Normally, to get this mounted at boot, you'd likely put an entry in fstab. However, TrueNAS assumes control over system configuration files like that, and you shouldn't edit them directly. Instead, what I did was write a small script that does the losetup and mount lines above and added an entry in "System Settings" - "Advanced" - "Init/Shutdown Scripts" to run this at pre-init.
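
The script itself doesn't need to be anything fancy; a sketch of the shape, with the paths as examples and assuming the image file already exists:

#!/bin/sh
# attach the transaction-log image to a free loop device and mount it where the app expects it
LOOPDEV=$(/sbin/losetup --find --show /mnt/tank/apps/tlog.img)
mkdir -p /mnt/tlog
mount "$LOOPDEV" /mnt/tlog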

Networking

The next hurdle I wanted to get over was the networking side. You can map ports in apps in a similar way to what you'd do with Docker, but you have to map them to a port 9000 or above. That would be an annoying issue in general, but especially for NRPC. Fortunately, the app configuration allows you to give the container its own IP address in the "Add external Interfaces" (sic) configuration section. Since the virtual MAC address changes each time the container is deployed, I gave it a static IP address matching a reservation I carved out on my DHCP server, pointed it to the DNS server, and all was well. All of Domino's open ports are available on that IP, and it's basically like a VM in that way.
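
A quick way to confirm the dedicated IP is doing its job is to poke NRPC (port 1352) and HTTP from another machine on the network; the address here is a stand-in for whatever reservation you carve out:

nc -zv 192.168.1.80 1352
nc -zv 192.168.1.80 80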

Container User

Normally, containers in TrueNAS's app system run as the "apps" user, though this is configurable per-app. The way the Domino container launches, though, it runs as UID 1000, which is the notes user inside the container. Outside the container, on my setup, that ID maps to... my user account, jesse.

Administration-wise, that's not exactly the best! In a less "for fun" situation, I'd change the container user or look into UID mapping as I've done with Docker in the past, but honestly it's fine here. This means it's easy for me to access and edit Domino data/config files over the share, and it made the volume mapping above work without incident. As long as no admins find out about this, it can be my secret shame.

Future Uses

So, at this point, the server is doing the jobs it was doing previously, plus acting as a nice extra replica server for Domino. It's positioned well now for me to do a lot of other tinkering.

For one, it'll be a good opportunity for me to finally learn about Kubernetes, which I've been dragging my feet on. I installed the Portainer chart from TrueCharts to give me a look into the K8s layer in a way that's less abstracted than the TrueNAS UI but, for now, more familiar and comfortable than the kubectl tool.

Additionally, since the hypervisor works on here, it'll be another good location for me to store utility VMs when I need them, rather than putting everything on the Windows machine (which has half as much RAM).

I could possibly use it to host servers for various games like Terraria, though I'm a bit wary of throwing such ancient processors at the task. We'll see about that.

In general, I want to try hosting more things from home when they're non-critical, and this will definitely give me the opportunity. It's also quite fun to tinker with, and that's the most important thing.

Homelab Rework: Phase 2

Fri Sep 15 11:39:42 EDT 2023

Tags: homelab linux
  1. Planning a Homelab Rework
  2. Homelab Rework: Phase 1
  3. Homelab Rework: Phase 2
  4. Homelab Rework: Phase 3 - TrueNAS Core to Scale

CollabSphere 2023 came and went the other week, and I have some followup to do from that for sure, not the least of which being the open-sourcing of JNX, but that post will have to wait a bit longer. For now, I'm here to talk about my home servers.

When last I left the topic, I had installed Proxmox as my VM host of choice to replace Windows Server 2019, migrated my existing Hyper-V VMs, and set up a Windows 11 VM with PCIe passthrough for the video card. There were some hoops to jump through, but I got everything working.

Now, though, I've gone back on all of that, or close to it. Why?

Why Did I Go Back On All Of That, Or Close To It?

The core trouble that has dogged me for the last few months is performance. While the host I'm using isn't a top-of-the-line powerhouse (namely, it's using an i7-8700K and generally related-era consumer parts), things were running worse than I was sure they should. My backup-runner Linux VM, which should have been happy as a clam with a Linux host, suffered to the extent that it never actually successfully ran a backup. My Windows dev VMs worked fine, but would periodically just drag when trying to redraw window widgets in a way they hadn't previously. And, most importantly of all, Baldur's Gate 3 exhibited bizarre load-speed problems: the actual graphical performance was great even on the highest settings, but I'd get lags of 10 seconds or so loading assets, much worse than the initial performance trouble reported by others in the release version of the game.

Some of this I chalked up to lack of optimized settings, like how the migrated VMs were using "compatibility" settings instead of all the finest-tuned VirtIO stuff. However, my gaming VM was decked out fully: VirtIO network and disk, highest-capability UEFI BIOS, and so forth. They were all sitting on ZFS across purely-NVMe drives, so they shouldn't have been lacking for disk speed. I tried a bunch of things, like dedicating a SATA SSD to the VM, or passing through a USB 3 SSD, but the result was always the same. Between game updates and re-making the VM in a "lesser" way with Windows 10, I ended up getting okay performance, but the speed of the other VMs bothered me.

Now, I don't want to throw Proxmox specifically or KVM generally under the bus here. It's possible that I could have improved this situation - perhaps, despite my investigations and little tweaks, I had things configured poorly. And, again, this hardware isn't built for the purpose, but instead I was cramming server-type behavior into "prosumer"-at-best hardware. Still, Hyper-V didn't have this trouble, so it nagged at me.

But Also Containers

As I mentioned towards the bottom of the previous post, Proxmox natively uses Linux Containers and not Docker, but I wanted to see what I could do about that. I tried a few things, installing Docker inside an LXC container as well as on the main host OS, but ran into odd filesystem-related problems within Dockerfiles. I found ways to work around those by doing things like deleting just files instead of directory trees, but I didn't want to go and change all my project Dockerfiles just to account for an odd local system. I had previously used my backup-manager VM for Docker, but that VM's performance trouble pushed me to make a new secondary one. That ended up expanding the overhead and RAM consumption, which defeated some of the potential benefits.

Little Things

Beyond that, there were little things that got to me. Though Proxmox is free, it still gives a little nag screen about being unlicensed the first time you visit the web UI after each reboot, which is a mild annoyance. Additionally, it doesn't have built-in support for suspending/resuming active VMs when you reboot the machine, as Hyper-V does - I found some people recommending systemd scripts for this, but that would introduce little timing problems that wouldn't arise if it was a standard capability.

There also ended up being a lot that was done solely via CLI and not the GUI. To an extent, that's fine - I'm good with using the CLI for quite a bit - but it did defeat some of the benefit of having a nice front-end app when I would regularly drop down to the CLI anyway for disk import/export, some device assignments, and so forth. That's not a bug or anything, but it made the experience feel a bit rickety.

The New Setup

So, in the end, I went crawling back to Windows and Hyper-V. I installed Windows 11 Pro and set up the NVMe drives in Storage Spaces... I was a little peeved that I couldn't use ReFS, since apparently "Pro" and "Pro for Workstations" are two separate versions of Windows somehow, but NTFS should still technically do the job (I'll just have to make sure my backup routine to my TrueNAS server is good). After I bashed at it for a little while to remove all the weird stupid ads that festoon Windows nowadays, I got things into good shape.

Hyper-V remains a champ here. I loaded up my re-converted VMs and their performance is great: my backup manager is back in business and my dev VMs are speedy like they used to be.

Among the reasons why I wanted to move away from Server 2019 in the first place is that the server-with-desktop-components versions of Windows always lagged behind the client version in a number of ways, and one of them was WSL2. Now that I'm back to a client version, I was able to install that with a little Debian environment, and then configure Docker Desktop to make use of it. With some network fiddling, I got the Docker daemon listening on a local network port and usable for my Testcontainers suites. Weirdly, this means that my Windows-based setup for Docker is actually a bit more efficient than the previous Linux-based one, but I won't let that bother me.
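
The Testcontainers side of that fiddling just amounts to pointing it at the remote daemon, either with a DOCKER_HOST environment variable or a line in ~/.testcontainers.properties; the host name and port here are stand-ins for my setup:

# environment-variable route, also usable by the plain docker CLI
export DOCKER_HOST=tcp://terminus.local:2375
# or, equivalently, in ~/.testcontainers.properties
docker.host=tcp://terminus.local:2375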

As for games, well... it's native Windows. For better or for worse, that's the best way to run them, and they run great. Baldur's Gate 3 is noticeably snappier with its load times already, and everything else still runs fine.

So, overall, it kind of stings that I went back to Windows as the primary host, but I can't deny that I'm already deriving a lot of benefits from it. I'll miss some things from Proxmox, like the smooth handling of automatic mounting of network shares as opposed to Windows's schizophrenic approach, but I'm otherwise pleased with how it's working again.

Homelab Rework: Phase 1

Mon Jul 10 10:22:48 EDT 2023

Tags: homelab linux
  1. Planning a Homelab Rework
  2. Homelab Rework: Phase 1
  3. Homelab Rework: Phase 2
  4. Homelab Rework: Phase 3 - TrueNAS Core to Scale

I mentioned the other week that I've been pondering ways to rearrange the servers I have in the basement here, which I'm presumptuously calling a "homelab".

After some fits and starts getting the right parts, I took the main step over the weekend: reworking terminus, my gaming PC and work VM host. In my earlier post, my choice of OS was a bit up in the air, but I did indeed end up going with the frontrunner, Proxmox. While the other candidates had their charms, I figured it'd be the least hassle long-term to go with the VM-focused utility OS when my main goal was to run VMs. Moreover, Proxmox, while still in the category of single-vendor-run OSes, is nonetheless still open-source in a way that should be reliable.

The Base Drive Setup

terminus, being a Theseus-style evolution of the desktop PC I've had since high school, is composed generally of consumer-grade parts, so the brands don't matter too much for this purpose. The pertinent part is how I reworked the storage. Previously, I had three NVMe drives: one 256GB one for the system and then two distinct 2TB drives formatted NTFS, with one mounted in a folder of the other once that one had grown too big for its britches. It was a very ad-hoc approach, having evolved from earlier setups with other drives, and it was due for a revamp.

For this task, I ended up getting two more 2TB NVMe drives (now that they're getting cheap) and some PCIe adapters to hold them beyond the base capacity of the motherboard. After installing Proxmox on the 256GB one previously housing Windows, I decided to join the other four in a RAID-Z with ZFS, allowing for one to crap out at a time. I hit a minor hitch here: though they're all 2TB on paper, one reported itself as being some tiny sliver larger than the other three, and so the command to create the pool failed in the Proxmox GUI. Fortunately, the fix is straightforward enough: the log entry in the UI shows the command, so I copied that, added "-f" to force creation based on the smallest common size, and ran the command in the system shell. That worked just fine. This was a useful pace-setting experience too: while other utility OSes like pfSense and TrueNAS allow you to use the command line, it seems to be more of a regular part of the experience with Proxmox. That's fine, and good to know.
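
For reference, the forced version of the command was along these lines; the pool name and device paths are illustrative rather than my literal ones, and the real thing came straight from the GUI's log entry with -f tacked on:

zpool create -f nvpool raidz /dev/disk/by-id/nvme-disk1 /dev/disk/by-id/nvme-disk2 /dev/disk/by-id/nvme-disk3 /dev/disk/by-id/nvme-disk4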

Quick Note On Repositories

Proxmox, like the other commercial+open utility OSes, has its "community"-type variant for free and the "enterprise" one for money. While I may end up subscribing to the latter one day, it'd be overkill for this use for now. By default, a Proxmox installation is configured to use the enterprise update repositories, which won't work if you don't set up a license key. To get on the community train, you can configure your apt sources. Specifically, I commented out the enterprise lines in the two pre-existing files in /etc/apt/sources.list.d/ and then added my own "pve-ce.list" file with the source from the wiki:

deb http://download.proxmox.com/debian/pve bookworm pve-no-subscription

Importing Old Windows VMs

My first task was to make sure I'd be able to do work on Monday, so I set out to import the Hyper-V Windows VMs I use for Designer for a couple clients. Before destroying Windows, I had copied the .vhdx files to my NAS, so I set up a CIFS connection to that in the "Storage" section of the Proxmox GUI, basically a Proxmox-managed variant of adding an automatic mount to fstab.
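
The GUI stores this in /etc/pve/storage.cfg; if you prefer the command line, the equivalent should look roughly like this, with the storage ID, host, share, and user as placeholders and credentials/content types adjusted to taste:

pvesm add cifs nas-share --server trantor.local --share vmstore --username jesse --content images,iso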

From what I can tell, there's not a quick "just import this Hyper-V drive to my VM" process in the Proxmox GUI, so I did some searching to find the right way. In general, the tack is this:

  • Make sure you have a local storage location set to house VM Disk Images
  • Create a new VM with a Windows type in Proxmox and general settings for what you'd like
  • On the tab where you can add disks, delete the one it auto-creates and leave it empty
  • On the command line, go to the directory housing your disk image and run a command in the format qm importdisk <VMID> <imagename>.vhdx <poolname> --format qcow2. For example: qm importdisk 101 Designer.vhdx images --format qcow2
  • Back in the GUI (or the command line if you're inclined - qm is a general tool for this), go to your VM, find the imported-but-unattached drive in "Hardware", and give it an index other than 0. I set mine to be ide1, since I had told the VM in Hyper-V that it was an IDE drive
  • In "Options", find the Boot Order and add your newly-attached disk to the list
  • Download the drivers ISO to attach to your VM. Depending on how old your Windows version is, you may have to go back a bit to find one that works (say, 0.1.141 for Windows 7). Upload that ISO to a local storage location set to house ISOs and attach it to your VM
  • Boot Windows (hopefully), let it do its thing to realize its new home, and install drivers from the "CD drive" as needed

If all goes well, you should be able to boot your VM and get cracking. If it doesn't go well, uh... search around online for your symptoms. This path is about the extent of my knowledge so far.

The Windows Gaming Side

My next task was to set up a Windows VM with PCIe passthrough for my video card. This one was a potential dealbreaker if it didn't work - based on my hardware and what I read, I figured it should work, but there's always a chance that consumer-grade stuff doesn't do what it hypothetically should.

The first step here was to make a normal Windows VM without passthrough, so that I'd have a baseline to work with. I decided to take the plunge and install Windows 11, so I made sure to use a UEFI BIOS for the VM and to enable TPM support. I ran into a minor hitch in the setup process in that I had picked the "virtio" network adapter type, which Windows doesn't have driver support for in the installer unless you slipstream it in (which I didn't). Windows is extremely annoyed by not having a network connection at launch, dropping me into a "Let's connect you to a network" screen with no options and no way to skip. Fortunately, there's a workaround: type Shift+F10 to get a command prompt, then run "OOBE\BYPASSNRO", which re-launches the installer and sprouts a "skip" button on this phase. Once I got through the installer, I was able to connect the driver ISO, install everything, and have Windows be as happy as Windows ever gets. I made sure to set up remote access at this point, since I won't be able to use the Proxmox console view with the real video card.

Then, I set out to connect the real video card. The documentation covers this well, but it's still kind of a fiddly process, sending you back to the command line for most of it. The general gist is that you have to explicitly enable IOMMU in general and opt in your device specifically. As a note, I had to enable the flags that the documentation says wouldn't be necessary in recent kernel versions, so keep an eye out for that. Before more specifics, I'll say that my GRUB_CMDLINE_LINUX_DEFAULT line ended up looking like this:

GRUB_CMDLINE_LINUX_DEFAULT="quiet iommu=pt intel_iommu=on pcie_acs_override=downstream"

This enables IOMMU in general and for Intel CPUs specifically (the part noted as obsolete in the docs). I'll get to that last bit later. In short, it's an unfortunate concession to my current hardware.

Anyway, back to the process. I went through the instructions, which involved locating the vendor and device identifiers using lspci. For my card (a GeForce 3060), that ended up being "10de:2414" and "01:00.0", respectively. I made a file named /etc/modprobe.d/geforce-passthrough.conf with the following lines (doing a "belt and suspenders" approach to pass through the device and block the drivers, an artifact of troubleshooting):

options vfio-pci ids=10de:2414,01:00.0
blacklist nvidiafb
blacklist nvidia
blacklist nouveau

The host graphics are the integrated Intel graphics on the CPU, so I didn't need to worry about needing the drivers otherwise.
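
One step I'm glossing over: edits to /etc/default/grub and /etc/modprobe.d/ don't take effect until you regenerate the boot config and the initramfs, so (as the Proxmox passthrough docs walk through) the sequence before the reboot looks about like this, with the dmesg line as the post-reboot sanity check that IOMMU actually came up:

update-grub
update-initramfs -u -k all
# after the reboot
dmesg | grep -e DMAR -e IOMMU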

With this set, I was able to reboot, run lspci -nnk again, and see that the GPU was set to use "vfio-pci" as the driver, exactly as needed.

So I went to the VM config, mapped this device, launched the VM, and... everything started crapping out. The OS was still up, but the VM never started, and then no VMs could start, nor could I do anything with the ZFS drive. Looking at the pool listing, I saw that two of the NVMe drives had disappeared from the listing, which was... alarming. I hard-rebooted the system, tried the same thing, and got the same results. I started to worry that the trouble was the PCIe->NVMe adapter I got: the two missing drives were attached to the same new card, and so I thought that it could be that it doesn't work well under pressure. Still, this was odd: booting the VM was far less taxing on those drives specifically than all the work I had done copying files over and working with them, and the fact that it consistently happened when starting it made me think that wasn't related.

That led me to that mildly-unfortunate workaround above. The specific trouble is that PCIe devices are controlled in groups, and my GPU is in the same group as the afflicted PCIe->NVMe adapter. The general fix for this is to move the cards around, so that they're no longer pooled. However, I only have three PCIe ports of suitable size, two filled with NVMe adapters and one with a video card, so I'm SOL on that front.

This is where the "pcie_acs_override=downstream" kernel flag comes in. This is something that used to be a special kernel patch, but is present in the stock one nowadays. From what I gather, it's a "don't do this, but it'll work" sort of thing, tricking the kernel into treating same-grouped PCIe devices separately. I think most of the trouble comes in when multiple grouped devices are performing similar tasks (such as two identical video cards), which could lead the host and guest OSes to route conflicting commands to them. Since the two involved here are wholly distinct, it seems okay. But certainly, I don't love it, and it's something I'll look forward to doing without when it comes time to upgrade the motherboard in this thing.
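
If you're curious how things are grouped on your own machine before resorting to the override, the groupings are all visible under sysfs; a little loop like this prints each device alongside its group number:

for dev in /sys/kernel/iommu_groups/*/devices/*; do
    # the fifth path component is the group number, the last is the PCI address
    group=$(echo "$dev" | cut -d/ -f5)
    printf 'group %s: %s\n' "$group" "$(lspci -nns "${dev##*/}")"
done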

As a small note, I initially noticed some odd audio trouble when switching away from an active game to a different app within Windows. This seems to be improved by adding a dummy audio device to the VM at the Proxmox level, but that could also be a placebo.

But, hackiness aside, it works! I was able to RDP into the VM and install the Nvidia drivers normally. Then, I set up Parsec for low-latency connections, installed some games, and was able to play with basically the same performance I had when Windows was the main OS. Neat! This was one of the main goals, demoting Windows to just VM duty.

Next Steps: Linux Containers and New VMs

Now that I have my vital VMs set up, I have some more work to do for other tasks. A few of these tasks should be doable with Linux Containers, like the VM I had previously used to coordinate cloud backups. Linux Containers - the proper noun - differ from Docker in implementation and idioms, and are closer to FreeBSD jails in practice. Rather than being "immutable image base + mutable volumes" in how you use them, they're more like setting up a lightweight VM, where you pick an OS base and its contents are persistent for the container. My plan is to use this for the backup coordinator and (hopefully) for a direct-install Domino installation I use for some work.

Beyond that, I plan to set up a Linux VM for Docker use. While I could probably hypothetically install Docker on the top-level OS, that gives me the willies for a utility OS. Yes, Proxmox is basically normal old Debian with some additions, but I still figure it's best to keep the installation light on bigger-ticket modifications and, while I'm not too worried about the security implications with my type of use, I don't need to push my luck. I tinkered a little with installing Docker inside a Linux Container, but ran into exactly the sort of hurdles that any search for the concept warns you about. So, sadly, a VM will be the best bet. Fortunately, a slim little Debian VM shouldn't be too much worse than a Container, especially with performance tweaks and memory ballooning.

So, in short: so far, so good. I'll be keeping an eye on the PCIe-passthrough hackiness, and I always have an option to give up and switch to a "Windows 11 host with Hyper-V VMs" setup. Hopefully I won't have to, though, and things seem promising so far. Plus, it's all good experience, regardless.

Planning a Homelab Rework

Sun Jun 25 20:39:37 EDT 2023

  1. Planning a Homelab Rework
  2. Homelab Rework: Phase 1
  3. Homelab Rework: Phase 2
  4. Homelab Rework: Phase 3 - TrueNAS Core to Scale

Over the years, I've shifted how I use my various computers here, gaining new ones and shifting roles around. My current setup is a mix of considered choices and coincidence, but it works reasonably well. "Homelab" may be a big term for it, but I have something approaching a reasonable array:

  • nemesis, a little Netgate-made device running pfSense and serving as my router and firewall
  • trantor, a first-edition Mac Pro running TrueNAS Core
  • terminus, my gaming PC that does double duty as my Hyper-V VM host for the Windows and Linux VMs I use for work
  • nephele, a 2013 MacBook Air I recently conscripted to be a Docker host running Debian
  • tethys, my current desktop workhorse, an M2 Mac mini running macOS
  • ione, an M1 MacBook Air with macOS that does miscellaneous and travel duty

While these things work pretty well for me, there are some rough edges. tethys and ione need no changes, but others are up in the air.

The NAS

trantor's Xeon is too old to run the bhyve hypervisor used by TrueNAS Core, so I can't use it for VM duty. It can run jails, which I've historically used for some apps like Plex and Postgres, and those are good. Still, as much as I like FreeBSD, I have to admit that running Linux there would get me a bit more, such as another Domino server (I got Domino running under Linux binary compatibility, but only in a very janky and unreliable way). I'm considering side-grading it to TrueNAS Scale, but the extremely-high likelihood of trouble with that switch on such an odd, old machine gives me pause. As a NAS, it's doing just fine as-is, and I may leave it like that.

The Router

nemesis is actually perfectly fine as it is, but I'm pondering giving it more work. While Linode is still doing well for me, its recent price hike left a bad taste in my mouth, and I wouldn't mind moving some less-essential services to my own computers here. If I do that, I could either map a port through the firewall to an internal host (as I've done previously) or, as I'm considering, run HAProxy directly on it and have it serve as a reverse proxy for all machines in the house. On the one hand, it seems incorrect and maybe unwise to have a router do double-duty like this. On the other, since I'm not going to have a full-fledged DMZ or anything, it's perfectly-positioned to do the job. Assuming its little ARM processor is up to the task, anyway.

Fortunately, this one is kind of speculative anyway. While I used to have some outwardly-available services I ran from in the house, those have fallen by the wayside, so this would be for a potential future case where I do something different.

The VM Host/Gaming PC

terminus, though, is where most of my thoughts have been spent lately. It's running Windows Server 2019, an artifact of a time when I thought I'd be doing more Server-specific stuff with it. Instead, though, the only server task I use it for is Hyper-V, and that runs just the same on client Windows. The straightforward thing to do with it would be to replace Windows Server with Windows 11 - that'd put me back on a normal release train, avoid some edge cases of programs complaining about OS compatibility, and generally be a bit more straightforward. However, the only reason I have Windows as the top-level OS on there at all is for gaming needs. Moreover, it just sits in the basement, and I don't even play games directly at its console anymore - it's all done through Parsec.

That has me thinking: why don't I demote Windows to a VM itself? The processor has a built-in video chipset for basic needs, and I could use PCIe passthrough to give full control of my video card to a Windows VM. I could pick a Linux variant to be the true OS for the machine, run VMs and containers as top-level peers, and still get full speed for games inside the Windows VM.

I have a few main contenders in mind for this job.

The frontrunner is Proxmox, a purpose-built Debian variant geared towards running VMs and Linux Containers as its bread-and-butter. Though I don't need a lot of the good features here, it's tough to argue with an OS that's specifically meant for the task I'm thinking of. However, it's a little sad that the containers I generally run aren't LXC-type containers, and so I'd have to make a VM or container to then run Docker. In the latter form, there wouldn't be too much overhead to it, but it'd be kind of ugly, and it'd feel weird to do a full revamp to then immediately do one of the main tasks only in a roundabout way.

That leads to my dark-horse candidate of TrueNAS Scale. This isn't as obvious a choice, since this machine isn't intended to be a NAS, but Scale bumps Kubernetes and Docker containers to the top level, and that's exactly how I'd plan to use it. I assume that the VM management in TrueNAS isn't as fleshed out as Proxmox, but it's on a known foundation and looks plenty suitable for my needs.

Finally, I could just skip the "utility" distributions entirely and just install normal Debian. It'd certainly be up to the task, though I'd miss out on some GUI conveniences that I'd otherwise use heavily. On the other hand, that'd be a good way to force myself to learn some more fundamentals of KVM, which is the sort of thing that usually pays off down the line.

For Now

Or, of course, I could leave everything as-is! It does generally work - Hyper-V is fine as a hypervisor, and it's only sort of annoying having an extra tier in there. The main thing that will force me to do something is that I'll want to ditch Windows Server regardless, even if I just do an in-place switch to Windows 11. We'll see!