Homelab Rework: Phase 2

Sep 15, 2023, 11:39 AM

Tags: homelab linux
  1. Planning a Homelab Rework
  2. Homelab Rework: Phase 1
  3. Homelab Rework: Phase 2
  4. Homelab Rework: Phase 3 - TrueNAS Core to Scale

CollabSphere 2023 came and went the other week, and I have some follow-up to do from that for sure, not the least of which is the open-sourcing of JNX, but that post will have to wait a bit longer. For now, I'm here to talk about my home servers.

When last I left the topic, I had installed Proxmox as my VM host of choice to replace Windows Server 2019, migrated my existing Hyper-V VMs, and set up a Windows 11 VM with PCIe passthrough for the video card. There were some hoops to jump through, but I got everything working.

Now, though, I've gone back on all of that, or close to it. Why?

Why Did I Go Back On All Of That, Or Close To It?

The core trouble that has dogged me for the last few months is performance. While the host I'm using isn't a top-of-the-line powerhouse (namely, it's using an i7-8700K and generally related-era consumer parts), things were running worse than I was sure they should have been. My backup-runner Linux VM, which should have been happy as a clam with a Linux host, suffered to the extent that it never actually successfully ran a backup. My Windows dev VMs worked fine, but would periodically just drag when trying to redraw window widgets in a way they hadn't previously. And, most importantly of all, Baldur's Gate 3 exhibited bizarre load-speed problems: the actual graphical performance was great even on the highest settings, but I'd get lags of 10 seconds or so loading assets, much worse than the initial performance trouble reported by others in the release version of the game.

Some of this I chalked up to lack of optimized settings, like how the migrated VMs were using "compatibility" settings instead of all the finest-tuned VirtIO stuff. However, my gaming VM was decked out fully: VirtIO network and disk, highest-capability UEFI BIOS, and so forth. They were all sitting on ZFS across purely-NVMe drives, so they shouldn't have been lacking for disk speed. I tried a bunch of things, like dedicating a SATA SSD to the VM, or passing through a USB 3 SSD, but the result was always the same. Between game updates and re-making the VM in a "lesser" way with Windows 10, I ended up getting okay performance in the game, but the speed of the other VMs still bothered me.

Now, I don't want to throw Proxmox specifically or KVM generally under the bus here. It's possible that I could have improved this situation - perhaps, despite my investigations and little tweaks, I had things configured poorly. And, again, this hardware isn't built for the purpose, but instead I was cramming server-type behavior into "prosumer"-at-best hardware. Still, Hyper-V didn't have this trouble, so it nagged at me.

But Also Containers

As I mentioned towards the bottom of the previous post, Proxmox natively uses Linux Containers and not Docker, but I wanted to see what I could do about that. I tried a few things, installing Docker inside an LXC container as well as on the main host OS, but ran into odd filesystem-related problems within Dockerfiles. I found ways to work around those by doing things like deleting just files instead of directory trees, but I didn't want to go and change all my project Dockerfiles just to account for an odd local system. I had previously used my backup-manager VM for Docker, but that VM's performance trouble made me make a new secondary one. That ended up expanding the overhead and RAM consumption, which defeated some of the potential benefits.

Little Things

Beyond that, there were little things that got to me. Though Proxmox is free, it still gives a little nag screen about being unlicensed the first time you visit the web UI each reboot, which is a mild annoyance. Additionally, it doesn't have built-in support for suspending/resuming active VMs when you reboot the machine, as Hyper-V does - I found some people recommending systemd scripts for this, but that would introduce little timing problems that wouldn't arise if it were a standard capability.
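For illustration, here is roughly what the suspend half of such a script might look like in Python. This is only a sketch of the general idea, not the scripts those recommendations used: it assumes the usual `qm list` column layout (VMID, NAME, STATUS, ...) and VM names without spaces, and the function names are made up for the example. It only builds the `qm suspend` commands. Actually running them from a shutdown-time systemd unit is where the timing concerns come in, since the unit has to finish before the host tears down storage and networking.

```python
def running_vmids(qm_list_output: str) -> list[int]:
    """Extract the IDs of running VMs from the text printed by `qm list`.

    Assumes whitespace-separated columns with VMID first and STATUS third,
    and VM names that contain no spaces.
    """
    vmids = []
    for line in qm_list_output.splitlines():
        fields = line.split()
        # The header row is skipped because its first field is not numeric.
        if len(fields) >= 3 and fields[0].isdigit() and fields[2] == "running":
            vmids.append(int(fields[0]))
    return vmids


def suspend_commands(vmids: list[int]) -> list[list[str]]:
    """Build one `qm suspend <vmid> --todisk 1` command per VM."""
    return [["qm", "suspend", str(vmid), "--todisk", "1"] for vmid in vmids]


# Hypothetical sample output, for demonstration only.
SAMPLE = """\
      VMID NAME                 STATUS     MEM(MB)    BOOTDISK(GB) PID
       100 backup-runner        running    4096              32.00 1234
       101 win-dev              stopped    8192             128.00 0
"""

for command in suspend_commands(running_vmids(SAMPLE)):
    print(" ".join(command))
```

On a real host, the output of `qm list` would be captured with `subprocess.run` and each command list passed back to `subprocess.run` in turn.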

There also ended up being a lot that was done solely via CLI and not the GUI. To an extent, that's fine - I'm good with using the CLI for quite a bit - but it did defeat some of the benefit of having a nice front-end app when I would regularly drop down to the CLI anyway for disk import/export, some device assignments, and so forth. That's not a bug or anything, but it made the experience feel a bit rickety.

The New Setup

So, in the end, I went crawling back to Windows and Hyper-V. I installed Windows 11 Pro and set up the NVMe drives in Storage Spaces... I was a little peeved that I couldn't use ReFS, since apparently "Pro" and "Pro for Workstations" are two separate versions of Windows somehow, but NTFS should still technically do the job (I'll just have to make sure my backup routine to my TrueNAS server is good). After I bashed at it for a little while to remove all the weird stupid ads that festoon Windows nowadays, I got things into good shape.

Hyper-V remains a champ here. I loaded up my re-converted VMs and their performance is great: my backup manager is back in business and my dev VMs are speedy like they used to be.

Among the reasons why I wanted to move away from Server 2019 in the first place is that the server-with-desktop-components versions of Windows always lagged behind the client version in a number of ways, and one of them was WSL2. Now that I'm back to a client version, I was able to install that with a little Debian environment, and then configure Docker Desktop to make use of it. With some network fiddling, I got the Docker daemon listening on a local network port and usable for my Testcontainers suites. Weirdly, this means that my Windows-based setup for Docker is actually a bit more efficient than the previous Linux-based one, but I won't let that bother me.
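As a rough sketch of the consuming side of that arrangement: Docker clients, Testcontainers included, honor the `DOCKER_HOST` environment variable, so a small pre-flight check can confirm the daemon's port is reachable before a test suite starts. The port 2375 (the conventional unencrypted Docker port) and the helper names are assumptions for the example, not details of the setup described above.

```python
import socket


def docker_host_url(host: str, port: int = 2375) -> str:
    """Format the value expected in the DOCKER_HOST environment variable."""
    return f"tcp://{host}:{port}"


def is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False
```

A test runner could call `is_reachable` with the host's address and, if it succeeds, export the result of `docker_host_url` as `DOCKER_HOST` before launching the suite. An unencrypted TCP endpoint like this is only sensible on a trusted local network.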

As for games, well... it's native Windows. For better or for worse, that's the best way to run them, and they run great. Baldur's Gate 3 is noticeably snappier with its load times already, and everything else still runs fine.

So, overall, it kind of stings that I went back to Windows as the primary host, but I can't deny that I'm already deriving a lot of benefits from it. I'll miss some things from Proxmox, like the smooth handling of automatic mounting of network shares as opposed to Windows's erratic approach, but I'm otherwise pleased with how it's working again.
