Intro

I should give this some proper context: I recently moved to using FreeBSD exclusively after using Linux exclusively for something like 20 years. There’s a variety of reasons why, but mainly I just like FreeBSD better. Sometime back in 2004, I think, I took a free class / workshop on FreeBSD that ran for about 3-4 days. I had to get my ass up at the ass crack of dawn to travel from Bellevue to downtown Seattle, but it was worth it because it gave me a perspective that I might not have gotten anywhere else, from a guy who just really loved FreeBSD. This workshop was really above board. Arguably it was better than what you might pay money for these days. Each participant received their own copy of the FreeBSD manual in a 3-ring binder that contained literally everything there is to know about installing and configuring every aspect of the complete operating system. You can still find that manual here: https://docs.freebsd.org/en/books/handbook/

It hasn’t changed a hell of a lot over the years, and looking through it I can’t think of anything that needs to change about it. I might be stretching the truth a bit, but old versions are available and you can figure that one out for yourself. But FreeBSD is characteristically a complete operating system, and the same can’t really be said for Linux. There is a concept called “Software of Unknown Pedigree”; if the opposite, “Software of Known Pedigree”, were a thing, it would say a lot about what you can expect in FreeBSD, with the exception of FreeBSD Ports, which is software of sometimes unknown and often known pedigree. From the kernel down to the base userland, things are reasonably consistent. If you’ve experienced coming back to FreeBSD after 20 years, you might also say it’s very familiar despite the gap in years:

/etc/rc.conf
/etc/rc.local
/etc/make.conf
/boot/loader.conf
/usr/src/
/usr/src/sys
/usr/src/sys/<arch>/conf
/usr/ports
/usr/local

What is new for me are the amenities that I’ve come to expect for a modern operating system:

ZFS; datasets and zvols, the latter operationally functioning as GEOM providers
pkg for when you just really don’t feel like building ports
Multiple routing tables (FIBs), via setfib(1) or setfib(2), as opposed to just a single routing table (Linux has this, too)
Paravirtualization; BHyve similar to KVM (also uses virtio)
Containers; BSD has had this for a long time called “jails” and there is even an OCI runtime in the works: https://github.com/dfr/ocijail
Reliable contributions and continued development

Developing for FreeBSD also feels incredibly human; I can’t even explain it: https://github.com/paigeadelethompson/exfat/commit/187c6694c68554f7961b427501373984a0742366 I made this using an LLM, and the process is so straightforward: with an NFS mount of the source code where I can work on it with Cursor and the same NFS mount on a dev FreeBSD VM, I can make the changes in Cursor, then build and test them, and there’s no extra bullshit that I’ve gotta do. If it crashes I get a kernel debugger on the serial console. If I really wanted to push the development of this to completion I could; I just don’t really have a need at the moment. All of the build files for this kernel module and its associated userland tools (newfs etc.) extend the build assets of /usr/src, it totally builds out of tree as-is, and it’s JUST bsdmake.

I won’t get too much into what issues I have with the Linux kernel but allmodconfig:

cat .config | grep -v "#" | grep "." | wc -l
15026

Also, allmodconfig is not what it sounds like; if it were, I wouldn’t always have to compile some stupid module for one of these weirdo wireless USB adapters. The TL;DR is that there’s a lot of weird shit in the Linux kernel. Here’s another example:

https://docs.kernel.org/admin-guide/ufs.html

Why is UFS2 write support “experimental” in 2025? https://code.fe80.eu/lynxis/linux/-/blob/v2.6.34-rc1/Documentation/filesystems/ufs.txt?ref_type=tags

And then there’s Linus, dude. I’ve got nothing against the guy really, but usually every time I hear his name come up it’s the subject of some open source community schadenfreude, or it’s about him losing it and going Gordon Ramsay on some poor asshole. From my perspective, the whole build config system is woefully neglected and long overdue for a redesign. It doesn’t scale: you can’t realistically build a kernel config with scripts/config anymore, and essentially everybody is redistributing the .config, you know, the one that says:

#
# Automatically generated file; DO NOT EDIT.
# Linux/x86 6.14.0-rc5 Kernel Configuration
#

To me, the glaring DO NOT EDIT seems to be a sign that it should always just be generated rather than redistributed. But more to my original point: if maintainability really mattered, it makes you wonder how the actual fuck UFS2 (of all things) write support has been experimental for 18 years, doesn’t it?

And sure, I’m really one to talk about being professional, but I’m not Linus Torvalds, that’s for damn sure, and I don’t personally attack people or make ridiculous claims about which Intel instruction sets are just a scam while being as ubiquitous as Linus Torvalds; I’m basically nobody and my reputation is already in the gutter. One thing that is shockingly apparent to me, though, is how little it matters. Very few people have their own opinions about anything anymore, which is sad. Unfortunately it kinda seems to be the sentiment of the open source community as of late, which is to say:

Everything is too complicated just take my word for it
Everything is too complicated to do so don’t even bother
Being wrong about something is unforgivable; this includes suggesting new ideas that are wrong
Be afraid of everything, everything is a conspiracy, LLMs are just plagiarism machines
The foot eater was “right” about “everything”

Just absolute horse shit that makes no sense coming from everybody.

It just seems to me if Linus really gave a shit about code maintainability he could catch more flies with honey than shit if he just said “why don’t you guys start your own fork of the kernel and leave mine the hell alone” because that’s actually more likely what it’s really about anyway. And really that’s the only issue that I have with Linus. I sure as hell don’t look up to him, but a lot of people do and I think it’s ridiculous that somebody who people look up to so much would be such a dick to people. Totally fair to have his own interests at heart though, he’s definitely earned it.

There’s tons of stuff in the Linux kernel that I’ve wanted to see improved that really just seem hopeless like:

VRF
nftables
i915: 11th gen was supposed to have SR-IOV, then they pushed it up to 12, now we’re up to gen 15 and I still don’t think they’ve even got it. One of Linus’ pet trolls, Phoronix, even pointed this out, so at least we can agree on something: https://www.phoronix.com/news/Intel-More-i915-For-Linux-6.7 (looking at my wrist wondering where my watch is...)

It’s hard to want to approach any of these with as many problems as Linux has going on right now. Git has this wonderful thing called “subtrees”, and that’s probably more characteristic of what the Linux kernel should look like in 2025 given the sheer volume of shit that is there: each subtree would be a repository for a specific subsystem or driver that really has no relation to the core tree except for being a subtree, and the entire build system should be based loosely on that idea. People have tried to make offloading kernel development easier with shit like dracut, and dracut is an absolute piece of crap (no offense to the dude who wrote it), but I hate it and it makes me cringe every time I see the name. I also think that building the kernel from a source tree just shouldn’t be as cumbersome as it is, but it’s really no surprise given that it’s literally been the same shit since 2.6, and a lot of it probably predates even that.

With something like subtrees, you could probably inject them into a build process. I sorta always liked golang’s rudimentary dependency management, where packages are <repository>/owner/package, but I think in the case of golang it might be too little for some. I think that could work for something like the Linux kernel, though, because without injecting anything into the tree you’d be left with only what lives up to Linus’ standard, and essentially that is what is most important after all. At the same time, I think it’s valuable to be able to build kernel code out of tree when possible, and that should be taken into consideration as well.

git subtree is available in stock version of Git since May 2012 – v1.

This is not to be confused with git submodule; they are two separate things entirely: https://stackoverflow.com/questions/12349931/what-came-first-git-subtree-merge-strategy-or-git-submodule

My BHyve setup

Yeah anyway, sometimes you just need a different perspective; libvirt, kvm / qemu and Linux VRF/NetNS weren’t doing it for me. I really don’t like ip rule on Linux either, and there’s none of that on FreeBSD. When it matters, as far as routing tables are concerned, your interfaces have either fib or tunnelfib, and interfaces that can specify tunnelfib can operate on two different FIBs at the same time. Another thing that makes a lot more sense to me, and did even before I really got deep into Linux networking back in 2005, is bridges on FreeBSD, where “enslaved” interfaces are still configurable as interfaces while enslaved by the layer 2 bridge, and this is intended. You can also assign an address to the bridge in addition to assigning addresses to your enslaved devices, which is nice in my opinion.

A FIB or routing table alone is not quite the same thing as a VRF, but it is essentially how I’ll be using them. There is a small part of this, as it relates to FRR that I still need to figure out because the concept of a VRF in FRR still needs to apply and each VRF will need to identify with a particular FIB on FreeBSD (more on that later.)
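One practical detail behind all of this: FreeBSD only has as many FIBs as the `net.fibs` tunable allows, and FIBs are numbered from 0, so using FIB 12 means having at least 13 of them. A minimal sketch, assuming a made-up helper called `loader_fibs_line` that prints a matching /boot/loader.conf line for a list of FIB numbers:

```shell
#!/bin/sh
# loader_fibs_line FIB...: print a loader.conf line with enough FIBs to cover
# the highest FIB number given (FIBs are numbered from 0).
# The helper name is hypothetical; it only prints text.
loader_fibs_line() {
    max=0
    for fib in "$@"; do
        if [ "$fib" -gt "$max" ]; then
            max=$fib
        fi
    done
    printf 'net.fibs="%s"\n' "$((max + 1))"
}

loader_fibs_line 0 8 10 12   # net.fibs="13"
```

On a running system, `sysctl net.fibs` shows how many FIBs are currently available.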

vm-bhyve alone wasn’t enough

All of the configuration I’ll be referencing is available here: https://gist.github.com/paigeadelethompson/a94ff2e7cc4916d7feecef96936bb2d7 Typically to create a VM with my setup I run either:

./create_freebsd_vm.sh FBSDDEV1 -t fbsd-dev -v 14.2 -i tap5
./create_void_vm.sh SWARM3 -t swarm -i tap2

So it is necessary for now to create a tap interface and assign it to the correct FIB before running these:

ifconfig tap<N> create fib <FIB>
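To keep a tap interface and its FIB assignment across reboots, the same thing can be expressed in rc.conf, which is what the configuration further down does. A minimal sketch, assuming a made-up helper called `tap_rc_line` that prints the rc.conf line for a given tap number and FIB:

```shell
#!/bin/sh
# tap_rc_line TAP_NUMBER FIB: print the rc.conf line that brings tap<N> up
# in the given FIB. The helper name is hypothetical; it only prints text
# and does not configure anything.
tap_rc_line() {
    printf 'ifconfig_tap%s="fib %s up"\n' "$1" "$2"
}

tap_rc_line 5 10   # ifconfig_tap5="fib 10 up"
```

For rc to create the interface at boot, it also has to be listed in cloned_interfaces.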

I wanted something that will:

Create a VM from scratch (both Linux and FreeBSD) without needing to boot an ISO and manually go through an installer, partitioning, etc. (I made my own scripts for both Void and FreeBSD, which I’ve already mentioned)
Handle FIB assignment: vm-bhyve’s vm switch doesn’t support specifying a FIB, but it doesn’t really need to, since I can do all of the networking setup that I need in rc.conf:

chronyd_enable=YES
dnsmasq_enable=YES
sshd_enable=YES
hostname=stelleri.netcrave.network
powerd_enable=YES
moused_nondefault_enable=NO
dumpdev=NO
zfs_enable=YES
gateway_enable=YES
#ipv6_gateway_enable=YES
lldpd_enable=YES
linux_enable=YES
pf_enable=YES
nfs_server_enable=YES
nfsv4_server_enable=YES
nfsuserd_enable=YES
rpcbind_enable=YES
mountd_enable=YES
mountd_flags=-r
vm_enable=YES
vm_dir=zfs:storage/vm
frr_enable=YES

# LAN
ifconfig_ix1="inet 192.168.1.128/24 fib 0"
#ifconfig_ix1_ipv6="inet6 fcff:fff0::/64 fib 0"

# Docker swarm
ifconfig_igb0="inet 198.18.2.1/23 fib 8"
#ifconfig_igb0_ipv6="inet6 fcff:8::/64 fib 8"

# Home servers
ifconfig_igb1="inet 192.168.65.129/25 fib 10"
#ifconfig_igb1_ipv6="inet6 fcff:12::/64 fib 10"

# Docker swarm servers VGW
ifconfig_epair0a="192.0.0.0/31 fib 0 up"
#ifconfig_epair0a_ipv6="inet6 fcff:ffff:8::a/64 fib 0 up"
ifconfig_epair0b="192.0.0.1/31 fib 8 up"
#ifconfig_epair0b_ipv6="inet6 fcff:ffff:8::b/64 fib 8 up"

# Home servers VGW
ifconfig_epair1a="192.0.0.2/31 fib 0 up"
#ifconfig_epair1a_ipv6="inet6 fcff:ffff:10::a/64 fib 0 up"
ifconfig_epair1b="192.0.0.3/31 fib 10 up"
#ifconfig_epair1b_ipv6="inet6 fcff:ffff:10::b/64 fib 10 up"

# Tailscale VGW
ifconfig_epair2a="192.0.0.4/31 fib 0 up"
#ifconfig_epair2a_ipv6="inet6 fcff:ffff:12::a/64 fib 0 up"
ifconfig_epair2b="192.0.0.5/31 fib 12 up"
#ifconfig_epair2b_ipv6="inet6 fcff:ffff:12::b/64 fib 12 up"

# VM interfaces (FIB assignment)
ifconfig_tap0="fib 8 up"  # SWARM1
ifconfig_tap1="fib 8 up"  # SWARM2
ifconfig_tap2="fib 8 up"  # SWARM3
ifconfig_tap3="fib 10 up" # HOME1
ifconfig_tap4="fib 12 up" # TAILSCALE1
ifconfig_tap5="fib 10 up" # FBSDDEV1

# Docker swarm virtual switch
ifconfig_bridge0="198.18.0.1/23 fib 8 up"
#ifconfig_bridge0_ipv6="inet6 fcff:8::1/64 fib 8 up"
ifconfig_bridge0_aliases="inet 169.254.169.254/16 alias addm igb0 addm tap0 addm tap1 addm tap2"

# Home servers virtual switch
ifconfig_bridge1="192.168.64.129/25 fib 10 up"
#ifconfig_bridge1_ipv6="inet6 fcff:10::1/64 fib 10 up"
ifconfig_bridge1_aliases="inet 169.254.169.254/16 alias addm igb1 addm tap3 addm tap5"

# Tailscale virtual switch
ifconfig_bridge2="192.0.2.1/30 fib 12 up"
#ifconfig_bridge2_ipv6="inet6 fcff:12::1/64 fib 12 up"
ifconfig_bridge2_aliases="inet 169.254.169.254/16 alias addm tap4"

# This must list all interface variables for interfaces that don't exist yet
cloned_interfaces="bridge0 bridge1 bridge2 epair0 epair1 epair2 \
                   tap0 tap1 tap2 tap3 tap4"

# Core routes (FIB 0)
route_fib0_swarm="-fib 0 -net 198.18.0.0/23 192.0.0.1"            # 198.18.0.0 - 198.18.1.255
#ipv6_route_fib0_swarm="-fib 0 -6 fcff:8::/48 fcff:ffff:8::b"
route_fib0_home="-fib 0 -net 192.168.64.128/24 192.0.0.3"         # My 192.168.64.0/20 (2nd /25 of 1st /24 of /20)
#ipv6_route_fib0_home="-fib 0 -6 fcff:10::/48 fcff:ffff:10::b"
route_fib0_ts="-fib 0 -net 192.0.2.0/30 192.0.0.5"                # Tailscale VRF
#ipv6_route_fib0_ts="-fib 0 -6 fcff:12::/48 fcff:ffff:12::b"
route_fib0_egr_ts="-fib 0 -net 100.64.0.0/10 192.0.0.5"           # Tailscale uses 100.64.0.0/10
#ipv6_route_fib0_egr_ts="-fib 0 -6 fd7a:115c::/32 fcff:ffff:12::b"

# Default egress (For all FIBs)
route_fib0_default="-fib 0 default 192.168.1.1"
route_fib8_default="-fib 8 default 192.0.0.0"
#ipv6_route_fib8_default="-fib 8 -6 fcff::/7 fcff:ffff:8::a"
route_fib10_default="-fib 10 default 192.0.0.2"
#ipv6_route_fib10_default="-fib 10 -6 fcff::/7 fcff:ffff:10::a"
route_fib12_default="-fib 12 default 192.0.0.4"
#ipv6_route_fib12_default="-fib 12 -6 fcff::/7 fcff:ffff:12::a"


# Egress to Tailscale (FIB 12)
route_fib12_egr_ts="-fib 12 -net 100.64.0.0/10 192.0.2.2"
#ipv6_route_fib12_egr_ts="-fib 12 -6 fd7a:115c::/32 fcff:12::192:0:2:2"

# Null routes (All FIBs)
route_fib8_null_fib0="-fib 8 -net 192.168.0.0/16 -reject"   # Swarm to UDM & Home (and anything else)
#ipv6_route_fib8_null_fib0="-fib 8 -6 fcff::/48 -reject"
route_fib10_null_fib8="-fib 10 -net 198.18.0.0/15 -reject"  # Home servers to Swarm
#ipv6_route_fib10_null_fib8="-fib 10 -6 fcff:8::/48 -reject"
route_fib12_null_fib0="-fib 12 -net 192.168.0.0/20 -reject" # 192.168.0.0/20 UDM Networks(LAN/WiFi/etc)
#ipv6_route_fib12_null_fib0="-fib 12 -6 fcff::/48 -reject"
route_fib0_null_vgw="-fib 0 -net 192.0.0.0/24 -reject"      # Prevent forwarding for VGW addresses
#ipv6_route_fib0_null_vgw="-fib 0 -6 fcff:ffff::/32 -reject"
route_fib0_null_ll="-fib 0 -net 169.254.0.0/16 -reject"     # Prevent forwarding for link-local

# This must list all route variables
static_routes="fib0_swarm fib0_home fib0_ts fib0_egr_ts fib0_default fib8_default \
               fib10_default fib12_default fib12_egr_ts fib8_null_fib0            \
               fib0_null_vgw fib0_null_ll fib10_null_fib8 fib12_null_fib0"

# ipv6_static_routes="fib0_swarm fib0_home fib0_ts fib0_egr_ts fib8_default   \
#                     fib10_default fib12_default fib12_egr_ts fib8_null_fib0 \
#                     fib0_null_vgw fib10_null_fib8 fib12_null_fib0"

Continuing the list:

Zeroconf networking; this is made possible using lldpd on both the host and the guest, as well as avahi-autoipd on the guest. I can run lldpctl on the host to retrieve information about the guest:

-------------------------------------------------------------------------------
Interface:    tap5, via: LLDP, RID: 13, Time: 0 day, 06:41:58
  Chassis:
    ChassisID:    mac 58:9c:fc:0b:39:9f
    SysName:      FBSDDEV1
    SysDescr:     FreeBSD 14.2-RELEASE FreeBSD 14.2-RELEASE FreeBSD 14.2-RELEASE releng/14.2-n269506-c8918d6c7412 GENERIC amd64
    MgmtIP:       169.254.10.136
    MgmtIface:    1
    MgmtIP:       fe80::5a9c:fcff:fe0b:399f
    MgmtIface:    1
    Capability:   Bridge, off
    Capability:   Router, off
    Capability:   Wlan, off
    Capability:   Station, on
  Port:
    PortID:       mac 58:9c:fc:0b:39:9f
    PortDescr:    vtnet0
    TTL:          120
    PMD autoneg:  supported: yes, enabled: yes
      MAU oper type: 10GigBaseCX4 - X copper over 8 pair 100-Ohm balanced cable
-------------------------------------------------------------------------------

And so, in addition to being able to attach to the guest with vm console, I can also SSH into it before it’s even set up:

➜  /etc setfib -F 10 ssh -i /vm/FBSDDEV1/id_ed25519 root@169.254.10.136 "uname -a"
FreeBSD FBSDDEV1 14.2-RELEASE FreeBSD 14.2-RELEASE releng/14.2-n269506-c8918d6c7412 GENERIC amd64

You may also notice that every bridge has the same IP address, 169.254.169.254, and this is possible because the bridges are on different routing tables, so the addresses never conflict. To tell the operating system which routing table should be used when looking up a route, the setfib command is used. It’s essentially the same thing as ip vrf exec or ip netns exec, if you’re familiar.
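Putting the two pieces together, the guest’s link-local address can be pulled out of the lldpctl output and handed to ssh under the right FIB. This is only a sketch: `lldp_mgmt_ip` is a made-up helper, and it assumes the plain-text lldpctl layout shown above, where SysName appears before the MgmtIP lines.

```shell
#!/bin/sh
# lldp_mgmt_ip SYSNAME: read plain-text lldpctl output on stdin and print the
# first IPv4 MgmtIP reported for the neighbor whose SysName matches.
# Hypothetical helper; assumes SysName precedes MgmtIP in each neighbor block.
lldp_mgmt_ip() {
    awk -v want="$1" '
        /^-+$/           { name = "" }    # separator line: new neighbor block
        $1 == "SysName:" { name = $2 }
        $1 == "MgmtIP:" && name == want && $2 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ {
            print $2
            exit
        }
    '
}

# Usage on the host (FIB 10 is where tap5 lives in the rc.conf above):
#   ip=$(lldpctl | lldp_mgmt_ip FBSDDEV1)
#   setfib -F 10 ssh -i /vm/FBSDDEV1/id_ed25519 "root@$ip" "uname -a"
```

The function prints nothing if the name is not found, so the caller can check for an empty result before running ssh.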

You might also be wondering “wtf is LLDP” and you should, because it’s bad ass: https://en.wikipedia.org/wiki/Link_Layer_Discovery_Protocol If ever there was a layer 2 protocol that I would want to have on everything it’d be this. It’s a decent compromise for lack of having SNMP and it just makes my life easier. You’ll also find that a lot of high end switches / routers also use LLDP:

-------------------------------------------------------------------------------
Interface:    ix1, via: LLDP, RID: 2, Time: 3 days, 23:46:31
  Chassis:
    ChassisID:    mac 78:45:58:6a:e2:b9
    SysName:      USW-Aggregation
    SysDescr:     UBNT-USL8A
    Capability:   Bridge, on
  Port:
    PortID:       local Port 3
    PortDescr:    SFP_ 3
    TTL:          120
  VLAN:         1, pvid: yes
  LLDP-MED:
    Device Type:  Network Connectivity Device
    Capability:   Capabilities, yes
    Capability:   Policy, yes
-------------------------------------------------------------------------------

Serial console is always an option:

➜  /etc vm console FBSDDEV1
Connected


FreeBSD/amd64 (FBSDDEV1) (ttyu0)

login: root
Apr 24 00:22:10 FBSDDEV1 login[772]: ROOT LOGIN (root) ON ttyu0
FreeBSD 14.2-RELEASE (GENERIC) releng/14.2-n269506-c8918d6c7412

Welcome to FreeBSD!

Release Notes, Errata: https://www.FreeBSD.org/releases/
Security Advisories:   https://www.FreeBSD.org/security/
FreeBSD Handbook:      https://www.FreeBSD.org/handbook/
FreeBSD FAQ:           https://www.FreeBSD.org/faq/
Questions List:        https://www.FreeBSD.org/lists/questions/
FreeBSD Forums:        https://forums.FreeBSD.org/

Documents installed with the system are in the /usr/local/share/doc/freebsd/
directory, or can be installed later with:  pkg install en-freebsd-doc
For other languages, replace "en" with a language code like de or fr.

Show the version of FreeBSD installed:  freebsd-version ; uname -a
Please include that output and any error messages when posting questions.
Introduction to manual pages:  man man
FreeBSD directory layout:      man hier

To change this login announcement, see motd(5).
root@FBSDDEV1:~ # ^D

FreeBSD/amd64 (FBSDDEV1) (ttyu0)

login:

FreeBSD/amd64 (FBSDDEV1) (ttyu0)

login: ~
[EOT]
➜  /etc

The setup scripts

At a high level, the create_void_vm.sh and create_freebsd_vm.sh scripts both do the following:

Create the VM; disk and configuration file from templates stored in /vm/.templates
Each VM disk is a zvol, configured in geom mode
The VM disk is partitioned with GPT: FAT32 for the EFI partition and UFS2 for the root filesystem. FreeBSD doesn’t have userland tools for creating ext4 or btrfs filesystems (at least not in base userland), so I opted for UFS2, which unfortunately requires an experimental option for write support in Linux at the moment.
The filesystems are formatted and mounted to a chroot path
The base userland is downloaded and extracted in the chroot path
A setup.sh script is created in the chroot path as well as an SSH authorized_keys file; an id_ed25519 is also generated and stored in the VM configuration directory in /vm
A chroot is performed (in the Linux case this works because FreeBSD has Linux binary compatibility)
The setup.sh script runs in the chroot and sets up everything; in the case of Linux it also compiles a custom kernel with as many networking options as I could possibly scrape together, like a stoner desperately scraping weed resin to smoke from a pipe. Total disaster, but I think I just about got everything that is needed for networking, iptables and nftables (Docker works at least). Obviously not ideal, but it bypasses the need for an initramfs, and the “experimental” UFS2 write support is enabled, allowing this to work.
For FreeBSD, the bootloader is a little more straightforward: cp /boot/loader.efi /boot/efi/efi/boot/bootx64.efi. On Linux, running GRUB or efibootmgr inside of a chroot on FreeBSD under Linux binary compatibility is a bit of a stretch, so I had to get creative, and luckily BHyve’s UEFI firmware comes with something called EFIShell: https://github.com/tianocore/tianocore.github.io/wiki/Efi-shell

EFIShell is the default EFI application when no boot device is specified in EFIVars. It first checks to see if a startup.nsh exists, and runs it if it does:

fs0:\efi\boot\vmlinuz console=ttyS0 root=/dev/vda2 rootflags=ufstype=ufs2 rootfstype=ufs

That’s right, the Linux kernel itself works as an EFI application thanks to CONFIG_EFI_STUB. There are admittedly a few different ways to make this work, including just naming the vmlinuz bootx64.efi, although I found this way to be the most convenient for passing the kernel cmdline, rather than an accompanying boot config; see:

CONFIG_BOOT_CONFIG
CONFIG_BOOT_CONFIG_FORCE
CONFIG_BOOT_CONFIG_EMBED

for more information about that.
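Generating that startup.nsh from a script is a one-liner. A minimal sketch, assuming a made-up helper called `startup_nsh` that takes the root device as its only argument; the mount point in the usage comment and the file’s location on the EFI partition are assumptions, not taken from my scripts:

```shell
#!/bin/sh
# startup_nsh ROOT_DEV: print a one-line startup.nsh that starts the kernel's
# EFI stub with a UFS2 root. Hypothetical helper; it only prints text.
startup_nsh() {
    # The kernel path is passed as a %s argument so printf leaves the
    # backslashes alone.
    printf '%s console=ttyS0 root=%s rootflags=ufstype=ufs2 rootfstype=ufs\n' \
        'fs0:\efi\boot\vmlinuz' "$1"
}

# Usage, with the EFI partition mounted at a hypothetical /mnt/efi:
#   startup_nsh /dev/vda2 > /mnt/efi/startup.nsh
startup_nsh /dev/vda2
```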

And so bootstrapping this was pretty straightforward for both FreeBSD and Linux. Continuing with the VM creation scripts:

After the chroot completes the filesystems are unmounted, and the tap<N> interface that was specified to the creation script is appended to the VM configuration.
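Going back to the disk steps at the top of that list, the zvol and partitioning part boils down to a handful of base-system commands. The sketch below only prints them; it is not what the scripts in the gist actually run, and the helper name `disk_prep_plan`, the volume name disk0 and the 260M EFI partition size are all assumptions for illustration:

```shell
#!/bin/sh
# disk_prep_plan DATASET VM_NAME SIZE: print (not run) commands for creating
# a zvol in geom mode, partitioning it with GPT, and formatting FAT32 + UFS2.
# Hypothetical helper; "disk0" and the 260M EFI size are assumed values, and
# the parent dataset DATASET/VM_NAME is assumed to exist already.
disk_prep_plan() {
    vol="$1/$2/disk0"
    dev="/dev/zvol/$vol"
    echo "zfs create -V $3 -o volmode=geom $vol"
    echo "gpart create -s gpt $dev"
    echo "gpart add -t efi -s 260M $dev"
    echo "gpart add -t freebsd-ufs $dev"
    echo "newfs_msdos -F 32 ${dev}p1"
    echo "newfs -U ${dev}p2"
}

disk_prep_plan storage/vm FBSDDEV1 20G
```

Printing the plan first makes it easy to eyeball the device paths before anything destructive runs.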

The VMs boot, they work on the correct networks, and I can find them with lldpctl and SSH into them, if not just vm console <VM>; SSH is preferred because attaching to the console kinda sucks. So what’s left?

Route leaking

This part I am still working on, because I want to use VRF in FRR and I want per-VRF BGP/OSPF. For the time being, there is a simple approach that seems to work:

!
router rip
network ix1
route 192.168.64.128/29
exit
!

Gives you something that looks kinda like this:

stelleri.netcrave.network# show ip rip
Codes: K - kernel route, C - connected, L - local, S - static,
       R - RIP, O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, F - PBR,
       f - OpenFabric, t - Table-Direct
Sub-codes:
      (n) - normal, (s) - static, (d) - default, (r) - redistribute,
      (i) - interface

     Network            Next Hop         Metric From            Tag Time
C(i) 192.168.1.0/24     0.0.0.0               1 self              0
R(s) 192.168.64.128/29  0.0.0.0               1 self              0

And it appears to work as expected:

00:00:10.177590 98:b7:85:1e:de:4e > 01:00:5e:00:00:09, ethertype IPv4 (0x0800), length 66: (tos 0xc0, ttl 1, id 49765, offset 0, flags [none], proto UDP (17), length 52, bad cksum 0 (->5462)!)
    192.168.1.128.520 > 224.0.0.9.520: [bad udp cksum 0xa263 -> 0x5645!]
        RIPv2, Response, length: 24, routes: 1 or less
          AFI IPv4,  192.168.64.128/29, tag 0x0000, metric: 1, next-hop: self
        0x0000:  0202 0000 0002 0000 c0a8 4080 ffff fff8
        0x0010:  0000 0000 0000 0001

Ideally I want to use BGP/OSPF, and also to be able to specify just a VRF from which to leak routes plus a list of routes that shouldn’t be leaked (e.g. default, 192.0.0.0/24, 169.254.0.0/16, etc.). Sometimes simpler is better, though.

pf.conf

The virtual gateways need to be able to route traffic to the internet:

table <resvd_networks> { 0.0.0.0/8 10.0.0.0/8 100.64.0.0/10 127.0.0.0/8 169.254.0.0/16
                         172.16.0.0/12 192.0.0.0/24 192.0.2.0/24 192.88.99.0/24
                         192.168.0.0/16 198.18.0.0/15 198.51.100.0/24 203.0.113.0/24
                         224.0.0.0/4 233.252.0.0/24 240.0.0.0/4 255.255.255.255/32 }

nat on ix1 inet from 198.18.0.0/23 to !<resvd_networks> -> ix1
nat on ix1 inet from 192.168.64.128/25 to !<resvd_networks> -> ix1
nat on ix1 inet from 192.0.2.0/30 to !<resvd_networks> -> ix1
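Since the NAT rules all have the same shape, adding a network is mostly a matter of repeating the line with a different source. A minimal sketch, assuming a made-up helper called `nat_rules` that prints one rule per internal network in the same form as above (the resvd_networks table still has to be defined in pf.conf):

```shell
#!/bin/sh
# nat_rules EXT_IF NET...: print one pf NAT rule per internal network, in the
# same form as the rules above. Hypothetical helper; it only prints text and
# does not load anything into pf.
nat_rules() {
    ext_if=$1
    shift
    for net in "$@"; do
        printf 'nat on %s inet from %s to !<resvd_networks> -> %s\n' \
            "$ext_if" "$net" "$ext_if"
    done
}

nat_rules ix1 198.18.0.0/23 192.168.64.128/25 192.0.2.0/30
```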

The epair interfaces

These are used to create a link (single hop) between FIB 0 (the default FIB) and each of the other FIBs. It’s possible to configure a link between two other FIBs as well, but in this configuration currently I don’t have any use for ad-hoc networks. To prevent undesired routing between FIBs, null routes are used.
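The addressing for these links follows a simple pattern in my rc.conf: epair<K>a gets 192.0.0.(2K)/31 in FIB 0, epair<K>b gets 192.0.0.(2K+1)/31 in the target FIB, and the target FIB’s default route points back at the a side. A sketch of that pattern, with `vgw_rc_conf` being a made-up helper name and the fourth link in the example being hypothetical:

```shell
#!/bin/sh
# vgw_rc_conf EPAIR_INDEX FIB: print rc.conf lines for one virtual gateway
# link between FIB 0 and another FIB, following the /31 pattern used in the
# rc.conf above. Hypothetical helper; it only prints text.
vgw_rc_conf() {
    k=$1
    fib=$2
    a=$((k * 2))        # last octet of the FIB 0 side
    b=$((k * 2 + 1))    # last octet of the target FIB side
    printf 'ifconfig_epair%sa="192.0.0.%s/31 fib 0 up"\n' "$k" "$a"
    printf 'ifconfig_epair%sb="192.0.0.%s/31 fib %s up"\n' "$k" "$b" "$fib"
    printf 'route_fib%s_default="-fib %s default 192.0.0.%s"\n' "$fib" "$fib" "$a"
}

# A hypothetical fourth link for a new FIB 14:
vgw_rc_conf 3 14
```

The new epair still has to be added to cloned_interfaces and the new route name to static_routes, along with whatever null routes the new FIB needs.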

What’s left?

IPv6; currently there is a problem with IPv6 and it doesn’t work with epairs the way I would expect:

➜  stelleri ifconfig epair128 create
epair128a
➜  stelleri ifconfig epair128b inet6 fcff::b/64 fib 128
➜  stelleri ifconfig epair128a inet6 fcff::a/64 fib 0
➜  stelleri ping6 -S fcff::a fcff::b
PING(56=40+8+8 bytes) fcff::a --> fcff::b
^C
--- fcff::b ping statistics ---
3 packets transmitted, 0 packets received, 100.0% packet loss
➜  stelleri ndp -a
Neighbor                             Linklayer Address  Netif Expire    S Flags
fe80::df:98ff:feaf:9e0a%epair128a    02:df:98:af:9e:0a epair128a permanent R
fcff::a                              02:df:98:af:9e:0a epair128a permanent R
fe80::df:98ff:feaf:9e0b%epair128b    02:df:98:af:9e:0b epair128b permanent R
fcff::b                              02:df:98:af:9e:0b epair128b permanent R

Moving epair128b back to FIB 0, we get a different result:

➜  stelleri ifconfig epair128b inet6 fcff::b/64 fib 0
➜  stelleri ping6 -S fcff::a fcff::b
PING(56=40+8+8 bytes) fcff::a --> fcff::b
16 bytes from fcff::b, icmp_seq=0 hlim=64 time=0.107 ms
^C
--- fcff::b ping statistics ---
1 packets transmitted, 1 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.107/0.107/0.107/0.000 ms
Neighbor                             Linklayer Address  Netif Expire    S Flags
fe80::df:98ff:feaf:9e0a%epair128a    02:df:98:af:9e:0a epair128a permanent R
fcff::a                              02:df:98:af:9e:0a epair128a permanent R
fe80::df:98ff:feaf:9e0b%epair128b    02:df:98:af:9e:0b epair128b permanent R
fcff::b                              02:df:98:af:9e:0b epair128b permanent R

Looking at the NDP entries, you can’t really tell that anything is wrong; however, if you move epair128b back to FIB 128:

➜  stelleri ifconfig epair128b inet6 fcff::b/64 fib 128
➜  stelleri ping6 -S fcff::a fcff::b
PING(56=40+8+8 bytes) fcff::a --> fcff::b
16 bytes from fcff::b, icmp_seq=0 hlim=64 time=0.112 ms
16 bytes from fcff::b, icmp_seq=1 hlim=64 time=0.102 ms
16 bytes from fcff::b, icmp_seq=2 hlim=64 time=0.104 ms
16 bytes from fcff::b, icmp_seq=3 hlim=64 time=0.100 ms
16 bytes from fcff::b, icmp_seq=4 hlim=64 time=0.128 ms
^C
--- fcff::b ping statistics ---
5 packets transmitted, 5 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.100/0.109/0.128/0.010 ms
➜  stelleri ndp -a
Neighbor                             Linklayer Address  Netif Expire    S Flags
fe80::df:98ff:feaf:9e0a%epair128a    02:df:98:af:9e:0a epair128a permanent R
fcff::a                              02:df:98:af:9e:0a epair128a permanent R
fe80::df:98ff:feaf:9e0b%epair128b    02:df:98:af:9e:0b epair128b permanent R
fcff::b                              02:df:98:af:9e:0b epair128b permanent R

It works, so I do believe this is an issue with NDP but I need to find somebody to help me triage this.

Impressions

This rocks, really don’t know how the hell I would get libvirt to do this but thankfully I don’t even have to think about it because I have this and it works. Creating a Docker network driver is also a pain in the ass, but I also don’t really need to do that either and really the only thing I care about is that there is some basic isolation between the swarm network and my home network:

➜  stelleri setfib -F 8 ssh -i /vm/SWARM1/id_ed25519 admin@198.18.0.2 "sudo docker node inspect rkssiknkct3cc6tlg1nb5ptfw" | jq '.[] | .ManagerStatus'
{
  "Leader": true,
  "Reachability": "reachable",
  "Addr": "100.97.94.117:2377"
}
➜  stelleri setfib -F 8 ssh -i /vm/SWARM1/id_ed25519 admin@198.18.0.2 "sudo ping 100.97.94.117"
PING 100.97.94.117 (100.97.94.117) 56(84) bytes of data.
64 bytes from 100.97.94.117: icmp_seq=1 ttl=60 time=162 ms
64 bytes from 100.97.94.117: icmp_seq=2 ttl=60 time=159 ms
^C
➜  stelleri setfib -F 8 ssh -i /vm/SWARM1/id_ed25519 admin@198.18.0.2 "sudo ping 192.168.1.1"
PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
From 198.18.0.1 icmp_seq=1 Destination Host Unreachable

100.97.94.117 is a Tailscale address.
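The reason the second ping dies is the null route in FIB 8: 192.168.1.1 falls inside 192.168.0.0/16, which is rejected there, while the Tailscale address does not. That kind of prefix check is easy to do by hand in plain sh; the sketch below uses two made-up helpers, `ip_to_int` and `in_cidr`, and handles IPv4 only:

```shell
#!/bin/sh
# ip_to_int A.B.C.D: print the address as a 32-bit integer.
# Hypothetical helper for illustration.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1           # split the dotted quad into $1..$4
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# in_cidr ADDR NET/LEN: succeed if ADDR falls inside NET/LEN.
in_cidr() {
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    len=${2#*/}
    # 4294967295 is 0xFFFFFFFF; shift in zero bits for the host part.
    mask=$(( (4294967295 << (32 - len)) & 4294967295 ))
    [ $(( addr & mask )) -eq $(( net & mask )) ]
}

# FIB 8 has "-net 192.168.0.0/16 -reject" in the rc.conf above.
if in_cidr 192.168.1.1 192.168.0.0/16; then
    echo "192.168.1.1: rejected in FIB 8"
fi
if ! in_cidr 100.97.94.117 192.168.0.0/16; then
    echo "100.97.94.117: not covered by the null route"
fi
```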

Things I’d like to improve

Adding another FIB for HE.net (need ICMP on WAN, and for IPv6 to work correctly with FIBs on FreeBSD)
Offloading some of the networking to another router; creating more of a buffer between the server that hosts live guests and a host that is separately responsible for isolating the networks from each other. This will be very easy to do with or without VLANs; I would just have to add another device such as my Zimaboard, which has been sitting on my shelf doing nothing for two years. In terms of what I want to host I’ll follow up in another blog post, but essentially I’m looking at something like:
HTTP/HTTPS/3:

Internet -> Cloudflare -> (Origin certificate / mTLS authenticated) -> Traefik (docker swarm) -> *shrug* Matrix server I guess?

I would like to set up a BBS with synchronet, though I think the Cloudflare free tier will be out of the question; at least I haven’t seen anything that would lead me to believe that I can set up SNI routing, and it relies on the service itself supporting PROXY. My HE.net tunnel would be good for this; it would just be IPv6-only.
Plenty of topics for other blog posts

I’d like to also add a FIB dedicated for a Squid forward proxy; and remove the default routes from the SWARM and HOME FIB; this way they don’t have direct access to the internet but rather have to use HTTP_PROXY which would be the Squid proxy. This would be a nice way to air gap these networks and would allow for more control over what is actually reachable from these networks.

sysctl net.inet.ip.accept_sourceroute I haven’t tested this from a source spoofing perspective yet but I’ll bet there are problems.
Multicast routing, pimd doesn’t quite work with FreeBSD’s FRR port atm: https://troglobit.com/howtos/pimd-on-freebsd/

Anyway, everything is working fine for the most part and I’m pretty satisfied with it:

➜  pub vm list
NAME        DATASTORE  LOADER  CPU  MEMORY  VNC  AUTO  STATE
FBSDDEV1    default    uefi    4    2048M   -    No    Running (35391)
HOME1       default    uefi    4    2048M   -    No    Running (17220)
SWARM1      default    uefi    4    2048M   -    No    Running (3770)
SWARM2      default    uefi    4    2048M   -    No    Running (3528)
SWARM3      default    uefi    4    2048M   -    No    Running (3286)
TAILSCALE1  default    uefi    2    512M    -    No    Running (8945)
