Wifibox: fixing IPv6 connectivity with ULA routing

While FreeBSD’s WiFi stack has been improving steadily, it still lags behind Linux. Wifibox bridges that gap by deploying a Linux guest to drive a wireless card on the FreeBSD host via PCI pass-through — and it works remarkably well.

That said, I ran into a connectivity issue: connections to hosts resolving to both IPv4 and IPv6 addresses were silently failing on the IPv6 path. The culprit was IPv6 address selection.

Wifibox relies on ULA (Unique Local Addresses, fd00::/64) for its internal IPv6 routing. The problem is that under FreeBSD’s default ip6addrctl policy, a ULA source address carries a different label than a global unicast address (GUA) destination, so the address selection algorithm demotes the IPv6 candidate and the connection fails or falls back unexpectedly.

The fix is to add a policy entry for the ULA prefix and enable the custom policy in /etc/rc.conf:

ip6addrctl_policy="AUTO"

Then in /etc/ip6addrctl.conf:

# enable selecting GUA destination with fd00::/64 as source address
fd00::/64 37 1

# default automatic policy (IPv6 preferred)
::1/128          50  0
::/0             40  1
::ffff:0:0/96    35  4
2002::/16        30  2
2001::/32         5  5
fc00::/7          3 13
::/96             1  3
fec0::/10         1 11
3ffe::/16         1 12

This is a direct follow-up to my previous post IPv6 address selection and cross-site ULAs, which covers the underlying mechanics in more detail.

IPv6 address selection and cross-site ULAs

All I wanted was to update some SSH keys…
But for reasons that I’ll spare you, it turned out to be a tad more complicated.
Also, in what follows, the context will be simplified for clarity.

To set the basic context: this is on FreeBSD, and it’s all about IPv4 vs IPv6 address selection. Which address should I use to reach xyz.com? Chances are that if you mainly use a web browser, this never appeared to be a problem. See, browsers use what is called the Happy Eyeballs algorithm: basically, whichever of IPv4/IPv6 responds first gets used.

But it’s generally not implemented outside browsers. That’s why your package manager may pick either IPv6 or IPv4, and if the selected address happens to be unreachable, it just sits there stuck. That’s also why some web apps seem to work perfectly fine in your browser, but on your phone they are sluggish at best, and most of the time clicking any button or link just makes the app hang forever.

Chances are the app didn’t implement the aforementioned algorithm; it had a route to reach the destination, but for some reason the destination isn’t reachable over it. So it tries IPv6 and has to wait for some timeout. That, I suspect, is why most people just disable IPv6. But they really shouldn’t. They should slam their sysadmin/netadmin/ISP for doing a bad job. Those are the ones slowing down IPv6 adoption, not the users.
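Libraries are slowly catching up, though: Python’s asyncio, for instance, implements Happy Eyeballs (RFC 8305) natively. A minimal sketch; the function name connect_happy and the 0.25 s stagger are my own choices:

```python
import asyncio

async def connect_happy(host: str, port: int):
    """Open a TCP connection using asyncio's built-in Happy Eyeballs.

    With happy_eyeballs_delay set, asyncio staggers connection attempts
    across the resolved IPv6/IPv4 addresses and keeps whichever
    connects first, instead of waiting for a full timeout on a dead
    address family.
    """
    reader, writer = await asyncio.open_connection(
        host, port, happy_eyeballs_delay=0.25)
    return reader, writer
```

With this, an unreachable AAAA record merely costs a fraction of a second before the IPv4 attempt takes over, rather than hanging until the TCP timeout.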

So back to the problem at hand.

At some point, I had to reach the same destination (read: the same FQDN) via SSH from two different hosts. The DNS resolution for that destination presented both an A (IPv4) record and an AAAA (IPv6) record. On the first host (let’s call it host A), IPv6 was preferred to reach the destination (good). On the second host (let’s call it host B), IPv4 was preferred instead (bad).

Both hosts received the exact same DNS resolution (both A and AAAA records), both have a globally routable IPv6 address and an IPv6 route to the destination, and both can ping6 the actual destination. In other words, the destination is perfectly reachable via IPv6. Why then would A prefer IPv6 and B prefer IPv4?

See, in the FreeBSD system configuration file (/etc/rc.conf), there is ip6addrctl_policy, which can be set to ipv6_prefer. As the name suggests, with this configuration, when presented with both an IPv4 and an IPv6 address to reach the destination, the network stack should choose the latter by default. It was configured to this value on both host A and host B, so in both cases IPv6 should have been chosen, but it wasn’t.

Behind the ip6addrctl_policy setting is the ip6addrctl command, which actually configures the address selection policy. Let’s see what it has to say:

$ ip6addrctl show
Prefix                          Prec Label      Use
::1/128                           50     0        0
::/0                              40     1     6131
::ffff:0.0.0.0/96                 35     4        0
2002::/16                         30     2        0
2001::/32                          5     5        0
fc00::/7                           3    13        0
::/96                              1     3        0
fec0::/10                          1    11        0
3ffe::/16                          1    12        0

That is the policy that is configured with ipv6_prefer. The selection goes as follows. Suppose the DNS resolution gave you two candidate IP addresses, and for the sake of our example, let’s suppose it’s 1.2.3.4 and 2001:aaaa:bbbb::1.

For each IP, as with a routing table, the longest prefix (i.e. the most specific match) wins. So 1.2.3.4 translates to the IPv4-mapped address ::ffff:1.2.3.4, which selects the line ::ffff:0.0.0.0/96 with precedence 35.

Similarly for 2001:aaaa:bbbb::1, the longest matching prefix is ::/0 hence it is selected with precedence 40. Note that it’s not 2001::/32, which with zero expansion in the network prefix is really 2001:0000::/32.

Between those two:

::ffff:0.0.0.0/96 with precedence 35
::/0              with precedence 40

it will choose the candidate that matched the line with the higher precedence; 40 > 35, so the address that matched ::/0 is retained. Thus 2001:aaaa:bbbb::1 is selected to reach the destination.
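To make the lookup mechanics concrete, here is a rough sketch in Python using the stdlib ipaddress module; the table is the ipv6_prefer policy shown above, and lookup is a hypothetical helper name:

```python
import ipaddress

# Default "ipv6_prefer" policy table: (prefix, precedence, label)
POLICY = [
    ("::1/128",        50,  0),
    ("::/0",           40,  1),
    ("::ffff:0:0/96",  35,  4),
    ("2002::/16",      30,  2),
    ("2001::/32",       5,  5),
    ("fc00::/7",        3, 13),
    ("::/96",           1,  3),
    ("fec0::/10",       1, 11),
    ("3ffe::/16",       1, 12),
]

def lookup(addr: str):
    """Longest-prefix match of addr against the policy table.

    IPv4 addresses are first converted to their IPv4-mapped IPv6 form
    (::ffff:a.b.c.d), exactly as the selection algorithm does.
    Returns (precedence, label) of the most specific matching line.
    """
    ip = ipaddress.ip_address(addr)
    if ip.version == 4:
        ip = ipaddress.ip_address("::ffff:" + addr)
    best = None
    for prefix, prec, label in POLICY:
        net = ipaddress.ip_network(prefix)
        if ip in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, prec, label)
    return best[1], best[2]
```

With this, lookup("1.2.3.4") gives (35, 4) and lookup("2001:aaaa:bbbb::1") gives (40, 1), matching the walk-through above.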

That still doesn’t explain why IPv4 is preferred on host B.

Notice that there are several other lines below ::ffff:0.0.0.0/96. What are those?

  • 2002::/16: 6to4 (RFC 3056, deprecated by RFC 7526 in 2015). Embeds a public IPv4 address into an IPv6 prefix to tunnel IPv6 over IPv4 without explicit configuration. Seldom used nowadays.
  • 2001::/32: Teredo (RFC 4380). IPv6 tunnelling over IPv4 NAT via UDP encapsulation. Transitional and seldom used nowadays.
  • fc00::/7: Unique Local Addresses (ULA) (RFC 4193). The IPv6 counterpart to RFC 1918 private address space. In practice, only fd00::/8 is used. ULA addresses are not globally routable, but are perfectly valid for local routing.
  • ::/96: IPv4-compatible addresses (deprecated by RFC 4291). Obsolete dual-stack mechanism embedding an IPv4 address in the low 32 bits. Deprecated in 2006.
  • fec0::/10: Site-local addresses (deprecated by RFC 3879 in 2004).
  • 3ffe::/16: 6bone test addresses (RFC 2471, deprecated by RFC 3701 in 2006). Address space used by the now-defunct IPv6 testing network. Any traffic on this prefix is misconfigured or relic traffic.

In our cross-site networks, we use local addresses (fd00::/8) a lot. So if one of the candidates is a ULA, we want it preferred over IPv4. That is not what the table says, given the precedence for IPv4-mapped addresses (35) vs ULA addresses (3).

A quick read in /etc/rc.d/ip6addrctl shows that it will load a custom IP selection policy from /etc/ip6addrctl.conf when in /etc/rc.conf you have ip6addrctl_policy=AUTO and the configuration file is present, readable, and non-empty.

Hence the new configuration (note: ::ffff:0.0.0.0/96 and ::ffff:0:0/96 are equivalent notations for the same prefix):

::1/128		 50	 0
::/0		 40	 1
::ffff:0:0/96	 35	 4
2002::/16	 30	 2
2001::/32	 5	 5
fc00::/7	 37	13
::/96		 1	 3
fec0::/10	 1	11
3ffe::/16	 1	12

But that would only allow preferring ULA over IPv4 mapped addresses. In our case the destination address, 2001:aaaa:bbbb::1 is globally routable and totally not a ULA, so the default configuration should work. Or did we miss something?

The astute reader might have noticed there is also a Label column. Also, the more knowledgeable reader might point out that all this IPv6 selection mechanism is described by RFC 6724. Here is what it has to say about the policy table, precedence, label, and selection algorithm:

   The policy table is a longest-matching-prefix lookup table, much like
   a routing table.  Given an address A, a lookup in the policy table
   produces two values: a precedence value denoted Precedence(A) and a
   classification or label denoted Label(A).

   The precedence value Precedence(A) is used for sorting destination
   addresses.  If Precedence(A) > Precedence(B), we say that address A
   has higher precedence than address B, meaning that our algorithm will
   prefer to sort destination address A before destination address B.

   The label value Label(A) allows for policies that prefer a particular
   source address prefix for use with a destination address prefix.  The
   algorithms prefer to use a source address S with a destination
   address D if Label(S) = Label(D).

and again with more details in Section 6. Destination Address Selection:

   Rule 5: Prefer matching label.
   If Label(Source(DA)) = Label(DA) and Label(Source(DB)) <> Label(DB),
   then prefer DA.  Similarly, if Label(Source(DA)) <> Label(DA) and
   Label(Source(DB)) = Label(DB), then prefer DB.

   Rule 6: Prefer higher precedence.
   If Precedence(DA) > Precedence(DB), then prefer DA.  Similarly, if
   Precedence(DA) < Precedence(DB), then prefer DB.

In our case, it’s Rule 5 that was causing IPv4 to be preferred. See, on host B, the routing table states that in order to reach 2001:aaaa:bbbb::1, you must go via a specific interface. That interface only has one IPv6 address configured, a ULA, so it is selected as the source address to reach that destination.

At the input of the destination address selection algorithm, we have those two candidates:

# source-address     destination-address

## IPv4 mapped candidate (from the A record)
::ffff:192.168.1.1   ::ffff:1.2.3.4

## IPv6 candidate (from the AAAA record)
fd08:aaaa:bbbb::1    2001:aaaa:bbbb::1

For the IPv4 mapped candidate, the source and destination label match (4). For the IPv6 candidate, the source and destination label don't match (13 != 1). Hence, by rule 5, the first candidate is preferred.
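The interplay of the two rules can be sketched as a tiny comparison function (hypothetical names; each candidate is reduced to a (source label, destination label, destination precedence) tuple):

```python
def choose(a, b):
    """Pick between two (src_label, dst_label, dst_precedence) candidates,
    applying RFC 6724 Rule 5 (prefer matching label) and then
    Rule 6 (prefer higher precedence)."""
    a_match = a[0] == a[1]
    b_match = b[0] == b[1]
    if a_match != b_match:        # Rule 5 decides
        return a if a_match else b
    if a[2] != b[2]:              # Rule 6 decides
        return a if a[2] > b[2] else b
    return a                      # tie: later rules would apply

# Host B with the default table:
ipv4_mapped = (4, 4, 35)   # ::ffff:192.168.1.1 -> ::ffff:1.2.3.4
ipv6 = (13, 1, 40)         # fd08:aaaa:bbbb::1  -> 2001:aaaa:bbbb::1
```

With the default table, choose(ipv4_mapped, ipv6) returns the IPv4-mapped candidate via Rule 5, despite its lower precedence. Once fc00::/7 is given label 1, the IPv6 candidate becomes (1, 1, 40): both labels match, and Rule 6 lets the higher precedence win.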

So the solution is to ensure both the ULA and globally routable lines of the policy share the same label:

::1/128		 50	 0
::/0		 40	 1
::ffff:0:0/96	 35	 4
2002::/16	 30	 2
2001::/32	 5	 5
fc00::/7	 37	 1
::/96		 1	 3
fec0::/10	 1	11
3ffe::/16	 1	12

Note that the problem of ULA being disfavored is explicitly acknowledged in Section 10.6 (Configuring ULA Preference) of the RFC. I quote:

   [...] By default, global IPv6 destinations are preferred over
   ULA destinations, since an arbitrary ULA is not necessarily
   reachable.

   [...]

   However, a site-specific policy entry can be used to cause ULAs
   within a site to be preferred over global addresses [...].

When you work with ULAs and globally routable addresses in cross-site networks, the prefixes used are generally known in advance and static. The recommended way is to add dedicated policies for those prefixes with higher precedence, and to ensure that the labels match if those ULAs can also be used as source addresses to reach your own globally routable IPs:

# custom policies for our network
fd08:aaaa:bbbb::/48  42	 14
2001:aaaa:bbbb::/64  41  14
2001:aaaa:cccc::/64  41  14

# default automatic policy with IPv6 prefer
::1/128		     50	 0
::/0		     40	 1
::ffff:0:0/96	     35	 4
2002::/16	     30	 2
2001::/32	     5	 5
fc00::/7	     3	13
::/96		     1	 3
fec0::/10	     1	11
3ffe::/16	     1	12

Here, our ULA prefix and our globally routable prefixes are all assigned the same label (14), ensuring that rule 5 (prefer matching label) never penalizes a ULA source address when reaching one of our own globally routable destinations, and vice versa. They are also given higher precedence than the default ::/0 and ::ffff:0.0.0.0/96 entries, so our known prefixes are always preferred over the generic fallback behavior. For any address outside these explicitly listed prefixes, the default policy applies unchanged.

This was for FreeBSD, though. What about Linux, I hear you ask? There, this is configured in /etc/gai.conf. The syntax changes slightly, but surely with the explanation above you will figure it out.

awscli v2.34.3 on FreeBSD 15.0: _awscrt shenanigans

I had been running AWS CLI v2 on FreeBSD for a while, and it stopped working after I updated to the latest release. I checked out v2.34.3 from the awscli v2 repo and installed the dependencies from requirements.txt, making sure there were no conflicts with locally installed packages, in particular awscrt, for which I used the latest available version as pinned in the requirements. The aws command itself ran fine:

» aws --version
aws-cli/2.34.3 Python/3.11.14 FreeBSD/15.0-RELEASE-p2 source/amd64

But as soon as it needed to do anything real, it fell apart:

» aws configure sso

aws: [ERROR]: module '_awscrt' has no attribute 'set_log_level'

A bit of digging showed awscrt/io.py calling _awscrt.set_log_level(log_level).

For some context, the awscrt package is the AWS Common Runtime binding for Python, and it’s split into two parts. The pure Python layer lives in the awscrt package and handles the high-level API. The actual work is done by _awscrt, a C extension compiled as a library (a .so file) which on a typical FreeBSD + Python 3.11 setup would live at /usr/local/lib/python3.11/site-packages/_awscrt.abi3.so.

I had manually installed awscrt==0.31.2, the latest release at the time of writing, and also exactly the version pinned in awscli’s requirements.txt. Still, the error persisted. The Python wrapper was clearly calling something the .so didn’t expose.

The clue came from checking where Python was actually loading things from:

» python3 -c "import awscrt; print(awscrt.__version__, awscrt.__file__)"
0.31.2 /usr/local/lib/python3.11/site-packages/awscrt/__init__.py
» python3 -c "import _awscrt; print(_awscrt.__file__)"
/home/myuser/.local/lib/python3.11/site-packages/_awscrt.abi3.so

There it is. The Python part of awscrt was correctly loading from the system-wide site-packages. But _awscrt, the compiled extension, was being pulled from my user’s local site-packages. Probably a leftover from a previous install that I hadn’t properly cleaned up.
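This check can be scripted with a small helper (a hypothetical name, not part of any awscli tooling): resolve a module’s spec and normalize to the site-packages-like root it was installed under, so a pure-Python package and its C extension can be compared. For awscrt you would compare install_root("awscrt") against install_root("_awscrt"); here it is sketched with stdlib modules:

```python
import importlib.util
import os

def install_root(name):
    """Return the directory a module was installed under.

    For a package (an __init__.py origin), go up one extra level so the
    result is the site-packages-like root, comparable with the root of
    a top-level C extension (.so) module. Built-in/frozen modules have
    no filesystem origin and yield None.
    """
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin in (None, "built-in", "frozen"):
        return None
    path = os.path.dirname(spec.origin)
    if os.path.basename(spec.origin).startswith("__init__."):
        path = os.path.dirname(path)  # package dir -> its parent
    return path
```

If the two roots disagree (say, /usr/local/lib/python3.11/site-packages vs ~/.local/lib/python3.11/site-packages), you have found the stale leftover.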

After removing the dangling file from ~/.local/lib/python3.11/site-packages/, the check came back clean:

» python3 -c "import _awscrt; print(_awscrt.__file__)"
/usr/local/lib/python3.11/site-packages/_awscrt.abi3.so

And with that, aws configure sso ran without complaint.

Avoid rebuilding rust for FreeBSD ports

Today I tried to rebuild a rust-based port on FreeBSD. It tried to build lang/rust from scratch even though rust was already installed. The problem was that the latest binary package for rust was 1.86.0, while the latest version in the ports tree was 1.87.0. Digging in /usr/ports/Mk/Uses/cargo.mk, there is:

CARGO_BUILDDEP?=  yes
.  if ${CARGO_BUILDDEP:tl} == "yes"
BUILD_DEPENDS+= ${RUST_DEFAULT}>=1.87.0:lang/${RUST_DEFAULT}
.  elif ${CARGO_BUILDDEP:tl} == "any-version"
BUILD_DEPENDS+= ${RUST_DEFAULT}>=0:lang/${RUST_DEFAULT}
.  endif

That’s the bit actually enforcing the build dependency. But as you can see, it’s easy to bypass with export CARGO_BUILDDEP=no. Just ensure that you have rust installed, either via rustup or from packages.

FreeBSD modules not loading correctly on ARM64

After upgrading to FreeBSD 14.2, I encountered a perplexing issue with kernel modules built from ports. They would load and show up in kldstat, but no message was printed and no sysctl node was created. In fact, it was as if the event handler was never called, yet the module compiled and loaded successfully. Modules shipped precompiled with the kernel, on the other hand, worked as intended. To investigate, I built a small test module:

#include <sys/param.h>
#include <sys/module.h>
#include <sys/kernel.h>
#include <sys/systm.h>

static int test_event_handler(module_t mod, int event, void *arg) {
    printf("Test module loaded, event: %d\n", event);
    return (0);
}

static moduledata_t test_mod = {
    "testmodule",
    test_event_handler,
    NULL
};

DECLARE_MODULE(testmodule, test_mod, SI_SUB_LAST, SI_ORDER_ANY);
MODULE_VERSION(testmodule, 1);

On FreeBSD 14.2 amd64, it would load and print the message to the log, whereas on FreeBSD 14.2 arm64, it would load but produce no output. Yet, disassembling the module showed the event handler code was right there.

After some investigation, I found out that while the sources were compiled with /usr/bin/cc (LLVM clang), they were linked with /usr/local/bin/ld (GNU binutils ld). After uninstalling binutils and compiling again with both /usr/bin/cc and /usr/bin/ld, the module would load and show up in the log. Why this only appeared with FreeBSD 14.2, however, is still a mystery.

Install awscli2 on FreeBSD 14.1-p3

Whilst I often use the awscli on Linux or macOS, for which there are binary installers, I also tend to work on personal projects on FreeBSD. Unfortunately, this OS is not supported by AWS. Some would recommend using the Linuxulator, but surely running a native version would be better. So here is a quick step-through of how I got it running on FreeBSD 14.1-p3 with python311, thanks mostly to this GitHub issue and also this one. Although this is done with python311 and py311-pip, you can probably use the same method for older versions of Python down to python39.

First we need to clone awscli2 from https://github.com/aws/aws-cli. Be careful to select the v2 branch.

git clone https://github.com/aws/aws-cli.git
cd aws-cli
git checkout v2

Then, I ran into a problem with pyOpenSSL. After installing it, executing awscli returned the following error message: ERR_UNABLE_TO_GET_ISSUER_CERT = _lib.X509_V_ERR_UNABLE_TO_GET_ISSUER_CERT. Thanks to this other issue, it seems a quick solution was to downgrade pyOpenSSL (but this step might not be necessary for you, future reader blessed with a fix):

sudo pip install pyOpenSSL==23.1.0

Then it’s a matter of installing the requirements and building/installing the command line tool (note that it needs gcc though; in my case gcc13-13.3.0):

sudo pip install -r requirements.txt
sudo CC=/usr/local/bin/gcc pip install -e .

And tada!

» aws --version
aws-cli/2.17.49 Python/3.11.9 FreeBSD/14.1-RELEASE-p3 source/amd64

Port configure fails on ARM64

On FreeBSD, if you are trying to build a port but it fails at the configure step with a message similar to this:

checking build system type... Invalid configuration `arm64-portbld-freebsd13.2': machine `arm64-portbld' not recognized
configure: error: /bin/sh ./build-aux/config.sub arm64-portbld-freebsd13.2 failed

Here’s a quick-fix that might work for you:

export CONFIGURE_TARGET=arm64-unknown-freebsd13.2
make install

This is similar to passing --host arm64-unknown-freebsd13.2 to the configure script instead of trying to guess it.

Devd doesn’t trigger LINK_UP

On FreeBSD, you can use devd to trigger scripts that react to device state changes. For instance, you plug/remove a device, or you connect/disconnect an Ethernet cable.

I had to use this kind of rule to restart a service when an interface is reconnected. However the rule would not trigger when the cable was reconnected.

The reason was that default rules in /etc/devd.conf were failing, thereby stopping the execution of subsequent rules. In particular, the action "service dhclient quietstart $subsystem".

The solution was either to comment these lines in devd.conf or give my custom devd configuration a higher priority.

Override rc order in FreeBSD

In FreeBSD, as in most other operating systems, the boot process consists of starting a set of scripts/services/daemons/processes. Each of those has constraints, like depending on or starting before other scripts.

On a default FreeBSD install, this order is determined by the packages you install, each of them installing a script in /usr/local/etc/rc.d that specifies its requirements.

What if, however, you wanted to change the order of the boot process? For instance, you have a script that by default starts just after the network is ready, but in your case it specifically has to start after another script for everything to work properly.

Well, I was confronted with that particular problem, and the answer is cross-dependency scripts, or whatever you want to call them.
Suppose that you have the following scripts in your boot process: A, B, C, D. By default, B, C and D start just after A. But you want to change that so B starts after D and C starts after B.

If you changed the order dependency in scripts B and C directly, that change would be overwritten on the next package update. Instead, we add two empty scripts __B and __C that just enforce the dependency. That is, __B starts after D and before B, and __C starts after B and before C.

Looking at the code, at the beginning of the original scripts you would find:

-- rc.d/A
#!/bin/sh

# PROVIDE: A
-- rc.d/B
#!/bin/sh

# PROVIDE: B
# REQUIRE: A
-- rc.d/C
#!/bin/sh

# PROVIDE: C
# REQUIRE: A
-- rc.d/D
#!/bin/sh

# PROVIDE: D
# REQUIRE: A

Thus you would add two scripts __B and __C that contain:

-- rc.d/__B
#!/bin/sh

# PROVIDE: __B
# REQUIRE: D
# BEFORE: B
-- rc.d/__C
#!/bin/sh

# PROVIDE: __C
# REQUIRE: B
# BEFORE: C
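The resulting order can be checked with a small sketch of what rcorder(8) computes from those PROVIDE/REQUIRE/BEFORE lines. The constraint table below is transcribed by hand from the scripts above; rcorder itself parses the keywords out of the files:

```python
import heapq
from collections import defaultdict

# name: (REQUIRE list, BEFORE list), transcribed from the rc.d scripts
SCRIPTS = {
    "A":   ([], []),
    "B":   (["A"], []),
    "C":   (["A"], []),
    "D":   (["A"], []),
    "__B": (["D"], ["B"]),
    "__C": (["B"], ["C"]),
}

def boot_order(scripts):
    """Topological sort (Kahn's algorithm) over the constraints,
    roughly what rcorder(8) does to decide the boot sequence."""
    after = defaultdict(set)   # u -> scripts that must start after u
    indeg = {name: 0 for name in scripts}
    for name, (requires, before) in scripts.items():
        for r in requires:     # r starts before name
            if name not in after[r]:
                after[r].add(name)
                indeg[name] += 1
        for b in before:       # name starts before b
            if b not in after[name]:
                after[name].add(b)
                indeg[b] += 1
    heap = [n for n, d in indeg.items() if d == 0]
    heapq.heapify(heap)
    order = []
    while heap:
        n = heapq.heappop(heap)
        order.append(n)
        for m in sorted(after[n]):
            indeg[m] -= 1
            if indeg[m] == 0:
                heapq.heappush(heap, m)
    return order
```

Running boot_order(SCRIPTS) yields ['A', 'D', '__B', 'B', '__C', 'C']: the empty cross-dependency scripts force D before B and B before C, without ever editing the package-owned scripts.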