Wifibox: fixing IPv6 connectivity with ULA routing

While FreeBSD’s WiFi stack has been improving steadily, it still lags behind Linux. Wifibox bridges that gap by deploying a Linux guest to drive a wireless card on the FreeBSD host via PCI pass-through — and it works remarkably well.

That said, I ran into a connectivity issue: connections to hosts resolving to both IPv4 and IPv6 addresses were silently failing on the IPv6 path. The culprit was IPv6 address selection.

Wifibox relies on a ULA (Unique Local Address) prefix, fd00::/64, for its internal IPv6 routing. The problem is that, with FreeBSD's default ip6addrctl policy, a ULA source address and a global unicast address (GUA) destination end up with different labels, so the selection algorithm penalizes that pairing and the connection attempt fails or falls back unexpectedly.

The fix is to add a policy entry for the ULA prefix and enable the custom policy in /etc/rc.conf:

ip6addrctl_policy="AUTO"

Then in /etc/ip6addrctl.conf:

# enable selecting GUA destination with fd00::/64 as source address
fd00::/64 37 1

# default automatic policy (IPv6 preferred)
::1/128          50  0
::/0             40  1
::ffff:0:0/96    35  4
2002::/16        30  2
2001::/32         5  5
fc00::/7          3 13
::/96             1  3
fec0::/10         1 11
3ffe::/16         1 12
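
To load the new table without a reboot, re-running the rc script should be enough, and ip6addrctl show confirms the result:

$ service ip6addrctl restart
$ ip6addrctl show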

This is a direct follow-up to my previous post IPv6 address selection and cross-site ULAs, which covers the underlying mechanics in more detail.

IPv6 address selection and cross-site ULAs

All I wanted was to update some SSH keys…
But for reasons that I’ll spare you, it turned out to be a tad more complicated.
Also, in what follows, the context will be simplified for clarity.

To make the basic context of this post clear: this is on FreeBSD, and it's all about IPv4 vs IPv6 address selection. Which address should I use to reach xyz.com? If you mostly just use your web browser, this has probably never looked like a problem. Browsers implement what is called the Happy Eyeballs algorithm: basically, whichever of IPv4/IPv6 responds first gets used.

But Happy Eyeballs is generally not implemented outside browsers. That's why your package manager sometimes picks IPv6 or IPv4, the pick turns out to be unreachable, and it just sits there. That's also why some web apps work perfectly fine in your browser but are sluggish at best on your phone, where clicking any button or link makes the app hang forever.

Chances are the app didn't implement the aforementioned algorithm: it had a route to the destination, but for some reason the destination isn't actually reachable over it, so it tries IPv6 and just has to wait for some timeout. That's probably why most people simply disable IPv6. They really shouldn't; they should instead blame their sysadmin/netadmin/ISP for doing a bad job. Those are the ones slowing down IPv6 adoption, not the users.

So back to the problem at hand.

At some point, I had to reach the same destination (read: the same FQDN) via SSH from two different hosts. The destination's DNS resolution presented both an A (IPv4) record and an AAAA (IPv6) record. On the first host (let's call it host A), IPv6 was preferred to reach the destination (good). On the second host (host B), IPv4 was preferred instead (bad).

Both hosts received the exact same DNS answers (both A and AAAA records), both have a globally routable IPv6 address and an IPv6 route to the destination, and both can ping6 the actual destination. In other words, the destination is perfectly reachable via IPv6. Why then would A prefer IPv6 and B prefer IPv4?

In the FreeBSD system configuration file (/etc/rc.conf), there is an ip6addrctl_policy knob that can be set to ipv6_prefer. As the name suggests, when presented with both an IPv4 and an IPv6 address to reach a destination, the system should then choose the latter by default. It was set to this value on both host A and host B, so both should have chosen IPv6, but host B didn't.
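
That is, both hosts had this in /etc/rc.conf:

ip6addrctl_policy="ipv6_prefer"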

Behind the ip6addrctl_policy setting sits the ip6addrctl command, which actually configures the address selection policy. Let's see what it has to say:

$ ip6addrctl show
Prefix                          Prec Label      Use
::1/128                           50     0        0
::/0                              40     1     6131
::ffff:0.0.0.0/96                 35     4        0
2002::/16                         30     2        0
2001::/32                          5     5        0
fc00::/7                           3    13        0
::/96                              1     3        0
fec0::/10                          1    11        0
3ffe::/16                          1    12        0

That is the policy that is configured with ipv6_prefer. The selection goes as follows. Suppose the DNS resolution gave you two candidate IP addresses, and for the sake of our example, let’s suppose it’s 1.2.3.4 and 2001:aaaa:bbbb::1.

For each IP, as with a routing table, the longest prefix (i.e. the most specific match) wins. So for 1.2.3.4, it translates to the IPv4 mapped address ::ffff:1.2.3.4. Hence it selects the line ::ffff:0.0.0.0/96 with precedence 35.

Similarly for 2001:aaaa:bbbb::1, the longest matching prefix is ::/0 hence it is selected with precedence 40. Note that it’s not 2001::/32, which with zero expansion in the network prefix is really 2001:0000::/32.

Between those two:

::ffff:0.0.0.0/96 with precedence 35
::/0              with precedence 40

it will choose the candidate that matched the line with higher precedence, 40 > 35 so the address that matched ::/0 will be retained. So 2001:aaaa:bbbb::1 is selected to reach the destination.

That still doesn’t explain why IPv4 is preferred on host B.

Notice that there are several other lines below ::ffff:0.0.0.0/96. What are those?

  • 2002::/16: 6to4 (RFC 3056, deprecated by RFC 7526 in 2015). Embeds a public IPv4 address into an IPv6 prefix to tunnel IPv6 over IPv4 without explicit configuration. Seldom used nowadays.
  • 2001::/32: Teredo (RFC 4380). IPv6 tunnelling over IPv4 NAT via UDP encapsulation. Transitional and seldom used nowadays.
  • fc00::/7: Unique Local Addresses (ULA) (RFC 4193). The IPv6 counterpart to RFC 1918 private address space. In practice, only fd00::/8 is used. ULA addresses are not globally routable, but are perfectly valid for local routing.
  • ::/96: IPv4-compatible addresses (deprecated by RFC 4291). Obsolete dual-stack mechanism embedding an IPv4 address in the low 32 bits. Deprecated in 2006.
  • fec0::/10: Site-local addresses (deprecated by RFC 3879 in 2004).
  • 3ffe::/16: 6bone test addresses (RFC 2471; phased out per RFC 3701, completed in 2006). Address space used by the now-defunct IPv6 testing network. Any traffic on this prefix is misconfigured or relic traffic.

In our cross-site networks, we use local addresses (fd00::/8) a lot. So when one of the candidates is a ULA, we want it preferred over IPv4. The default table does the opposite: IPv4-mapped addresses get precedence 35 versus 3 for ULAs.

A quick read in /etc/rc.d/ip6addrctl shows that it will load a custom IP selection policy from /etc/ip6addrctl.conf when in /etc/rc.conf you have ip6addrctl_policy=AUTO and the configuration file is present, readable, and non-empty.
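
So the first step is to set this in /etc/rc.conf:

ip6addrctl_policy="AUTO"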

Hence the new configuration (note: ::ffff:0.0.0.0/96 and ::ffff:0:0/96 are equivalent notations for the same prefix):

::1/128		 50	 0
::/0		 40	 1
::ffff:0:0/96	 35	 4
2002::/16	 30	 2
2001::/32	 5	 5
fc00::/7	 37	13
::/96		 1	 3
fec0::/10	 1	11
3ffe::/16	 1	12

But that would only allow preferring ULA over IPv4 mapped addresses. In our case the destination address, 2001:aaaa:bbbb::1 is globally routable and totally not a ULA, so the default configuration should work. Or did we miss something?

The astute reader might have noticed there is also a Label column. Also, the more knowledgeable reader might point out that all this IPv6 selection mechanism is described by RFC 6724. Here is what it has to say about the policy table, precedence, label, and selection algorithm:

   The policy table is a longest-matching-prefix lookup table, much like
   a routing table.  Given an address A, a lookup in the policy table
   produces two values: a precedence value denoted Precedence(A) and a
   classification or label denoted Label(A).

   The precedence value Precedence(A) is used for sorting destination
   addresses.  If Precedence(A) > Precedence(B), we say that address A
   has higher precedence than address B, meaning that our algorithm will
   prefer to sort destination address A before destination address B.

   The label value Label(A) allows for policies that prefer a particular
   source address prefix for use with a destination address prefix.  The
   algorithms prefer to use a source address S with a destination
   address D if Label(S) = Label(D).

and again with more details in Section 6. Destination Address Selection:

   Rule 5: Prefer matching label.
   If Label(Source(DA)) = Label(DA) and Label(Source(DB)) <> Label(DB),
   then prefer DA.  Similarly, if Label(Source(DA)) <> Label(DA) and
   Label(Source(DB)) = Label(DB), then prefer DB.

   Rule 6: Prefer higher precedence.
   If Precedence(DA) > Precedence(DB), then prefer DA.  Similarly, if
   Precedence(DA) < Precedence(DB), then prefer DB.

In our case, it's Rule 5 that was causing IPv4 to be preferred. See, on host B, the route table states that in order to reach 2001:aaaa:bbbb::1, you must go via a specific interface. That interface only has one ULA configured, so it is selected as the source address to reach that destination.

At the input of the destination address selection algorithm, we have those two candidates:

# source-address     destination-address

## IPv4 mapped candidate (from the A record)
::ffff:192.168.1.1   ::ffff:1.2.3.4

## IPv6 candidate (from the AAAA record)
fd08:aaaa:bbbb::1    2001:aaaa:bbbb::1

For the IPv4 mapped candidate, the source and destination label match (4). For the IPv6 candidate, the source and destination label don't match (13 != 1). Hence, by rule 5, the first candidate is preferred.
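
To make the mechanics concrete, here is a small Python sketch of rules 5 and 6 only, run against the default policy table above with the placeholder candidates from this post. It is an illustration, not the real thing: the actual algorithm has ten rules and runs inside the kernel/getaddrinfo.

# Simplified sketch of RFC 6724 rules 5 and 6, using the default
# "ipv6_prefer" policy table shown above. Illustration only.
from ipaddress import ip_address, ip_network

POLICY = [  # (prefix, precedence, label)
    ("::1/128",       50,  0),
    ("::/0",          40,  1),
    ("::ffff:0:0/96", 35,  4),
    ("2002::/16",     30,  2),
    ("2001::/32",      5,  5),
    ("fc00::/7",       3, 13),
    ("::/96",          1,  3),
    ("fec0::/10",      1, 11),
    ("3ffe::/16",      1, 12),
]

def lookup(addr):
    """Longest-prefix match of addr in the policy table -> (precedence, label)."""
    a = ip_address(addr)
    matches = [(ip_network(p), prec, lab) for p, prec, lab in POLICY
               if a in ip_network(p)]
    _, prec, lab = max(matches, key=lambda m: m[0].prefixlen)
    return prec, lab

def prefer(cand_a, cand_b):
    """Each candidate is a (source, destination) pair; return the preferred one."""
    (sa, da), (sb, db) = cand_a, cand_b
    # Rule 5: prefer the candidate whose source and destination labels match.
    match_a = lookup(sa)[1] == lookup(da)[1]
    match_b = lookup(sb)[1] == lookup(db)[1]
    if match_a != match_b:
        return cand_a if match_a else cand_b
    # Rule 6: prefer the destination with the higher precedence.
    return cand_a if lookup(da)[0] >= lookup(db)[0] else cand_b

ipv4_mapped = ("::ffff:192.168.1.1", "::ffff:1.2.3.4")     # from the A record
ipv6_native = ("fd08:aaaa:bbbb::1",  "2001:aaaa:bbbb::1")  # from the AAAA record
print(prefer(ipv4_mapped, ipv6_native))  # rule 5 picks the IPv4-mapped candidate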

So the solution is to ensure both the ULA and globally routable lines of the policy share the same label:

::1/128		 50	 0
::/0		 40	 1
::ffff:0:0/96	 35	 4
2002::/16	 30	 2
2001::/32	 5	 5
fc00::/7	 37	 1
::/96		 1	 3
fec0::/10	 1	11
3ffe::/16	 1	12

Note that the problem of ULAs being disfavored is explicitly acknowledged in Section 10.6 (Configuring ULA Preference) of the RFC. I quote:

   [...] By default, global IPv6 destinations are preferred over
   ULA destinations, since an arbitrary ULA is not necessarily
   reachable.

   [...]

   However, a site-specific policy entry can be used to cause ULAs
   within a site to be preferred over global addresses [...].

When you work with ULAs and globally routable addresses in cross-site networks, the prefixes involved are generally known in advance and static. The recommended approach is to add dedicated policy entries for those prefixes with higher precedence, and to make their labels match if the ULAs can also be used as source addresses to reach your own globally routable IPs:

# custom policies for our network
fd08:aaaa:bbbb::/48  42	 14
2001:aaaa:bbbb::/64  41  14
2001:aaaa:cccc::/64  41  14

# default automatic policy with IPv6 prefer
::1/128		     50	 0
::/0		     40	 1
::ffff:0:0/96	     35	 4
2002::/16	     30	 2
2001::/32	     5	 5
fc00::/7	     3	13
::/96		     1	 3
fec0::/10	     1	11
3ffe::/16	     1	12

Here, our ULA prefix and our globally routable prefixes are all assigned the same label (14), ensuring that rule 5 (prefer matching label) never penalizes a ULA source address when reaching one of our own globally routable destinations, and vice versa. They are also given higher precedence than the default ::/0 and ::ffff:0.0.0.0/96 entries, so our known prefixes are always preferred over the generic fallback behavior. For any address outside these explicitly listed prefixes, the default policy applies unchanged.

This was for FreeBSD, though. What about Linux, I hear you ask? There, this is configured in /etc/gai.conf. The syntax changes slightly, but surely with the explanation above you will figure it out.
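
For reference, here is a hypothetical (untested) /etc/gai.conf sketch of the same idea. One caveat: as soon as a single label line is present, glibc discards its entire built-in label table, so the default label entries from gai.conf(5) must be restated alongside the custom ones.

# custom entries for our prefixes, same idea as the FreeBSD policy above
# (the default label lines from gai.conf(5) must also be restated in this file)
label fd08:aaaa:bbbb::/48  14
label 2001:aaaa:bbbb::/64  14
label 2001:aaaa:cccc::/64  14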

Unbound stub-zone for reverse private IPv6

Today I tried to configure a stub-zone on a unbound resolver. This was for the reverse resolution of some private IPv6. In unbound.conf, it looks something like this:

stub-zone:
  name: X.X.X.X.X.X.d.f.ip6.arpa.
  stub-addr: {authoritative-server-ip}

But trying a reverse resolution on any of those private IPv6 failed:

$ drill -x fdXX:XXXX::XXXX
;; AUTHORITY SECTION:
d.f.ip6.arpa.	10800	IN	SOA	localhost. nobody.invalid. 1 3600 1200 604800 10800

Found out the problem in a snippet from unbound.conf.sample:

# By default, for a number of zones a small default 'nothing here'
# reply is built-in.  Query traffic is thus blocked.  If you
# wish to serve such zone you can unblock them by uncommenting one
# of the nodefault statements below.
# You may also have to use domain-insecure: zone to make DNSSEC work,
# unless you have your own trust anchors for this zone.
# local-zone: "localhost." nodefault
# local-zone: "127.in-addr.arpa." nodefault
# local-zone: "1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.ip6.arpa." nodefault
# local-zone: "home.arpa." nodefault
# local-zone: "resolver.arpa." nodefault
# local-zone: "service.arpa." nodefault
# local-zone: "onion." nodefault
# local-zone: "test." nodefault
# local-zone: "invalid." nodefault
# local-zone: "10.in-addr.arpa." nodefault
# local-zone: "16.172.in-addr.arpa." nodefault
# local-zone: "17.172.in-addr.arpa." nodefault
# local-zone: "18.172.in-addr.arpa." nodefault
# local-zone: "19.172.in-addr.arpa." nodefault
# local-zone: "20.172.in-addr.arpa." nodefault
# local-zone: "21.172.in-addr.arpa." nodefault
# local-zone: "22.172.in-addr.arpa." nodefault
# local-zone: "23.172.in-addr.arpa." nodefault
# local-zone: "24.172.in-addr.arpa." nodefault
# local-zone: "25.172.in-addr.arpa." nodefault
# local-zone: "26.172.in-addr.arpa." nodefault
# local-zone: "27.172.in-addr.arpa." nodefault
# local-zone: "28.172.in-addr.arpa." nodefault
# local-zone: "29.172.in-addr.arpa." nodefault
# local-zone: "30.172.in-addr.arpa." nodefault
# local-zone: "31.172.in-addr.arpa." nodefault
# local-zone: "168.192.in-addr.arpa." nodefault
# local-zone: "0.in-addr.arpa." nodefault
# local-zone: "254.169.in-addr.arpa." nodefault
# local-zone: "2.0.192.in-addr.arpa." nodefault
# local-zone: "100.51.198.in-addr.arpa." nodefault
# local-zone: "113.0.203.in-addr.arpa." nodefault
# local-zone: "255.255.255.255.in-addr.arpa." nodefault
# local-zone: "0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.ip6.arpa." nodefault
# local-zone: "d.f.ip6.arpa." nodefault
# local-zone: "8.e.f.ip6.arpa." nodefault
# local-zone: "9.e.f.ip6.arpa." nodefault
# local-zone: "a.e.f.ip6.arpa." nodefault
# local-zone: "b.e.f.ip6.arpa." nodefault
# local-zone: "8.b.d.0.1.0.0.2.ip6.arpa." nodefault
# And for 64.100.in-addr.arpa. to 127.100.in-addr.arpa.

As you can see d.f.ip6.arpa. is blocked by default, so just had to add this line to unblock it:

local-zone: "d.f.ip6.arpa." nodefault

D-Link DIR AP and IPv6

If you have one of those very common D-Link DIR WiFi routers/APs (such as the DIR-605L rev B2), you should know that they don’t get along very well with IPv6. At least the vendor firmware doesn’t. Most people probably don’t even notice, and probably don’t care either. But that’s really no good 🙁

Apparently not all IPv6 messages get through. ICMPv6 echo-requests on link-local addresses are not a problem, but RS (Router Solicitation) messages between the Ethernet ports don’t always make it. Here is a little diagram of what goes through and what doesn’t:

This was tested with FreeBSD 11.1-RELEASE-p9 and Debian 9.4.0, on two different hosts (my trusty ol’ ThinkPad X201 and a homemade station with a Ralink Ethernet card). There are still a lot of packet types left to test, and I don’t know whether other IPv6 packets get through or not. But RS/RA don’t, and that alone is sufficient to make IPv6 completely unusable.

I will update the figure above when I know more, although that may be never. One solution would be to flash the firmware to DD-WRT or Tomato, but neither supports the DIR-605L rev B2. Besides, this unit was only supposed to be a temporary replacement, and we have already bought a much better alternative.

Filter ICMPv6 with tcpdump

If you want to filter ICMP echo-requests with tcpdump, you can use this command:

tcpdump -i em0 "icmp[0] == 8"

But it doesn’t work if you try the same syntax with ICMPv6:

tcpdump -i em0 "icmp6[0] == 128"
tcpdump: IPv6 upper-layer protocol is not supported by proto[x]

Instead you can match on the IPv6 payload directly. The fixed IPv6 header is 40 bytes long, and the first byte of the ICMPv6 header that follows it (assuming no extension headers in between) is the type:

tcpdump -i em0 "icmp6 && ip6[40] == 128"

The most common ICMPv6 types are:

  • unreachable: 1
  • too-big: 2
  • time-exceeded: 3
  • echo-request: 128
  • echo-reply: 129
  • router-solicitation: 133
  • router-advertisement: 134
  • neighbor-solicitation: 135
  • neighbor-advertisement: 136
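
For example, to capture only Router Solicitations and Router Advertisements (the messages the D-Link AP above was dropping):

tcpdump -i em0 "icmp6 && (ip6[40] == 133 || ip6[40] == 134)"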

IPv6 rDNS stub-zone on unbound

We have stub-zones configured on our gateway for reverse IPv6. Our ISP doesn’t delegate rDNS, but we still want to look up addresses, at least on the local side. To do so, I configured a stub-zone from unbound, our local caching DNS, to our own authoritative rDNS server. Apparently unbound wants an IPv6 address for its IPv6 rDNS queries to the stub-zone, and since IPv6 is not always working here, I solved that using the loopback interface: the authoritative server listens on localhost:5353 and unbound uses that as its stub-zone addresses.

On the authoritative (here NSD):

 ip-address: ::1@5353
 ip-address: 127.0.0.1@5353
 do-ip6: yes

On unbound, note that we need do-not-query-localhost: no to allow queries on localhost:

 do-not-query-localhost: no

stub-zone:
 name: 0.0.0.0.0.0.0.0.8.b.d.0.1.0.0.2.ip6.arpa.
 stub-addr: ::1@5353
 stub-addr: 127.0.0.1@5353
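
One more caveat: the 2001:db8::/32 documentation prefix used in this example falls under 8.b.d.0.1.0.0.2.ip6.arpa., which unbound blocks by default just like d.f.ip6.arpa. in the stub-zone post above. If your reverse zone sits under one of those blocked defaults, it needs the same treatment in the server: clause:

server:
 local-zone: "8.b.d.0.1.0.0.2.ip6.arpa." nodefault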

Back in v6

After quite a long time we are now back in the v6 world thanks to Hurricane Electric.
We lost our IPv6 connectivity when migrating our VPS from OpenVZ to KVM: there is no IPv6 on the newer OVH VPS 2016 range, although we had it on the older offer. I don’t know why, and when asked via a ticket they assured me it would be available soon. That was months ago, there is still no IPv6, and I am not alone. This is getting long and really awkward for OVH, supposedly the 3rd largest hosting provider in the world.

Setting up the tunnel with HE was painless. You just configure a simple 6in4 tunnel (gif on BSD, sit on Linux) from your IPv4 address to the endpoint they provide, and you are done!
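
On the FreeBSD side, it boils down to a few rc.conf lines. This is an untested sketch with documentation/placeholder addresses; the real values come from your tunnel details on tunnelbroker.net:

cloned_interfaces="gif0"
ifconfig_gif0="tunnel 192.0.2.10 203.0.113.1"                         # local IPv4 -> HE server IPv4 (placeholders)
ifconfig_gif0_ipv6="inet6 2001:db8:1::2 2001:db8:1::1 prefixlen 128"  # client -> server IPv6 (placeholders)
ipv6_defaultrouter="2001:db8:1::1"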

In the meantime we also configured the IPv6 prefix that we received from our ISP. I used ndppd, an NDP proxy daemon, so that our ISP modem believes the IPv6 hosts are located on the same link rather than one or more hops away (as they really are: there is an intermediate router between our LAN and the modem). So we don’t need SixXS anymore, which is great!
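
The ndppd side is tiny. A minimal, untested sketch, assuming the modem-facing interface is called eth0 and the delegated prefix is 2001:db8:2::/64 (both placeholders); the static rule makes ndppd answer every neighbor solicitation it sees for that prefix:

proxy eth0 {
    rule 2001:db8:2::/64 {
        static
    }
}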

IPv6 representation in NSD

I use NSD as my authoritative DNS. Today I made a mistake while entering new AAAA records; actually, it was a bug in one of the scripts we use to manage our zones. Take the following representation: 2001:aaaa:bbbb:cccc::1111:2222:3333:4444. It is not a valid representation of an IPv6 address. According to RFC 4291, the double-colon symbol (::) indicates one or more groups of 16 bits of zeros. Since 8 groups of 16 bits are already written out explicitly in that representation, all 128 bits of the address are accounted for, so there is no room for any extra group of zeros. The reason I made this mistake is that if you often work with /48 prefixes and don’t care about your 65k subnets, all your IPv6 addresses end up having zeros between the prefix and the interface ID.
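
To illustrate with the placeholder address from above (which of the valid forms was actually intended depends on where the zeros were meant to go):

2001:aaaa:bbbb:cccc::1111:2222:3333:4444   # invalid: eight groups are already spelled out, yet "::" is present
2001:aaaa:bbbb:cccc:1111:2222:3333:4444    # valid: the same eight groups, with the "::" simply dropped
2001:aaaa:bbbb:cccc::3333:4444             # valid: "::" stands in for the two missing zero groups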

Most programs will detect this as invalid, probably because they use inet_pton() to translate the address into its binary representation. Some others, such as ipv6calc, parse the address manually and do not consider this a buggy representation; in that case, the group is considered empty and the double-colon is implicitly translated to a single colon.

If you use this kind of invalid representation in an AAAA record, NSD will not reload your zone correctly, but it will not complain either, and queries will keep being answered from a database that still contains your old zone. Of course, this would not happen if you used nsd-checkzone to check your zone before reloading it.
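
A zone containing the invalid AAAA above would be caught at that stage; the check is a one-liner (hypothetical zone name and path):

$ nsd-checkzone example.com /etc/nsd/zones/example.com.zone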