Introduction

As I explained in an earlier blog post, I am attempting to convert our campus to IPv6-native networking, limiting our address families to one (IPv6) instead of running dual-stack (IPv4/IPv6). However, the IPv4 internet is still out there, and that means we need a way to communicate with legacy hosts that only have an IPv4 address.

This post describes how we set up this service for our campus. If you’re new to NAT64, you might want to check out the background, software, and basic Jool sections. If you’re familiar with that and want to see what we did, head on down to a description of our setup.

Background on NAT64

Because IPv6 addresses are much larger than IPv4 addresses, it is possible to embed every possible IPv4 address inside a single IPv6 prefix. Thus, IPv6 can be used as the transport for IPv4 traffic, even if a host doesn’t have an IPv4 address.

By choosing a special IPv6 prefix big enough to hold all 2^32 IPv4 addresses (a /96 or shorter prefix, such as the well-known 64:ff9b::/96), we can make it a target for IPv4-embedded traffic. Then, we forward all this traffic to a special machine: a NAT64 gateway. This machine extracts the embedded IPv4 destination address, converts the packet to IPv4, and sends it to its intended destination. The gateway needs to be dual-stack, but it can provide its service to an entire network of IPv6-only hosts.
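
For example, using the well-known prefix 64:ff9b::/96, the IPv4 address 192.0.2.1 is embedded as 64:ff9b::c000:201 (the four octets become the last 32 bits of the v6 address). A quick shell illustration of the mapping:

# Embed 192.0.2.1 in the well-known NAT64 prefix: the octets c0.00.02.01
# become the final 32 bits of the IPv6 address.
printf '64:ff9b::%02x%02x:%02x%02x\n' 192 0 2 1
# prints 64:ff9b::c000:0201 (the same address as 64:ff9b::c000:201)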

There’s one other important step in this process: in order to get the v6 hosts to use these embedded addresses, we need DNS64. This is a special DNS resolver that synthesizes AAAA answers containing the embedded addresses when a name only has an IPv4 (A) record. I won’t get into the details of that here, but many DNS servers have the ability to perform this DNS64 operation, including large public resolvers.
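
For example (using Google’s public DNS64 resolver here; any DNS64-capable resolver behaves similarly), asking for the AAAA record of a v4-only name returns a synthesized answer inside the NAT64 prefix:

# ipv4only.arpa has only A records (192.0.0.170/171); a DNS64 resolver
# synthesizes AAAA answers with those addresses embedded in its prefix.
dig +short AAAA ipv4only.arpa @2001:4860:4860::6464
# expect something like: 64:ff9b::c000:aa and 64:ff9b::c000:ab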

Once this is all set up, the IPv6 clients “just work”. They query DNS and get back AAAA (v6) records, which contain the embedded v4 address but are valid v6 addresses. The v6 host sends packets to these addresses, which are routed to the NAT64 gateway and transparently translated and forwarded to their v4 destination. The clients don’t need to know that any translation is taking place, and can communicate using only IPv6.

Though NAT64 requires no specific client support, some clients will detect when it is enabled (for example, by looking up ipv4only.arpa to discover the NAT64 prefix) and provide additional functionality, such as a local CLAT (more on that below).

NAT64 Gateway Software

We are fans of OpenBSD’s PF, which does support an af-to operator to translate between address families. After a little messing around, we were able to get basic NAT64 up and running with one line of config:

pass in on ix1 inet6 to 64:ff9b::/96 af-to inet from 192.0.2.1

Unfortunately, the reverse direction (aka NAT46) wasn’t quite what we were looking for. We wanted to be able to accept inbound IPv4 connections and pass them to an IPv6-only server. OpenBSD does this, but it masks the source address (relying on PF’s state table to track connections). For outbound connections, this is fine, but we wanted to embed the source v4 address in the NATed address so a host would know where it was coming from (for logging and other analysis). Yes, the firewall could log this, but that meant an extra step instead of just looking on the server itself.
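
For comparison, an inbound NAT46 rule in PF looks roughly like this (addresses are made up); note that the connection arrives at the v6 server with a fixed source (2001:db8::1 here), which is the masking behavior described above:

# Accept v4 connections to 192.0.2.80 and translate them to an
# IPv6-only server; the original v4 source is hidden behind 2001:db8::1.
pass in on ix1 inet from any to 192.0.2.80 af-to inet6 from 2001:db8::1 to 2001:db8::80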

So we decided to try Jool on Linux. It was a bit more work to set up and get going, but once we got into it we found that it met our requirements.

Basic Jool

In a nutshell, Jool needs two bits of configuration to get up and running. First, you have to configure Jool itself, either using command-line tools or a JSON config file. Second, you must route packets to Jool using some netfilter/iptables rules.

Jool makes a distinction between stateless (SIIT) translation, which is intended for 1:1 NAT, and stateful translation, where a pool of addresses is shared using different port numbers. Each of these translators is loaded as its own kernel module and configured with its own top-level commands/configs.
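
As a minimal sketch of what that looks like (the instance name, interface hand-off, and pool addresses below are placeholders, not our production config), a stateful NAT64 instance in Jool’s iptables mode comes together roughly like this:

# Load the stateful NAT64 module (stateless SIIT lives in jool_siit).
modprobe jool

# Create an instance: the pool6 prefix receives the v4-embedded traffic,
# and pool4 is the shared set of v4 addresses/ports for outbound flows.
jool instance add "example" --iptables --pool6 64:ff9b::/96
jool -i "example" pool4 add 192.0.2.1 61001-65535 --tcp
jool -i "example" pool4 add 192.0.2.1 61001-65535 --udp

# Hand matching packets to the instance from netfilter, in both directions.
ip6tables -t mangle -A PREROUTING -d 64:ff9b::/96 -j JOOL --instance "example"
iptables  -t mangle -A PREROUTING -d 192.0.2.1    -j JOOL --instance "example"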

The Jool documentation generally does a good job of explaining the topics, and has a good run-through of how NAT64 operates (with examples and pictures). The organization has some rough spots (for example, the session synchronization instructions are buried under “Other Sample Runs”), but if you go through all the docs you can find what you’re looking for.

Our Jool Setup

We committed to Jool and decided to set up a high-availability cluster of two NAT64 gateways with the option to horizontally scale to additional machines. Once rolled out, this will be a vital service (providing all connectivity to the v4 internet), so we wanted to make sure it had both throughput and fault tolerance.

We have a git project that contains detailed information for setting up our Jool cluster. While Suffield-specific, it does serve as a fully-worked example if you want to see how we did it.

I wanted to call out a few features of our setup as they aren’t part of the standard build. Most of these are independent of each other, so you can try out the ones that sound interesting to you.

Separate Management and Data Planes

To make the routing rules as simple as possible, we sequester the NAT64 interfaces from the rest of the system. We use Linux’s network namespaces to accomplish this; using ip netns we can create namespaces that are logically separate from one another and do not share routing information.

Jool explicitly supports running in multiple namespaces, so this helps us cleanly separate traffic. It also allows us to run multiple instances.
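
A rough sketch of the idea (namespace, interface, and address names are placeholders): carve out a namespace for the data plane, move the NAT64-facing interface into it, and create the Jool instance there.

# Create a namespace for the NAT64 data plane; the management interface
# stays in the default namespace with its own routing table.
ip netns add nat64
ip link set eth1 netns nat64

# Address and route the data-plane interface inside the namespace only.
ip netns exec nat64 ip link set eth1 up
ip netns exec nat64 ip addr add 2001:db8:64::2/64 dev eth1
ip netns exec nat64 ip route add default via 2001:db8:64::1

# Jool instances created here are invisible to the default namespace.
ip netns exec nat64 jool instance add "example" --iptables --pool6 64:ff9b::/96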

Multiple Instances

We’re performing a type of poor-man’s load balancing where we direct traffic to one of two physical boxes based on a simple route preference. However, because we also want failover, both boxes need to be configured to handle all the traffic in the event that the other box goes down. To ensure that traffic is sticky to one host or the other, we logically separate the instances so they are independent of each other and use different pools of v4 addresses.

Jool supports multiple instances (distinct from the namespace support above), and we use that feature to run several Jool instances, each with its own pool of v4 addresses. We then use iptables rules to route specific traffic to a specific instance for processing.

In our case, we’ve divided the internet in half; packets destined to IPs 0.0.0.0 through 127.255.255.255 go to one instance (“lower”), and traffic destined to IPs 128.0.0.0 through 255.255.255.255 goes to another (“upper”). Meanwhile, SIIT traffic goes to a special third instance that only handles stateless traffic. Obviously, this type of load balancing isn’t guaranteed to be evenly distributed, but it’s easy to accomplish with static rules and we can always adjust it later if we find hot spots.
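
Continuing the sketch from the Basic Jool section, the split is easy to express in ip6tables because the two halves of the v4 internet map onto the two /97 halves of the NAT64 prefix (the pool4 addresses for each instance are placeholders):

# The first bit of the embedded v4 address picks the instance:
# 0.0.0.0/1 lands in 64:ff9b::/97, 128.0.0.0/1 in 64:ff9b::8000:0/97.
ip6tables -t mangle -A PREROUTING -d 64:ff9b::/97       -j JOOL --instance "lower"
ip6tables -t mangle -A PREROUTING -d 64:ff9b::8000:0/97 -j JOOL --instance "upper"

# Return (v4) traffic is matched on each instance's own pool4 address.
iptables -t mangle -A PREROUTING -d 192.0.2.10 -j JOOL --instance "lower"
iptables -t mangle -A PREROUTING -d 192.0.2.11 -j JOOL --instance "upper"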

HSR

(Note: we are no longer deploying HSR with Jool. Jool uses multicast packets for session sync, and on HSR that causes multiple copies of a packet to arrive on the host (one from each side of the ring). That adds overhead and also makes Jool’s error counters go a bit crazy (it doesn’t like the duplicates). I’m leaving the description below, but be aware of these issues.)

Because we’re synchronizing states between gateways, we need a reliable out-of-band network to exchange state data.

We use a special Ethernet protocol called High-availability Seamless Redundancy, or HSR. In short, this protocol is built for “ring” networks, where a redundant path between nodes always exists. Unlike spanning tree, both directions around the ring are active at all times, and the hosts must drop duplicate frames. This speeds recovery time as ports do not need to wait to unblock when there is a failure.

Additionally, this makes scaling up a cluster of machines easier. Each node needs two links, one to each of its neighboring nodes. No central switch is needed (eliminating a failure point), and we can build as large a topology as we need by inserting more nodes into the ring. If you’ve ever stacked network switches, this should sound familiar, as stacks typically use a ring for their inter-chassis link.

HSR is not in the Debian kernel by default, but is relatively easy to build and load.
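
Once the module is built and loaded, the ring interface itself is just a couple of iproute2 commands (the interface names and the sync-network address below are placeholders):

# Load the HSR module (not enabled in Debian's stock kernel config).
modprobe hsr

# Join the two ring ports into one logical HSR interface.
ip link add name hsr0 type hsr slave1 eth2 slave2 eth3
ip link set eth2 up
ip link set eth3 up
ip link set hsr0 up

# Address used only for session-sync traffic between the gateways.
ip addr add 192.168.64.2/24 dev hsr0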

BFD

We use static routes to forward traffic to the NAT64 gateways. To assist with failover, we use Bidirectional Forwarding Detection (BFD), a simple “hello” protocol that helps the router detect failures quickly. On Linux, Free Range Routing (FRR) speaks BFD, and we can use systemd dependencies to ensure that BFD stops answering (taking the route down on the router) if Jool goes down.
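
Here’s a rough sketch of the FRR side (the peer address and interface name are placeholders for the upstream router, which has a matching BFD-tracked static route pointing at the gateway); bfdd must also be enabled in /etc/frr/daemons:

! /etc/frr/frr.conf (fragment)
! bfdd answers the router's hellos; if FRR is stopped (e.g. via a systemd
! dependency on the Jool service), the router's BFD session drops and it
! withdraws its static route toward this gateway.
bfd
 peer 2001:db8:64::1 interface eth0
  no shutdown
 !
!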

Tooling

As you can imagine, the features above add a fair amount of complexity to the configuration. Our git project contains scripts and systemd templates to help automate the process of getting everything configured and launched.

We also wrote a quick sa-jool Perl script that lets you run regular jool commands against all the instances so you don’t have to type everything multiple times. It also has some quick health checks so you don’t have to memorize the entire architecture to see if everything is running correctly.

Final Thoughts

I’ve been testing the NAT64 setup for several months now. I’ve manually set my laptop to use a DNS64 server, and completely disabled IPv4 on my machine. From the NAT64 perspective, things have been going very well and the boxes have performed nicely.

The testing process has had some bumps (as I quickly discovered which hosts on my local network were v4-only), but that’s certainly not NAT64’s fault.

More recently I found out that macOS automatically enables CLAT (client-side IPv4 translation) functionality when NAT64 is detected, allowing a machine that only has IPv6 connectivity to still connect to IPv4 literal addresses. So far, things are looking promising for deprecating IPv4!