Last updated 2016/03/22
Like many networks, we use a private IPv4 space on our LAN selected from the RFC 1918 range. In order to translate between the private address and our public routable addresses, we use a router capable of network address translation (NAT).
We also use this device to perform filtering services (firewall), both in the context of NAT and as protection for internet-facing hosts.
Traditionally, we had our firewalls in an "on-path" deployment, where all traffic from our LAN equipment passed through our firewalls on the way to our ISP router. This made the firewalls a single point of failure and a choke point for all traffic.
This document describes our current setup, using an "off-path" deployment so that our core networking equipment handles all of the traffic and only forwards the traffic that needs to be NATed or filtered to the firewalls. This cuts down on the work the firewalls have to do, and also allows us to deploy redundant firewalls without a lot of effort or expense.
Previously, we used OpenBSD and its PF packet filter. While OpenBSD is still our platform of choice, we found it increasingly difficult to filter high volumes of traffic due to its single-threaded kernel architecture. We've since moved to FreeBSD, which still provides the basic features we used (including PF), but adds a multiprocessor-aware kernel.
Suffield has a primary connection to the internet through the Connecticut Education Network (CEN). We are provided a /23 of routed addresses from CEN. Suffield's internal LAN uses RFC 1918 private addressing, and all of our machines live in this space.
We have two firewall machines: Ash and Hook. Each has its own 10-gigabit connection to our core switch, and a crossover connection to the other machine.
The crossover connection allows the machines to share firewall state (via pfsync). Otherwise, the two machines are independent and do not rely on each other. There is no "active" or "master" machine; both firewalls process traffic when they are functioning.
The core switch is configured to route traffic to both firewalls. It uses Bi-directional Forwarding Detection (BFD) to determine if a firewall host is "up" and ready to process traffic. We run a BFD daemon on each firewall, and by signalling it up or down we can take the firewall in and out of service without reconfiguring the core.
FreeBSD runs on most off-the-shelf hardware, and has good driver support for most decent hardware.
As these machines are firewalls, we want the best packet handling abilities we can get. With modern 10-gigabit cards and multi-core CPUs, the system should spread packet processing out over the cores. Thus, go with a well-supported NIC and a multi-core machine. We use three NICs total: one for management, one for pfsync, and one for "transit" (PF filtering). The transit interface handles the bulk of the traffic, and so is a 10Gb card. The others handle a relatively low amount of traffic, so we use the onboard 1Gb cards.
Hard drive space is almost negligible (less than 3GB for the OS install, plus space for swap). PF keeps everything locked in RAM, so speed of the drive is not a huge issue. We did not opt for RAID, because we have two fully redundant machines (which would make a redundant drive, well, redundant).
Certain BIOS options are necessary to get the best performance out of the system for high packet rates.
Turn off processor frequency scaling (speedstep), and disable enhanced halt state (C1E). These options save power on lightly-loaded systems, but the additional ramp-up/wake time kills latency. In our BIOS, the options (and their correct values) were labelled as:
C1E: Disable
C6 state: Disable
Speedstep: Disable
Similarly, hyperthreading can cause slowdowns if the OS schedules similar high-rate threads to the same core, so disable it. However, be sure to leave standard SMP enabled (you want as many cores as possible, just not hyperthreading).
Ensure PCIe is configured correctly (PCIe Gen1/2) and the card is in the correct type of slot (number of lanes).
After you've booted the system, you can confirm which cards are installed using:
pciconf -lc
Also, some cards (such as the Myricom 10Gb cards we use) will report their configured state via sysctl:
sysctl dev.mxge | grep pcie_link_width
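The link width check can also be scripted. The sketch below assumes sysctl prints lines of the form "dev.mxge.0.pcie_link_width: 8"; the function name check_link_width and the default of 8 lanes are assumptions for illustration, so substitute the lane count your card is specified for.

```shell
#!/bin/sh
# Read "name: value" lines (as printed by sysctl) on stdin and report any
# pcie_link_width entry that differs from the expected lane count.
# The default of 8 lanes is an assumption; pass your card's real value.
check_link_width() {
    expected=${1:-8}
    awk -v want="$expected" '
        /pcie_link_width/ {
            seen = 1
            if ($2 != want) { print "MISMATCH " $0; bad = 1 }
        }
        END {
            if (!seen) { print "no pcie_link_width entries found"; exit 2 }
            exit bad
        }'
}

# Example (on the firewall itself):
#   sysctl dev.mxge | check_link_width 8
```

The function prints nothing and exits 0 when every reported width matches, which makes it easy to call from a post-install check.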
Other performance tweaks include disabling Large Receive Offload (LRO) and TCP Segmentation Offload (TSO), but we do this in software when we configure the NIC; there is no BIOS setting for these items.
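As a rough illustration of the software side, FreeBSD's ifconfig accepts -lro and -tso to turn these offloads off on an interface. The helper name disable_offloads and the IFCONFIG override are assumptions made for this sketch; transit0 is the interface name used in the monitoring commands later in this document.

```shell
#!/bin/sh
# Turn off LRO and TSO on one interface. IFCONFIG can be overridden
# (for example with "echo") to preview the command without running it.
IFCONFIG=${IFCONFIG:-ifconfig}

disable_offloads() {
    "$IFCONFIG" "$1" -lro -tso
}

# Example: disable_offloads transit0
```

To make the setting persistent, the same flags would normally go into the interface's configuration line rather than being run by hand.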
Grab an ISO or USB boot stick from:
https://www.freebsd.org/where.html
There aren't a lot of options for the installation, so follow along with the prompts. A few notes:
Install ports so you have access to additional software. You don't need games, X11, or other software.
Once the install is complete, you should be logged in as root and able to reach the internet from the machine.
Before configuring the box, we'll need to download a few other utilities that we use on the machine. The pkg utility can search for and install these programs. Note that the first time you run it, pkg will bootstrap some of the ports system onto the box for later use.
You must install one package (subversion) manually. Then you can check out our script repository and use it to bootstrap the rest of the packages.
First, install subversion:
pkg install -y subversion
Next, check out the files for this host from version control using subversion:
svn checkout \
    svn://svn.suffieldacademy.org/netadmin/trunk/servers/oaf \
    /usr/local/suffield
Now, run the bootstrap script to get the rest of the necessary packages:
/usr/local/suffield/bin/bootstrap-packages
We have several configuration files that define interfaces on the machine, sysctl settings, bootloader tunables, etc. All are kept in version control. You've already checked out the version control directory in the previous section, under:
/usr/local/suffield
All of the configuration files are kept under that directory. Next, the configs must be linked into the correct places in the filesystem (for example, /boot or /etc). We have a script that automates this process:
/usr/local/suffield/bin/bootstrap-config
For our Myricom cards, go fetch the FreeBSD Myricom Toolkit:
https://www.myricom.com/support/downloads/myri-10g-toolkit.html
You can run the check to make sure the firmware is up-to-date, and upgrade if necessary.
Our Juniper QFX ships with hardware flow control disabled by default, but FreeBSD enables flow control by default. To enable it in JUNOS, try:
set interfaces xe-0/0/49 ether-options flow-control
In our testing, this didn't improve forwarding directly, but it did allow the firewall to pause inbound traffic when it was overwhelmed. This meant that instead of accepting (say) 900kpps and only forwarding 200kpps, it would only accept (and forward) 200kpps. This kept the kernel from becoming completely overwhelmed in a DoS situation (packets are still dropped, but by the switch instead of the kernel).
At this point, all the configuration files should be in place for a complete install (including those for our firewall ruleset). Reboot the server to enable all the new interfaces and startup config options:
shutdown -r now
In order for traffic to get to the firewalls, the core switch must be configured to forward packets to them.
In a basic firewall setup, one typically makes the firewall the next-hop (default gateway) from the LAN switch. All traffic not destined for the LAN automatically flows to the firewall, where it NATs and passes the traffic.
In our setup, we've set up multiple routes to our multiple firewalls. The core switch can select which firewall to forward packets to, including load-balancing between them.
This section describes the steps necessary to configure Juniper switches to correctly route packets to the firewalls, and also discusses setting up BFD for link failure detection.
Internet routing is usually done based on the destination address of a packet (locating the next-hop). In our case, we're going to route some packets based on other information (like the source address). This goes by a few names: Juniper calls it filter-based forwarding (FBF) and Cisco calls it policy-based routing (PBR). As we use a Juniper QFX5100 for our core, we'll only cover configuration for the Juniper QFX platform.
The first step is to set up a virtual routing instance where we can apply non-default routing rules.
jhealy@qfx> show configuration routing-instances wan-oaf
description "One-Armed Firewall (OAF) VRF for off-path firewalling";
instance-type virtual-router;
interface xe-0/9/1.0; # interface of first firewall
interface xe-0/9/2.0; # interface of second firewall
routing-options {
    rib wan-oaf.inet.0 {
        static {
            route 0.0.0.0/0 {
                qualified-next-hop 192.168.9.1; # first firewall
                qualified-next-hop 192.168.9.2; # second firewall
                bfd-liveness-detection {
                    minimum-interval 500;
                }
            }
        }
    }
    rib wan-oaf.inet6.0 {
        ## Note, as of 2016-03-25, Juniper does NOT
        ## support FBF in IPv6 on the QFX!!
        static {
            route ::/0 {
                qualified-next-hop 2001:db8:9::1; # first firewall
                qualified-next-hop 2001:db8:9::2; # second firewall
                bfd-liveness-detection {
                    minimum-interval 500;
                }
            }
        }
    }
}
When packets are sent to the routing instance above, they use the routing tables defined in the instance to forward the packets. In this case, the tables say to forward all packets to the firewalls.
The next piece of the puzzle is to get packets into this routing instance so they will get shunted to the firewalls. For this, you must use a firewall filter (hence, "filter-based forwarding"):
filter shunt-to-firewalls {
    term transit-oaf-needed {
        then {
            routing-instance wan-oaf;
        }
    }
}
That's the simplest rule possible, and could be applied on an inbound interface from your LAN. It would then turn around and dump all packets into the wan-oaf instance. In practice, you'd want to add extra firewall terms so some packets don't get sent to the firewall. For example, if you had hosts that didn't need NAT or firewall processing, you could match and accept them before the rule that sends the packets to the routing instance. In this way, you can exempt some traffic.
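As a sketch of the exemption idea, an extra term placed ahead of the shunt term can accept matching traffic so that it never reaches the wan-oaf instance. The term name exempt-direct and the 192.0.2.0/24 prefix are hypothetical placeholders, not values from our configuration, and this fragment has not been tested on a QFX.

```
filter shunt-to-firewalls {
    /* hypothetical: hosts that need neither NAT nor filtering */
    term exempt-direct {
        from {
            source-address {
                192.0.2.0/24;
            }
        }
        then accept;
    }
    /* everything else is handed to the firewalls */
    term transit-oaf-needed {
        then {
            routing-instance wan-oaf;
        }
    }
}
```

Terms are evaluated in order, so the exemption only works if it appears before the term that sends packets to the routing instance.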
You'll need a similar rule that matches traffic inbound from the WAN and sends the packets to the firewalls for un-NAT and/or state checking.
Finally, you'll need a special rule applied to the firewall interfaces themselves that takes the packets and puts them in another routing instance. The reason for this is the wan-oaf routing table simply says to forward everything to the firewall. Once the packets leave the firewall, we don't want to forward them right back to the firewall again, so they need a way to escape back to standard processing. If you have a separate instance for WAN or LAN, you can forward the packets directly to those instances so they can resume normal processing.
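A filter for the firewall-facing interfaces might look roughly like the following. This is an untested sketch: the filter name return-from-firewalls, the instance names lan and wan, and the 10.0.0.0/8 prefix are all hypothetical and would need to match your own routing instances and addressing.

```
filter return-from-firewalls {
    /* hypothetical: traffic headed back to internal hosts */
    term to-lan {
        from {
            destination-address {
                10.0.0.0/8;
            }
        }
        then {
            routing-instance lan;
        }
    }
    /* everything else resumes normal routing toward the WAN */
    term to-wan {
        then {
            routing-instance wan;
        }
    }
}
```

The filter would then be applied as an input filter on each firewall-facing interface, for example with "set interfaces xe-0/9/1 unit 0 family inet filter input return-from-firewalls".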
To further assist with route selection, we use Bi-directional Forwarding Detection (BFD) on the core switch and firewalls to determine link status. The core switch and firewalls each establish a BFD session and exchange regular messages. If the firewall is no longer available, or if it administratively suspends the BFD session, then the core switch no longer considers it a viable route and doesn't send packets to it.
In this way, we can take a firewall offline simply by suspending its BFD session. We can do this for routine maintenance, config changes, or any other situation.
A full discussion of BFD is beyond the scope of this document. Briefly, it is a simple protocol that exchanges state messages on a regular basis between two routers. If too many messages are lost in a short period of time, the link between the routers is presumed to be down. Additionally, a router can signal that the path should not be used by sending special values for its state.
Juniper supports BFD natively, as shown in the configuration above. For FreeBSD, we must install a program that participates in the BFD protocol.
We run a modified copy of OpenBFDD. It's the stock version with some patches applied to allow it to compile under FreeBSD 10. This daemon implements basic sessions with other BFD equipment (in this case, our Juniper core switch). We've also created a FreeBSD rc script to assist us in running OpenBFDD.
We've configured OpenBFDD to talk to our core and establish a session. When the firewall process is active, we activate the BFD session. When PF is not running, we terminate the session so the core knows not to send us packets.
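The coupling between PF and the BFD session could be scripted along these lines. This is a minimal sketch, not the script we run: it assumes that "pfctl -si" begins its output with a line starting "Status: Enabled" when PF is running, and the function name bfd_action_for_pf is made up for illustration. The up and down actions are the ones provided by our rc script.

```shell
#!/bin/sh
# Map PF's status line (from "pfctl -si") to an openbfdd rc action.
# Prints "up" when PF reports itself enabled, "down" otherwise
# (including when there is no output at all).
bfd_action_for_pf() {
    if grep -q '^Status: Enabled'; then
        echo up
    else
        echo down
    fi
}

# Example (as root, on the firewall):
#   /usr/local/etc/rc.d/openbfdd "$(pfctl -si 2>/dev/null | bfd_action_for_pf)"
```

Defaulting to "down" when the status cannot be read errs on the side of keeping traffic away from a firewall in an unknown state.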
We've created init scripts for FreeBSD to control OpenBFDD easily. This section describes some of the typical uses of the command.
The first thing you can do is dump out the configuration for the daemon on this machine:
/usr/local/etc/rc.d/openbfdd check
That will list all of the rc.d variables that are set for openbfdd. We allow multiple instances of bfdd to run at the same time, so the variables are organized by instance. Each instance has a control port and address that it listens on for commands (loopback by default). It also has one or more listeners attached to live interfaces that form the endpoint of a session with the other routing equipment.
To see the states of any configured sessions, run:
/usr/local/etc/rc.d/openbfdd control -- status
That will list the status of all active sessions. If you don't see any, it's possible the daemon hasn't started correctly, or the session was lost. To re-establish the sessions, restart the daemon:
/usr/local/etc/rc.d/openbfdd restart
By default, we've configured our sessions to start and then immediately enter the down state. The sessions are active (we're exchanging state messages with the routing equipment), but we're basically telling everyone not to send traffic to us. Once a machine is ready to process traffic (the firewall is configured and running), you can bring up the sessions with:
/usr/local/etc/rc.d/openbfdd up
Conversely, if you need to take a firewall offline (configuration changes), you can put the sessions into a down state:
/usr/local/etc/rc.d/openbfdd down
A status run will confirm the state of the session. Additionally, the routing equipment should confirm the change in state as well. For example, here's the output from the Juniper QFX where the second firewall is in the "down" state:
jhealy@qfx> show bfd session
                                                  Detect   Transmit
Address                  State     Interface      Time     Interval  Multiplier
192.168.9.1              Up        xe-0/9/1.0     3.000     1.000        3
192.168.9.2              Init      xe-0/9/2.0     6.000     2.000        3

2 sessions, 2 clients
Cumulative transmit rate 2.0 pps, cumulative receive rate 2.0 pps
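In the output above, the Detect Time column matches the transmit interval multiplied by the multiplier. The sketch below reproduces that arithmetic in milliseconds. It is a simplification: RFC 5880 derives the detection time from the remote system's multiplier and the negotiated receive interval, whereas this hypothetical helper (bfd_detect_ms) simply takes both numbers as arguments.

```shell
#!/bin/sh
# Detection time in milliseconds: interval times multiplier.
# Simplified; both values are passed in rather than negotiated.
bfd_detect_ms() {
    echo $(( $1 * $2 ))
}

bfd_detect_ms 1000 3   # first session above: prints 3000
bfd_detect_ms 2000 3   # second session above: prints 6000
bfd_detect_ms 500 3    # a 500 ms interval with multiplier 3: prints 1500
```

The last line shows why the interval matters: with a 500 ms interval and a multiplier of 3, a dead firewall would be noticed after about a second and a half.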
A final note: there is an admindown state as well. Per RFC 5880:
AdminDown state means that the session is being held administratively down. This causes the remote system to enter Down state, and remain there until the local system exits AdminDown state. AdminDown state has no semantic implications for the availability of the forwarding path.
That last bit is important: admin down does not affect the availability of the forwarding path. In other words, if you use this state, the other side may continue to send traffic to you. You're basically saying that BFD is going to be shut off for a bit, but not to change the actual forwarding path. You could use this for maintenance (to the BFD daemon) but don't use it if you want to affect route availability!
Using the examples from the section above, these are the normal set of operations for using OpenBFDD with PF.
Assuming the firewall is processing live traffic and you'd like to make a change to the firewall configuration, take the following steps.
First, put the BFD sessions into the down state (the regular down state, not AdminDown):
/usr/local/etc/rc.d/openbfdd down
Confirm this on the routing equipment or on the local machine:
/usr/local/etc/rc.d/openbfdd control -- status
You should see the sessions as connected but down. This means no traffic should be flowing to the firewall.
Make any necessary configuration changes on the firewall, typically by checking them out from revision control:
cd /usr/local/suffield
svn up
Confirm that the firewall ruleset is sane:
pfctl -nf /usr/local/suffield/common/etc/pf.conf
If there are no errors, you can load the new ruleset:
pfctl -ef /usr/local/suffield/common/etc/pf.conf
Finally, you're ready to accept traffic from the routing equipment again. Bring the BFD sessions back up:
/usr/local/etc/rc.d/openbfdd up
Traffic should begin flowing through the firewall again.
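The steps above could be wrapped in a single script so that a failed syntax check leaves the BFD sessions down instead of loading a broken ruleset. This is only a sketch: the function name reload_firewall, the variable names, and the RUN dry-run switch are assumptions made for illustration. It also splits "pfctl -ef" into separate load and enable steps, because "pfctl -e" may report an error when PF is already enabled and that should not abort the sequence.

```shell
#!/bin/sh
# Sketch of the maintenance sequence described above.
BFD_RC=/usr/local/etc/rc.d/openbfdd
SUFFIELD_DIR=/usr/local/suffield
PF_CONF=$SUFFIELD_DIR/common/etc/pf.conf

# Set RUN=echo to print the commands instead of executing them.
RUN=${RUN:-}

reload_firewall() {
    # Stop the core from sending us traffic.
    $RUN "$BFD_RC" down || return 1
    # Fetch the latest configuration from version control.
    $RUN svn update "$SUFFIELD_DIR" || return 1
    # Parse the ruleset without loading it; on errors, stop here so the
    # BFD sessions stay down.
    $RUN pfctl -nf "$PF_CONF" || return 1
    # Load the ruleset, then make sure PF is enabled. The enable step may
    # fail harmlessly if PF is already running, so its status is ignored.
    $RUN pfctl -f "$PF_CONF" || return 1
    $RUN pfctl -e || true
    # Tell the core we are ready for traffic again.
    $RUN "$BFD_RC" up
}

# Example dry run:  RUN=echo; reload_firewall
```

Running it first with RUN=echo shows the exact commands that would be executed, which is a cheap way to review the sequence before touching a live firewall.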
http://bsdrp.net/documentation/technical_docs/performance
http://www.brendangregg.com/USEmethod/use-freebsd.html
# show CPU load per-cpu (so you can tell if RSS/queues are working)
top -P
# show interrupt stats per irq
vmstat -i
systat -vmstat
# show network traffic
systat -ifstat
netstat -I transit0 1
Using a fast multi-core Linux box, we were able to generate 1Gb/s of traffic to test our FreeBSD boxes. Note that firewalls tend to show stress based on the number of packets per second (instead of the number of bits per second). The examples below show how to test with small packet sizes (the 192.168.254.1 address is the host to send the packets to; substitute a real hostname):
# send 4 64-byte test packets
mausezahn -B 192.168.254.1 -t udp 'sp=5001-5002,dp=5001-5002' \
    -P "paddedtestingasciidata"
# send a boatload of same UDP packet
mausezahn -c 62000000 -B 192.168.254.1 -t udp 'sp=49152,dp=5001' \
    -P "paddedtestingasciidata"
# send a boatload of a few different types (to exercise hashed routes)
mausezahn -c 1000000 -B 192.168.254.1 -t udp 'sp=49152-49160,dp=5001-5009' \
    -P "paddedtestingasciidata"
# send a flood of unique src/dst UDP ports (stress test state creation on FW)
mausezahn -B 192.168.254.1 -t udp 'sp=1024-65535,dp=5001-6200' \
    -P "paddedtestingasciidata"
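To see why small packets are the harder test, it helps to work out the theoretical packet rate for a given link speed. The sketch below uses standard Ethernet framing arithmetic (each frame also occupies 20 bytes of preamble and inter-frame gap on the wire); the helper name max_pps is made up for illustration.

```shell
#!/bin/sh
# Theoretical maximum Ethernet frames per second for a given line rate.
# Each frame occupies its own bytes plus 20 bytes of preamble and
# inter-frame gap on the wire.
max_pps() {
    bits_per_sec=$1
    frame_bytes=$2
    echo $(( bits_per_sec / ((frame_bytes + 20) * 8) ))
}

max_pps 1000000000 64     # minimum-size frames at 1 Gb/s: prints 1488095
max_pps 1000000000 1518   # full-size frames at 1 Gb/s: prints 81274
```

The same 1 Gb/s link carries roughly 1.49 million minimum-size frames per second but only about 81 thousand full-size ones, so a generator sending small packets exercises the firewall's per-packet costs far more heavily.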