Ubiquiti UDM running FRR BGP

BGP is the routing protocol of the Internet, and using it locally in your own network for things like DNS and other stateless service redundancy is fantastic.

I’ve successfully configured my Ubiquiti UDM Pro to utilise Border Gateway Protocol for internal high availability of AnyCast DNS along with reverse proxy communication.

BGP is wonderful for this job, as DNS clients do a horrible job of failing over to secondary servers, and clustering reverse proxies with things like "heartbeat" adds significant complexity and monitoring requirements on the host side.

I guess a routing protocol like OSPF could also be used instead of BGP, although I've not tried it. Something like http://ddiguru.com/blog/anycast-dns-part-4-using-ospf-basic

I stuck with an old USG Pro 4 for ages because it ran Edge-OS, which included the ability to use a custom config.gateway.json on the Unifi Controller site to define BGP and neighbors. But I wanted a UDM Pro so that I could enable IDS/IPS and get full line-rate gigabit to my ISP, and not be limited to around 300Mbps when doing this with the USG Pro 4. I didn’t upgrade until I was sure that I could get BGP going on the UDM.

As a side note, migrating from the USG to the UDM was not a totally smooth process. It wouldn't connect to my ISP, Aussie Broadband, at first, but a quick Google told me to "kick" the existing connection in the Aussie BB portal. Magic happened. Migration was as simple as backing up the existing Unifi Controller and restoring it on the UDM. I did need to fiddle with a few things, like opening every port forwarding rule and clicking "save" to get them to work, but all the rest of my network/VLAN/labels/etc. came across smoothly. Make sure that your UDM network application version is the same as or greater than the existing controller's.

You will find much hate on the Internet directed at Ubiquiti over the UDM. Most of it comes from a lack of understanding, and a lack of available detail, so read up and get smarter. I think the device is bloody brilliant.

Getting BGP working on the UDM is pretty straightforward. It requires the creation of a podman container, which for the uninitiated can be daunting. Luckily for us, others have paved the way on forums, and I was able to follow their excellent work. Light bulbs came from this thread, particularly when I was running a previous UDM firmware release: https://github.com/boostchicken-dev/udm-utilities/issues/289 On earlier firmware this required a custom kernel in the unifi-os container, which I've described down below if anyone is interested or needs it.

The "other side" of the BGP neighbor configuration is not covered exhaustively in this post, but in short: I have three FreeIPA DNS servers whose daemons listen on a loopback address, 192.168.9.9, which FRR advertises and routes traffic to, and three HAProxy reverse proxies doing the same for 192.168.9.10. A sample configuration is at the end of this post.

Usual Internet caveats apply. This worked for me, but if you break your router it's on you. A factory reset should get it going again, so make sure you have a backup of your UDM.

This all requires you to SSH to the UDM. To enable it, go to the web interface, select Settings, then Advanced, enable the SSH slider and choose a password. The user name is root. Then SSH in using your favourite tool.

Install an FRR container using podman

The UDM utilises podman to separate Unifi applications and make them easy to install, and we exploit this to install FRR on the router.

To do so, install boostchicken’s excellent on-boot-script, which will execute shell scripts when the UDM starts, enabling FRR to install and run. See https://github.com/boostchicken-dev/udm-utilities/blob/master/on-boot-script/README.md for details.

curl -fsL "https://raw.githubusercontent.com/boostchicken-dev/udm-utilities/HEAD/on-boot-script/remote_install.sh" | /bin/sh

Create the configuration script /mnt/data/on_boot.d/10-onboot-frr.sh for on-boot to execute. On-boot runs all scripts in the on_boot.d folder in alphabetical order.

#!/usr/bin/env sh

DEBUG=${DEBUG:--d}   # "-d" runs the container detached; set DEBUG="" to stay in the foreground when troubleshooting
CONTAINER_NAME="frr"

if podman container exists ${CONTAINER_NAME}; then
  podman start ${CONTAINER_NAME}
else
  podman run --mount="type=bind,source=/mnt/data/$CONTAINER_NAME,destination=/etc/frr/" \
            --name "$CONTAINER_NAME" \
            --network=host \
            --privileged \
            --restart always \
            $DEBUG \
            docker.io/frrouting/frr:v8.1.0
fi

Don’t forget to set execute permissions on the script.

chmod +x /mnt/data/on_boot.d/10-onboot-frr.sh

Configure FRR

Create the FRR configuration directory for the container, and a blank vtysh.conf configuration file to stop vtysh nagging that it doesn't have one.

mkdir /mnt/data/frr
touch /mnt/data/frr/vtysh.conf

Create the /mnt/data/frr/daemons configuration file, enabling BGP.

zebra=no
bgpd=yes
ospfd=no
ospf6d=no
ripd=no
ripngd=no
isisd=no
pimd=no
ldpd=no
nhrpd=no
eigrpd=no
babeld=no
sharpd=no
staticd=no
pbrd=no
bfdd=no
fabricd=no

#
# If this option is set the /etc/init.d/frr script automatically loads
# the config via "vtysh -b" when the servers are started.
# Check /etc/pam.d/frr if you intend to use "vtysh"!
#
vtysh_enable=yes
zebra_options=" -s 90000000 --daemon -A 127.0.0.1"
bgpd_options="   --daemon -A 127.0.0.1"
ospfd_options="  --daemon -A 127.0.0.1"
ospf6d_options=" --daemon -A ::1"
ripd_options="   --daemon -A 127.0.0.1"
ripngd_options=" --daemon -A ::1"
isisd_options="  --daemon -A 127.0.0.1"
pimd_options="  --daemon -A 127.0.0.1"
ldpd_options="  --daemon -A 127.0.0.1"
nhrpd_options="  --daemon -A 127.0.0.1"
eigrpd_options="  --daemon -A 127.0.0.1"
babeld_options="  --daemon -A 127.0.0.1"
sharpd_options="  --daemon -A 127.0.0.1"
staticd_options="  --daemon -A 127.0.0.1"
pbrd_options="  --daemon -A 127.0.0.1"
bfdd_options="  --daemon -A 127.0.0.1"
fabricd_options="  --daemon -A 127.0.0.1"

Create a configuration file for BGP, located at /mnt/data/frr/bgpd.conf. (Note that my UDM's IP address is 192.168.10.1.) The "maximum-paths 1" line stops FRR from trying to add multi-path routes to the UDM, which it does not support; a simple fail-over is used instead. Leave it out and /var/log/messages on the UDM fills with lines like "Multipath routes not supported, got x nexthops for route".

! -*- bgp -*-
!
hostname $UDMP_HOSTNAME
password zebra
frr defaults traditional
log file stdout
!
router bgp 65510
 bgp ebgp-requires-policy
 bgp router-id 192.168.10.1
 maximum-paths 1
 !
 ! Peer group for DNS
 neighbor DNS peer-group
 neighbor DNS remote-as 65511
 neighbor DNS activate
 neighbor DNS soft-reconfiguration inbound
 neighbor DNS timers 15 45
 neighbor DNS timers connect 15
 !
 ! Peer group for reverse proxy
 neighbor RP peer-group
 neighbor RP remote-as 65512
 neighbor RP activate
 neighbor RP soft-reconfiguration inbound
 neighbor RP timers 15 45
 neighbor RP timers connect 15
 !
 ! Neighbors for DNS
 neighbor 192.168.10.31 peer-group DNS
 neighbor 192.168.10.32 peer-group DNS
 neighbor 192.168.10.33 peer-group DNS
 !
 ! Neighbors for reverse proxy
 neighbor 192.168.10.36 peer-group RP
 neighbor 192.168.10.37 peer-group RP
 neighbor 192.168.10.38 peer-group RP

 address-family ipv4 unicast
  redistribute connected
  !
  neighbor DNS activate
  neighbor DNS route-map ALLOW-ALL in
  neighbor DNS route-map ALLOW-ALL out
  neighbor DNS next-hop-self
  !
  neighbor RP activate
  neighbor RP route-map ALLOW-ALL in
  neighbor RP route-map ALLOW-ALL out
  neighbor RP next-hop-self
 exit-address-family
 !
route-map ALLOW-ALL permit 10
!
line vty
!

Now create and start the FRR container by executing the on-boot script. Subsequent reboots of the UDM will start the container automatically, and running the script now also confirms that the execute permissions are set.

/mnt/data/on_boot.d/10-onboot-frr.sh

Verify that the BGP neighbors are up.

# podman exec frr vtysh -c 'show ip bgp'
BGP table version is 7, local router ID is 192.168.10.1, vrf id 0
Default local pref 100, local AS 65510
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

   Network          Next Hop            Metric LocPrf Weight Path
*> 180.150.52.0/22  0.0.0.0                  0         32768 ?
*> 192.168.8.0/24   0.0.0.0                  0         32768 ?
*> 192.168.9.9/32   192.168.10.31            0             0 65511 i
*                   192.168.10.32            0             0 65511 i
*                   192.168.10.33            0             0 65511 i
*> 192.168.9.10/32  192.168.10.36            0             0 65512 i
*                   192.168.10.37            0             0 65512 i
*                   192.168.10.38            0             0 65512 i
*> 192.168.10.0/24  0.0.0.0                  0         32768 ?
*> 192.168.12.0/24  0.0.0.0                  0         32768 ?

Displayed  6 routes and 10 total paths

And you’re done!

To verify that the routes made it into the kernel routing table, use netstat. (If you don't see the routes, you might need to update your firmware or use a replacement kernel as described below.)

# netstat -ar
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
180.150.52.0    *               255.255.252.0   U         0 0          0 eth8
192.168.8.0     *               255.255.255.0   U         0 0          0 br8
192.168.9.9     192.168.10.31   255.255.255.255 UGH       0 0          0 br0
192.168.9.10    192.168.10.36   255.255.255.255 UGH       0 0          0 br0
192.168.10.0    *               255.255.255.0   U         0 0          0 br0
192.168.12.0    *               255.255.255.0   U         0 0          0 br2

You can then modify your current DHCP/forwarders configuration to point to the new BGP-controlled addresses. To test that the active neighbor transitions, reboot the neighbor box currently holding the route (marked with a ">" in the "show ip bgp" list), then re-run the show command once it is back up.
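If you'd rather script that check than eyeball the table, the best-path next hop can be filtered out of the vtysh output with awk. This is a sketch demonstrated against a canned copy of the sample output above; on the UDM you'd pipe `podman exec frr vtysh -c 'show ip bgp'` into the same awk filter instead.

```shell
# Extract the current best-path ("*>") next hop for the anycast DNS prefix.
# Canned data for demonstration; on the UDM, replace the printf with:
#   podman exec frr vtysh -c 'show ip bgp'
bgp_table='*> 192.168.9.9/32   192.168.10.31            0             0 65511 i
*                   192.168.10.32            0             0 65511 i
*> 192.168.9.10/32  192.168.10.36            0             0 65512 i'

printf '%s\n' "$bgp_table" |
  awk '$1 == "*>" && $2 == "192.168.9.9/32" { print $3 }'   # prints 192.168.10.31
```

Run that in a loop while you reboot the active neighbor and you'll see the next hop flip to one of the surviving servers.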

A network does not need to be defined in the controller for 192.168.9.0/24; routing works regardless because a /32 host address is used.

Sample neighbor configuration

I don't include detailed configuration notes for each of the UDM's neighbors, as everyone's flavour of *nix differs. I'm a Rocky Linux bloke (it's based on Red Hat Enterprise Linux). The FRR config should be cross-platform, but implementing a loopback adapter will definitely differ by platform.

Install FRR on the neighbor and enable the BGP daemon in the /etc/frr/daemons configuration file, then configure BGP and a loopback interface.
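On a Rocky/RHEL-style neighbor the install is typically `dnf install -y frr` followed by `systemctl enable --now frr`, with bgpd switched on in /etc/frr/daemons first. The daemon flip can be scripted with sed; this sketch runs against a scratch copy of the file so it's safe to dry-run (point it at /etc/frr/daemons on the real box):

```shell
# Enable bgpd in a copy of FRR's daemons file (real path: /etc/frr/daemons).
# The dnf/systemctl steps mentioned above are assumptions for RHEL-family distros.
daemons=/tmp/daemons.demo
printf 'zebra=yes\nbgpd=no\nospfd=no\n' > "$daemons"

sed -i 's/^bgpd=no$/bgpd=yes/' "$daemons"

grep '^bgpd=' "$daemons"   # prints bgpd=yes
```

Restart the frr service after editing the file so bgpd actually starts.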

Here is a sample bgpd.conf from one of my DNS servers, with routing to the loopback interface "lo":

!
! Zebra configuration saved from vty
!   2020/10/20 23:24:34
!
frr version 7.4
frr defaults traditional
!
log syslog informational
!ipv6 forwarding
!service integrated-vtysh-config
!
!
router bgp 65511
 bgp ebgp-requires-policy
 bgp router-id 192.168.10.31
 neighbor V4 peer-group
 neighbor V4 remote-as 65510
 neighbor 192.168.10.1 peer-group V4
 !
 address-family ipv4 unicast
  redistribute connected
  neighbor V4 route-map IMPORT in
  neighbor V4 route-map EXPORT out
 exit-address-family
 !
route-map EXPORT deny 100
!
route-map EXPORT permit 1
 match interface lo
 set origin igp
!
route-map IMPORT deny 1
!
line vty
!

And a sample network-scripts loopback configuration, /etc/sysconfig/network-scripts/ifcfg-lobgp.

Adjust to taste for your flavour of *nix.

# Loopback for BGP DNS AnyCast
DEVICE=lo:0
BOOTPROTO=none
BROADCAST=192.168.9.9
IPADDR=192.168.9.9
NETMASK=255.255.255.255
NETWORK=192.168.9.9
ONBOOT=yes

Conclusion

BGP is the routing protocol of the Internet, and using it locally in your own internal network for things like DNS and other stateless service redundancy is excellent: fail-over is lightning fast, and maintenance can usually be done on individual nodes without causing any disruption.

So while it's nowhere near as simple as on previous Ubiquiti routers, being able to get BGP running on the new Dream Machine, thanks to the work of some brilliant open source sharers, is just fantastic.

Cheers.

Extra: Install a custom kernel on the UDM if required

A custom kernel may be required to enable BGP routes to be added to the UDM routing table, depending on your firmware version. The kernel included with my UDM initially wouldn't do this, but as of my current version (7.1.60) it does.

I visited https://github.com/fabianishere/udm-kernel-tools for details and the latest code.

These commands will install fabianishere's kernel tools.

unifi-os shell
cd /tmp
wget https://github.com/fabianishere/udm-kernel-tools/releases/download/v1.1.4/udm-kernel-tools_1.1.4_arm64.deb
apt -y install ./udm-kernel-tools_1.1.4_arm64.deb

Download and install a replacement kernel. Check compatibility for your UDM firmware level. Details are at fabianishere's github.

wget https://github.com/fabianishere/udm-kernel/releases/download/v4.19.152-edge3/udm-kernel-4.19.152-edge3_4.19.152-edge3-1_arm64.deb
apt install ./udm-kernel-4.19.152-edge3_4.19.152-edge3-1_arm64.deb

udm-bootctl list
udm-bootctl boot 4.19.152-edge3

You will lose SSH for a while once you boot the replacement kernel. Don’t panic. You can reconnect in a little while after it boots. Verify that it worked once you're back in:

unifi-os shell
uname -a
> Linux ubnt 4.19.152-edge3 #1 SMP Wed Dec 22 09:14:38 UTC 2021 aarch64 GNU/Linux

The replacement kernel will not automatically load on reboot at this point, so to enable that:

udm-bootctl set-default 4.19.152-edge3
systemctl enable udm-autoboot.service

And to disable auto-boot if needed later:

systemctl disable udm-autoboot.service

The boot time for the UDM will increase, given it switches kernels partway through booting, but it's not appreciably longer, and reboots happen infrequently anyway.

At this point, exit the unifi-os shell using ctrl-D or exit.