High Availability Firewall
This package contains a set of scripts to configure a high-availability firewall. Configured with keepalived, it will provide a failover mechanism between two nodes.
Requirements:
- Two nodes must have the same network devices
- Nodes must be connected to the same LAN
Limitations:
- LAN name must be ‘lan’ in both firewalls
- IPv4 only
- VLANs are supported only on physical interfaces
- Extra packages such as NUT are not supported
- rsyslog configuration is not synced: if you need to send logs to a remote server, you must use the controller
- After the first synchronization, the backup node will have the same hostname as the primary node
The following features are supported:
- Firewall rules, including port forwarding
- DHCP and DNS server
- SSH server (dropbear)
- OpenVPN RoadWarrior and tunnels
- IPsec tunnels (strongwan)
- WireGuard tunnels
- Static routes
- QoS (qosify)
- Multi-WAN (mwan3)
- DPI rules
- Netifyd informatics configuration
- Threat shield IP (banip)
- Threat shield DNS (adblock)
- Reverse proxy (nginx)
- ACME certificates
- Users and objects database
- Netmap
- Flashstart
- SNMP server (snmpd)
- NAT helpers
- Dynamic DNS (ddns)
- SMTP client (msmtp)
- Backup encryption password
- Controller connection and subscription (ns-plug)
- Active connections tracking (conntrackd)
- Dedalo hotspot
Configuration
The setup process configures the following:
- checks that requirements are met on both the primary and backup nodes
- configures HA traffic on the lan interface
- sets up keepalived with the virtual IP, a random password and a public key for the synchronization
- configures dropbear to listen on port 65022: this is used to sync data between the nodes using rsync, and only key-based authentication is allowed
- configures conntrackd to sync the connection tracking table
In this example:
- primary_node_ip is the primary node, with LAN IP 192.168.100.238
- backup_node_ip is the backup node, with LAN IP 192.168.100.239
- the virtual IP is 192.168.100.240
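The three example addresses sit in the same /24 network, which matters because the virtual IP must be in the same subnet as the LAN interface IP address. A minimal sketch of a sanity check for this, assuming a /24 prefix; the helper names `net24` and `same_net24` are hypothetical and not part of the package:

```shell
#!/bin/sh
# Hypothetical helper: print the first three octets of an IPv4 address,
# which is the network part when the prefix length is /24.
net24() {
    # strip an optional /prefix, then drop the last octet
    echo "${1%/*}" | cut -d. -f1-3
}

# Return 0 when every given address shares the same /24 network.
same_net24() {
    first=$(net24 "$1")
    for addr in "$@"; do
        [ "$(net24 "$addr")" = "$first" ] || return 1
    done
    return 0
}

if same_net24 192.168.100.238 192.168.100.239 192.168.100.240/24; then
    echo "addresses share one /24 network"
fi
```

For other prefix lengths the comparison would need real netmask arithmetic; the sketch only covers the /24 case used in this example.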
Before starting, follow these steps:
- power on the backup node, access the web interface and set a static LAN IP address, in this example 192.168.100.239
- then, power on the primary node, access the web interface and set a static LAN IP address, in this example 192.168.100.238
These IP addresses are used to access the nodes directly, even if the HA cluster is disabled. You can consider these IP addresses as management IP addresses.
When the HA cluster is enabled, all the configuration will be automatically synchronized to the backup node, except for the network configuration. If you need to change the network configuration, do it on the primary node then follow the instructions below to adapt the HA configuration to the new network configuration.
The package provides a script to ease the configuration of the HA cluster, without directly accessing the APIs.
The script is named ns-ha-config. Usage syntax is:
ns-ha-config <action> [<option1> <option2>]
Check local requirements
First, check the status of the primary node:
ns-ha-config check-primary-node [lan_interface] [wan_interface]
If lan_interface and wan_interface are not specified, the script will use the default values:
- lan for the LAN interface
- wan for the first WAN interface
It will check the following:
- the LAN interface must be configured with a static IP address
- there is at least one WAN interface
- the WAN interface is not configured as a PPPoE connection
- if a DHCP server is running:
  - the Force DHCP server start option must be enabled
  - the DHCP option 3: router must be set and configured with the virtual IP address (e.g. 192.168.100.240)
  - the DHCP option 6: DNS server must be set; you can set it to the virtual IP address or to the DNS server of your choice: just make sure that the DNS server is reachable from the clients even if the primary node is down
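Two of the checks above can be reproduced by hand with uci before running the script. The following is only a sketch, not the logic used by ns-ha-config: it assumes the interface sections are named after the interface (lan and wan by default), and the function name `check_iface_proto` is made up for illustration.

```shell
#!/bin/sh
# Sketch only: mirrors two of the listed checks using uci.
# Assumes network.<name>.proto holds the protocol of each interface.
check_iface_proto() {
    lan_if=${1:-lan}
    wan_if=${2:-wan}
    lan_proto=$(uci -q get "network.${lan_if}.proto")
    wan_proto=$(uci -q get "network.${wan_if}.proto")

    # the LAN interface must use a static address
    if [ "$lan_proto" != "static" ]; then
        echo "LAN interface ${lan_if} is not static (proto: ${lan_proto:-unset})"
        return 1
    fi
    # the WAN interface must not be a PPPoE connection
    if [ "$wan_proto" = "pppoe" ]; then
        echo "WAN interface ${wan_if} uses PPPoE"
        return 1
    fi
    echo "interface protocols look compatible"
}
```

Calling `check_iface_proto` without arguments uses the same defaults as the script; the DHCP-related checks are not covered by this sketch.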
Hotspot is supported, but with the following requirements:
- the backup node must have the exact same network devices as the primary node. As an example, if the primary node has a VLAN interface named vlan10, the backup node must have the same VLAN interface with the same name. Otherwise, the hotspot will not work after a switchover.
- the hotspot can run only on a physical interface or on a VLAN interface
To ensure hotspot functionality, the MAC address of the interface on the master node where the hotspot is configured will be copied to the corresponding interface on the backup node during failover. Also note that active sessions, which are saved in RAM, will be lost during a switchover, so the clients will need to re-authenticate if auto-login is disabled.
Check remote requirements
Then, check the status of the backup node:
ns-ha-config check-backup-node <backup_node_ip> [lan_interface]
It will check the following:
- the backup node must be reachable via SSH on port 22 with root user
- the LAN interface must be configured with a static IP address
- there is at least one WAN interface
Note the WAN interface is not checked on the backup node. In case of a switchover, the backup node will take over the WAN interface of the primary node but if there is no WAN interface configured on the backup node with the same name, the UI will show an unknown device.
Execute:
ns-ha-config check-backup-node <backup_node_ip>
The script will prompt for the password of the root user on the backup node.
You can also pass the SSH password directly on standard input:
echo "password" | ns-ha-config check-backup-node <backup_node_ip>
Example with interactive password:
ns-ha-config check-backup-node 192.168.100.239
Example with password on standard input:
echo Nethesis,1234 | ns-ha-config check-backup-node 192.168.100.239
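Typing the password after echo leaves it in the shell history. Since the script reads the password from standard input, one alternative is to redirect it from a file readable only by root. This is a sketch under that assumption; the function name and the file path in the usage note are hypothetical.

```shell
#!/bin/sh
# Sketch: feed the root password from a file instead of typing it on the
# command line. The function name is introduced here for illustration.
check_backup_with_pwfile() {
    backup_ip=$1
    pwfile=$2
    [ -r "$pwfile" ] || { echo "cannot read $pwfile" >&2; return 1; }
    # ns-ha-config reads the password from standard input
    ns-ha-config check-backup-node "$backup_ip" < "$pwfile"
}
```

It could be called as `check_backup_with_pwfile 192.168.100.239 /root/backup-pass`, where `/root/backup-pass` is an example path holding the password on a single line.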
Initialize the primary node
If the requirements are met, you can initialize the primary node. Note that the virtual IP, and only the virtual IP, must be written in CIDR notation.
ns-ha-config init-primary-node <primary_node_ip> <backup_node_ip> <virtual_ip> [lan_interface] [wan_interface]
The script will:
- initialize keepalived with the virtual IP
- configure conntrackd
- generate a random password and public key for the synchronization
- configure dropbear to listen on port 65022 and allow only key-based authentication
Example:
ns-ha-config init-primary-node 192.168.100.238 192.168.100.239 192.168.100.240/24
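Because only the virtual IP takes a prefix length, it is easy to forget the /24 suffix. The sketch below shows one way to check the format before calling init-primary-node; `is_cidr` is a hypothetical helper, and it validates the shape of the string only, not the octet ranges.

```shell
#!/bin/sh
# Return 0 when the argument looks like IPv4 CIDR notation (a.b.c.d/nn).
# This is a format check only; octet ranges are not validated.
is_cidr() {
    echo "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}/([0-9]|[12][0-9]|3[0-2])$'
}

vip="192.168.100.240/24"
if is_cidr "$vip"; then
    echo "ok: $vip"
else
    echo "virtual IP must be in CIDR notation: $vip" >&2
fi
```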
Initialize the backup node
If the requirements are met, you can initialize the backup node:
ns-ha-config init-backup-node
The script will ask for the password of the root user for the backup node.
You can also pass the SSH password directly on standard input:
echo "password" | ns-ha-config init-backup-node
Example with password on standard input:
echo Nethesis,1234 | ns-ha-config init-backup-node
At this point, the primary node and the backup node are configured to talk to each other using the LAN interface. The virtual IP of the LAN will switch between the two nodes in case of failure.
It’s now time to configure additional interfaces, starting at least with the WAN interface.
Configure the WAN interface
The WAN interface must be configured on both nodes. Use the following command to add a WAN interface:
ns-ha-config add-wan-interface <interface> <virtual_ip_address> <gateway>
Make sure to:
- enter the virtual IP address in CIDR notation
- enter the gateway IP address of the WAN interface
The script will:
- create the network interface and devices in the backup node
- configure the interface on both nodes by using fake IP addresses from the fake network 169.254.X.0/16
- configure the virtual IP address on both nodes
Example:
ns-ha-config add-wan-interface wan 192.168.122.49/24 192.168.122.1
Configure LAN interfaces
Extra LAN interfaces can be added to the HA configuration only if they are already configured on both the primary and backup nodes with static IP addresses, just like the main LAN interface.
Use the following command to add extra LAN interfaces, as well as other local interfaces such as guest or DMZ interfaces:
ns-ha-config add-lan-interface <primary_node_ip> <backup_node_ip> <virtual_ip_address>
When adding a LAN interface, the following requirements must be met:
- the LAN interface must be configured with a static IP address on both nodes
- if a DHCP server is running:
  - the Force DHCP server start option must be enabled
  - the DHCP option 3: router must be set and configured with the virtual IP address (e.g. 192.168.100.240)
  - the DHCP option 6: DNS server must be set; you can set it to the virtual IP address or to the DNS server of your choice: just make sure that the DNS server is reachable from the clients even if the primary node is down
Example:
ns-ha-config add-lan-interface 192.168.200.185 192.168.200.186 192.168.200.190/24
Remove an interface
To remove an interface from the HA configuration, use the following command:
ns-ha-config remove-interface <interface>
Example:
ns-ha-config remove-interface wan
The script will:
- check if the given interface is already configured as HA interface
- remove the interface from keepalived configuration
- remove all virtual routes, if present
- remove the interface from the backup node
- move the virtual IP address to the original interface
Configure an alias
Aliases are special configurations that must be explicitly set on the primary node. First, add the alias to the network interface using the web interface. Then, you can add the alias to the HA configuration.
To add an alias, use the following command:
ns-ha-config add-alias <interface> <alias> <ip_address> [<gateway>]
If the alias is for a WAN interface, you must also enter the gateway IP address.
The script will:
- check if the given interface is already configured as HA interface
- add the alias to keepalived configuration
Example:
ns-ha-config add-alias lan 192.168.100.66/24
Example for WAN interface:
ns-ha-config add-alias wan 192.168.122.66/24 192.168.122.1
NOTE: the alias will not appear in the network configuration of the backup node.
Remove an alias
To remove an alias, use the following command:
ns-ha-config remove-alias <interface> <alias>
The script will:
- remove the alias from keepalived configuration
- remove all virtual routes, if present
Example:
ns-ha-config remove-alias wan 192.168.122.66/24
Show current configuration
You can show the current configuration of the HA cluster:
ns-ha-config show-config
It will output something like this:
Current configuration
Interfaces:
Interface: lan, Device: br-lan, Virtual IP: 192.168.100.240/24
Aliases:
Interface: lan, Virtual Alias IP: 192.168.100.66/24
-----------------------------------------------------------------
Not configured
Interfaces:
Interface: wan, Device: eth1
Aliases:
Interface: wan, IP: 192.168.122.66/24
Check the status
You can check the status of the HA cluster at any time. Just execute:
ns-ha-config status
Just after the initialization, the script will return something like this:
Status: enabled
Role: primary
Current State: master
Last Sync Status: SSH Connection Failed
Last Sync Time: Fri Apr 18 13:07:08 UTC 2025
The first synchronization can take up to 10 minutes and runs in the background. After a few minutes, the status should look like this:
Status: enabled
Role: primary
Current State: master
Last Sync Status: Successful
Last Sync Time: Mon Jun 9 07:21:15 UTC 2025
Virtual IPs:
lan_ipaddress: 192.168.100.240/24 (br-lan)
wan_ipaddress: 192.168.122.49/24 (eth1)
Keepalived Statistics:
advert_rcvd: 0
advert_sent: 1730
become_master: 1
release_master: 0
packet_len_err: 0
advert_interval_err: 0
ip_ttl_err: 0
invalid_type_rcvd: 0
addr_list_err: 0
invalid_authtype: 0
authtype_mismatch: 0
auth_failure: 0
pri_zero_rcvd: 0
pri_zero_sent: 0
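Since the first synchronization runs in the background, a small polling loop can wait for it instead of re-running the status command by hand. This is a sketch that assumes the status output keeps the "Last Sync Status:" line format shown above; the function names, the 600 second default (the 10 minutes mentioned above) and the 30 second interval are choices made for this example.

```shell
#!/bin/sh
# Extract the "Last Sync Status" value from status output on stdin.
last_sync_status() {
    sed -n 's/^Last Sync Status: //p'
}

# Poll until the sync succeeds or the timeout (in seconds) expires.
# Usage: wait_for_sync [timeout] [interval]
wait_for_sync() {
    timeout=${1:-600}
    interval=${2:-30}
    elapsed=0
    while [ "$elapsed" -le "$timeout" ]; do
        status=$(ns-ha-config status | last_sync_status)
        if [ "$status" = "Successful" ]; then
            echo "sync completed"
            return 0
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    echo "sync not completed after ${timeout}s (last status: ${status})" >&2
    return 1
}
```

Running `wait_for_sync` on the primary node right after the initialization returns as soon as the status reports a successful sync, or fails once the timeout is reached.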
Alerting
The cluster sends alerts only if the machine has a valid subscription.
Available alerts are:
- ha:sync:failed: raised if the file synchronization fails; it usually means that the backup node is not reachable. This alert is raised only on the primary node.
- ha:primary:failed: raised if the primary node is down; it means that there was a switchover. This alert is raised with FAILURE state on the backup node when it takes over the virtual IP address; the alert is raised with OK state on the primary node when it comes back online.
Connecting to the backup node
Since the backup node does not have access to the Internet, there are two ways to connect to it:
- directly using the static LAN IP address configured at the beginning
- from the primary node using SSH
To connect to the backup node from the primary, use the following command:
ns-ha-config ssh-remote
The script uses the dedicated SSH port 65022 and the keepalived SSH private key: it is meant to be used on the primary node when the HA cluster is already configured.
Upgrading the backup node
The backup node does not have access to the Internet, so you need to upgrade it manually using an image file.
From the primary node, use the following command:
ns-ha-config upgrade-remote [<image>]
If image is not specified, the script will download the latest image and install it on the backup node.
If image is specified, the script will use the given image file to upgrade the backup node.
Troubleshooting and logs
Since the name of the backup host is replaced with the name of the primary host, it’s hard to distinguish between the two nodes when connecting via SSH. To avoid confusion, when the HA cluster is enabled, the bash prompt will show the keepalived status using:
- P for primary node
- S for secondary (or backup) node
Prompt example for primary node:
root@NethSec [P]:~#
Prompt example for secondary node:
root@NethSec [S]:~#
A normal configuration synchronization will look like this on the secondary node:
Apr 23 09:48:49 NethSec dropbear[8098]: Child connection from 192.168.100.238:37350
Apr 23 09:48:49 NethSec dropbear[8098]: Pubkey auth succeeded for 'root' with ssh-rsa key SHA256:LDIBFC6gFHmIAUqdEWVi62ca/EUxZI7/08m2d76/hcQ from 192.168.100.238:37350
Apr 23 09:48:49 NethSec dropbear[8098]: Exit (root) from <192.168.100.238:37350>: Exited normally
Apr 23 09:48:49 NethSec dropbear[8100]: Child connection from 192.168.100.238:37356
Apr 23 09:48:49 NethSec dropbear[8100]: Pubkey auth succeeded for 'root' with ssh-rsa key SHA256:LDIBFC6gFHmIAUqdEWVi62ca/EUxZI7/08m2d76/hcQ from 192.168.100.238:37356
Apr 23 09:48:49 NethSec sudo: root : PWD=/root ; USER=root ; COMMAND=/usr/bin/rsync --server -nlogDtprRe.iLfxCIvu --log-format=X . /usr/share/keepalived/rsync
Apr 23 09:48:49 NethSec dropbear[8100]: Exit (root) from <192.168.100.238:37356>: Exited normally
All sync events are logged in the /var/log/messages file; you can filter them using the following command:
grep ns-rsync.sh /var/log/messages
When a new interface has been added to the HA configuration, the backup node will log it in the /var/log/messages file.
The log will look like this:
Apr 23 06:51:38 NethSec ns-ha: Importing network configuration: {"device": "eth1", "proto": "dhcp", "record_type": "interface", "record_id": "wan"}
To see the active keepalived configuration, execute:
cat /tmp/keepalived.conf
Debugging
The ns-ha configuration script is a shell script that can be debugged using the -x option.
Example:
bash -x ns-ha-config <action> [<option1> <option2>]
It’s also possible to enable debugging for the keepalived service. To enable it, execute on the primary node:
uci set keepalived.primary.debug=1
uci commit keepalived
reload_config
Then, search for Keepalived_vrrp in the /var/log/messages file.
Maintenance
The HA cluster can be disabled at any time. But be careful: if you disable the primary node first, the backup node will take over the virtual IP address.
The static LAN IPs configured at the beginning can be considered management IPs. These IPs are always accessible and can be used to manage the nodes directly, regardless of the HA cluster status.
Maintenance of the backup node
To disable the HA cluster, use the following command on the backup node:
/etc/init.d/keepalived stop
Proceed with the maintenance of the backup node, then re-enable the HA cluster:
/etc/init.d/keepalived start
Maintenance of the primary node
When the primary node is disabled, the backup node will take over the virtual IP address. To disable the HA cluster, use the following command on the primary node:
/etc/init.d/keepalived stop
Proceed with the maintenance of the primary node, then re-enable the HA cluster:
/etc/init.d/keepalived start
The primary node will take over the virtual IP address again.
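The stop, maintain, start sequence is the same on both nodes, so it can be wrapped to make sure keepalived is started again even when the maintenance step fails. A minimal sketch: the function name and the KEEPALIVED_INIT variable are introduced here for illustration, and the maintenance command in the usage note is a placeholder.

```shell
#!/bin/sh
# Sketch: run a maintenance command with keepalived stopped, and start
# keepalived again afterwards even if the command fails.
# KEEPALIVED_INIT is a variable introduced here for illustration.
KEEPALIVED_INIT=${KEEPALIVED_INIT:-/etc/init.d/keepalived}

with_keepalived_stopped() {
    "$KEEPALIVED_INIT" stop || return 1
    rc=0
    # run the maintenance command passed as arguments, remember its status
    "$@" || rc=$?
    "$KEEPALIVED_INIT" start
    return "$rc"
}
```

For example, `with_keepalived_stopped /root/my-maintenance.sh` would run a hypothetical maintenance script and return its exit status after keepalived has been started again.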
Reset the configuration
To reset the configuration, use the following command:
ns-ha-config reset
The script will:
- stop and disable keepalived
- stop and disable conntrackd
- remove the configuration files
- clean up the dropbear configuration, including the SSH keys
The script will not change the network configuration of the nodes. You can access them using the static LAN IP addresses configured at the beginning and manage them as standalone nodes.
How it works
The HA cluster consists of two nodes: one is the primary and the other is the backup. All configuration changes must always be made on the primary node. The configuration is then automatically synchronized to the backup node.
Keepalived runs a specially crafted rsync script (/etc/keepalived/scripts/ns-rsync.sh) on the primary node to:
- export WireGuard interfaces, IPsec interfaces, routes and hotspot MAC address to /etc/ha
- synchronize all files listed by sysupgrade -l and custom files added with the add_sync_file option from scripts inside the /etc/hotplug.d/keepalived directory; files are synchronized to the backup node inside the directory /usr/share/keepalived/rsync/
The hotplug keepalived event is used to inform the system about changes in the keepalived status.
The event is triggered with an ACTION parameter that can be:
- NOTIFY_SYNC: the script is executed on the backup node after a sync has been done and a listed file is changed. During this phase, all directories (like /etc/openvpn and /etc/ha) are synced to the original position. Also, WireGuard interfaces, IPsec interfaces and routes are imported from the /etc/ha directory, but in disabled state.
- NOTIFY_MASTER: the script can be executed both on the primary and on the backup node:
  - on the primary node, after keepalived is started: this is the normal startup state
  - on the backup node, after a switchover has been done: this is the failover state; all WireGuard interfaces, IPsec interfaces and routes previously imported from the /etc/ha directory are enabled if they were enabled on the primary node
- NOTIFY_BACKUP: the script is executed on the backup node, after keepalived is started or if the primary comes back up after a downtime. All non-required services are disabled, including WireGuard interfaces, IPsec interfaces and routes.
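A script placed inside the /etc/hotplug.d/keepalived directory can branch on the ACTION value to react to each state. The following is an illustrative skeleton, not one of the scripts shipped with the package: the function name is hypothetical and the echoed messages are placeholders for real work.

```shell
#!/bin/sh
# Sketch of a handler for the keepalived hotplug event.
# The messages are placeholders; ACTION is provided by the hotplug system.
handle_keepalived_event() {
    case "$1" in
        NOTIFY_SYNC)
            echo "sync received: apply synchronized files" ;;
        NOTIFY_MASTER)
            echo "node is master: enable services" ;;
        NOTIFY_BACKUP)
            echo "node is backup: disable non-required services" ;;
        *)
            echo "ignoring action: ${1:-none}" ;;
    esac
}

handle_keepalived_event "${ACTION:-}"
```

Keeping the dispatch in a single case statement makes it easy to see which states a script handles and which ones it ignores.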
The backup node keeps the configuration in sync with the primary node, but most services, including crontabs, are disabled. The following cronjobs are disabled on the backup node and enabled on the primary node:
- subscription heartbeat
- subscription inventory
- phonehome
- remote reports to the controller
- remote backup
Network configuration fundamentals
Each network interface managed by the High Availability (HA) system must have a static IP address. WAN interfaces and LAN interfaces are configured in different ways:
- a WAN interface is configured automatically: it will be assigned an IP address in the 169.254.X.0/24 range. For every WAN interface, a new 169.254.X.0 network will be allocated. The primary node will get the IP address 169.254.X.1 and the backup node will get the IP address 169.254.X.2. This imposes a theoretical limit of 254 WAN interfaces that can be managed by the HA system.
- a LAN interface does not use the 169.254.X.0/24 network. It must be configured manually with a static IP address on both nodes, then it will be assigned a virtual IP address. The virtual IP must be in the same subnet as the LAN interface IP address. The static configuration is required to ensure that dnsmasq can start correctly: it requires a static IP address on the interface in the same subnet as the DHCP range, which is a limitation of the OpenWrt implementation. The network interface will then be accessible using the virtual IP address configured in the HA system. All clients must use the virtual IP address to access the firewall services.
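The WAN addressing scheme can be illustrated with a few lines of shell. This sketch only prints the pair of addresses for a given network index X; how the HA scripts actually pick X is not described here, and the function name is hypothetical.

```shell
#!/bin/sh
# Print the primary and backup addresses for WAN network index X,
# following the 169.254.X.1 / 169.254.X.2 scheme described above.
wan_ha_addresses() {
    x=$1
    # X must be a valid IPv4 octet
    if [ "$x" -lt 0 ] || [ "$x" -gt 255 ]; then
        echo "index out of range: $x" >&2
        return 1
    fi
    echo "primary=169.254.${x}.1 backup=169.254.${x}.2"
}

wan_ha_addresses 1
```

With index 1 the sketch prints `primary=169.254.1.1 backup=169.254.1.2`.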
Note that the backup node does not have access to the Internet, so:
- it will not be able to resolve DNS names
- it will not be able to reach the Controller nor Nethesis portals
- it will not receive updates