Overview
This document describes how to create a fault tolerant load balancer cluster using LVS and Keepalived on Debian Lenny. LVS routes traffic based on internal firewall marks that iptables adds to incoming packets. The example here uses firewall mark 100 to manage this internal routing for the load balanced IP address 192.168.1.100. Note that the LVS configuration uses decimal 100, whereas iptables uses the hex equivalent (0x64).
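Since the same mark shows up as 100 in the keepalived configuration and as 0x64 in iptables output, a quick conversion helps when cross-checking the two. The following is a minimal sketch; the function names mark_to_hex and mark_to_dec are made up for this example.

```shell
# Convert a decimal firewall mark to the hex form that iptables prints.
mark_to_hex() {
    printf '0x%x\n' "$1"
}

# Convert a hex mark (as shown by iptables) back to decimal for keepalived.
mark_to_dec() {
    printf '%d\n' "$1"
}

mark_to_hex 100    # prints 0x64
mark_to_dec 0x64   # prints 100
```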
Prerequisites and Assumptions
- Debian Lenny - clean install
- Internal network: 192.168.1.0/24
  - 192.168.1.250 - LVS cluster IP
  - 192.168.1.251 - lvs1
  - 192.168.1.252 - lvs2
- Crossover network: 172.16.1.0/24
  - 172.16.1.251 - lvs1
  - 172.16.1.252 - lvs2
- Load balanced IP: 192.168.1.100, across two servers
  - web1 - 192.168.1.10
  - web2 - 192.168.1.20
- Firewall mark 100 (hex 0x64) in iptables
Web Servers - web1 and web2
The following items need to be completed on web1 and web2 in order for the content on these servers to be load balanced.
Loopback Network Adapter
Traffic from the load balancers to the web servers is transported over layer 2, but each packet is still addressed to the load balanced IP address. This IP address therefore needs to be bound to the loopback interface with a /32 subnet, and the network stack needs to be configured to allow the packet to route from the primary network interface to the loopback. The method of binding this IP differs slightly from platform to platform, but here are some general tips.
Linux
Add the following to /etc/sysctl.conf in order to prevent the loopback from responding to ARP. Strange things can occur if this is not configured correctly.
net.ipv4.conf.lo.rp_filter = 0
net.ipv4.conf.all.arp_announce = 2
net.ipv4.conf.all.arp_ignore = 1
Make sure these changes are active with
sysctl -p
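The sysctl settings above only cover ARP behaviour; the load balanced IP still has to be bound to the loopback with a /32 mask. The following is a minimal sketch of one way to do that with the ip command. The function name bind_vip_loopback is made up for this example, and the binding it creates does not survive a reboot, so making it persistent is left to the platform's usual network configuration.

```shell
# Bind the load balanced IP to the loopback with a /32 mask (run as root).
# The address defaults to the load balanced IP used throughout this document.
bind_vip_loopback() {
    vip="${1:-192.168.1.100}"
    ip addr add "${vip}/32" dev lo
}

# Usage (as root): bind_vip_loopback 192.168.1.100
# Verify with:     ip addr show dev lo
```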
Windows 2003
Windows 2003 does not allow a /32 subnet using the network configuration interface (even though it's valid), so add the IP with a /24, and then change the subnet mask to 255.255.255.255 by editing the registry key below, where {blah} is the GUID of the loopback interface
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces\{blah}\SubnetMask
Also, set the DNS server for the loopback by editing the following registry key
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces\{blah}\NameServer
Windows 2008
Windows 2008 does not allow traffic to be routed between network interfaces by default, so this needs to be configured per-interface.
- Find the interface name from the command line
netsh interface show interface

Admin State    State          Type             Interface Name
-------------------------------------------------------------------------
Enabled        Connected      Dedicated        Loopback Adapter
Disabled       Disconnected   Dedicated        Local Area Connection 2
Enabled        Connected      Dedicated        Local Area Connection
- Configure the loopback and the primary network interface to allow inter-interface traffic
netsh interface ipv4 set interface "Loopback Adapter" weakhostreceive=enabled store=persistent
netsh interface ipv4 set interface "Loopback Adapter" weakhostsend=enabled store=persistent
netsh interface ipv4 set interface "Loopback Adapter" forwarding=enabled store=persistent
netsh interface ipv4 set interface "Local Area Connection" weakhostreceive=enabled store=persistent
netsh interface ipv4 set interface "Local Area Connection" weakhostsend=enabled store=persistent
netsh interface ipv4 set interface "Local Area Connection" forwarding=enabled store=persistent
- The network interfaces may need to be restarted for these settings to take effect
Current network interface settings can be viewed with the following
netsh interface ipv4 show interface "Loopback Adapter"
netsh interface ipv4 show interface "Local Area Connection"
Web Server Configuration
Because the web traffic traverses layer 2 in order to reach the server, packets arrive addressed to the load balanced IP address (192.168.1.100). The web server software must therefore be configured to listen on this address, which is bound to the loopback interface.
Health Test
A test page of some sort must be configured on the web server so that the load balancers can determine the health of each server before sending traffic. This test page should be a small, lightweight page that tests any systems required for proper site functionality. Database connectivity is a common check. Whatever testing is done should result in a CONSTANT output on a successful test, and something different on a failure. The reason for this is that the load balancer does an MD5SUM on the output, and if the hash differs from the expected value, traffic is not sent to the server.
This example assumes that /lb.php is configured as the health test.
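Because the health check depends on the page output hashing to the same value every time, it is worth confirming that before involving the load balancers. The following is a minimal sketch using md5sum; the helper names body_digest and digest_matches are made up for this example, and the wget call in the comment is only one assumed way of fetching the page.

```shell
# Print the MD5 hex digest of whatever arrives on stdin.
body_digest() {
    md5sum | cut -d ' ' -f 1
}

# Succeed only if the data on stdin hashes to the expected digest.
digest_matches() {
    expected="$1"
    [ "$(body_digest)" = "$expected" ]
}

# Example (wget and the URL are assumptions; any HTTP client will do).
# Run it a few times and check that the digest never changes:
#   wget -q -O - http://192.168.1.10/lb.php | body_digest
```

If the digest changes between requests (for example because the page prints a timestamp), the health test would flap, so the page output should be trimmed down until it is stable.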
Ferm - lvs1 and lvs2
Ferm is a wrapper language around iptables, which makes creating new iptables rules nice and easy. This can also be done using iptables directly.
- Install ferm
aptitude install ferm
- Create the fwmark function that will be used later for the load balanced IP addresses in /etc/ferm/ferm.conf
@def &LVS_MARK($pub, $mark) = {
    table mangle chain PREROUTING daddr $pub proto tcp MARK set-mark $mark;
}
- Load the required kernel modules
modprobe iptable_mangle
modprobe xt_multiport
modprobe xt_MARK
modprobe ip_vs
modprobe ip_vs_rr
modprobe ip_vs_nq
modprobe ip_vs_wlc
- Force these modules to load automatically at boot by adding the following lines to /etc/modules
iptable_mangle
xt_multiport
xt_MARK
ip_vs
ip_vs_rr
ip_vs_nq
ip_vs_wlc
- Configure ferm to mark incoming packets with firewall mark 100, by adding the following to /etc/ferm/ferm.conf
&LVS_MARK(192.168.1.100, 0x0064);
- Reload ferm to activate this new rule
/etc/init.d/ferm reload
The new rule should be visible in the current iptables rules now
# iptables -t mangle -L PREROUTING
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination
MARK       tcp  --  anywhere             192.168.1.100       tcp MARK xset 0x64/0xffffffff
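For setups that skip ferm, the same mark can be applied with iptables directly. The following is a rough equivalent of the ferm rule above, wrapped in a function named mark_vip (a name made up for this example). It is a sketch only: a rule added this way is not saved across reboots, and persisting it is left to whatever mechanism the system already uses.

```shell
# Mark TCP packets addressed to the load balanced IP (run as root).
# $1 is the load balanced IP, $2 is the firewall mark (decimal or hex).
mark_vip() {
    vip="$1"
    mark="$2"
    iptables -t mangle -A PREROUTING -d "$vip" -p tcp -j MARK --set-mark "$mark"
}

# Usage (as root): mark_vip 192.168.1.100 100
```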
Keepalived and LVS - lvs1 and lvs2
- Install keepalived. LVS is a dependency of this package, and will be installed automatically.
aptitude install keepalived
- Edit /etc/keepalived/keepalived.conf. The router_id needs to be unique on each server; the rest should be identical.
global_defs {
    router_id LVS1
}

vrrp_instance vi_100 {
    interface eth1
    lvs_sync_daemon_interface eth1
    dont_track_primary
    track_interface {
        eth0
    }
    state BACKUP
    priority 90
    nopreempt
    virtual_router_id 100
    garp_master_delay 1
    authentication {
        auth_type PASS
        auth_pass $up3r$3(r37
    }
    virtual_ipaddress {
        192.168.1.250 dev eth0
    }
    virtual_ipaddress_excluded {
        192.168.1.100 dev eth0
    }
}

include /etc/keepalived/conf.d/*.conf
- Find the hash of the health test page. This will be used in the next step
genhash -s 192.168.1.10 -p 80 -u /lb.php
MD5SUM = 75ad4ab44c3ad9f892776b2487173724
- Create a new configuration file for this firewall mark at /etc/keepalived/conf.d/fwm-100.conf. Use the MD5SUM from above for the digest entries here.
virtual_server fwmark 100 {
    delay_loop 30
    lb_algo wlc
    lb_kind DR
    protocol TCP

    ! web1
    real_server 192.168.1.10 0 {
        weight 50
        HTTP_GET {
            url {
                path /lb.php
                digest 75ad4ab44c3ad9f892776b2487173724
            }
            connect_port 80
            connect_timeout 5
            nb_get_retry 3
            delay_before_retry 3
        }
    }

    ! web2
    real_server 192.168.1.20 0 {
        weight 50
        HTTP_GET {
            url {
                path /lb.php
                digest 75ad4ab44c3ad9f892776b2487173724
            }
            connect_port 80
            connect_timeout 5
            nb_get_retry 3
            delay_before_retry 3
        }
    }
}
- Start keepalived on both lvs1 and lvs2
/etc/init.d/keepalived start
192.168.1.100 and 192.168.1.250 should now be bound to eth0 on one of the two servers. Check this with
# ip addr show dev eth0
2: eth0: mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:7d:af:a3 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.251/24 brd 192.168.1.255 scope global eth0
    inet 192.168.1.250/32 scope global eth0
    inet 192.168.1.100/32 scope global eth0
    inet6 fe80::20c:29ff:fe7d:afa3/64 scope link
       valid_lft forever preferred_lft forever
Both web1 and web2 should also be listed in the LVS table shown by ipvsadm
# ipvsadm
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
FWM  100 wlc persistent 3600
  -> 192.168.1.10:0               Route   50     0          0
  -> 192.168.1.20:0               Route   50     0          0
Other Considerations
- If the application being load balanced does not allow sharing session state between the servers, a persistence timeout is likely required. This forces any traffic from a given source IP address to be sent to the same server until there has been more idle time from that IP than the configured timeout (in seconds). This configuration goes in the virtual_server configuration in /etc/keepalived/conf.d/fwm-100.conf
virtual_server fwmark 100 {
    delay_loop 30
    lb_algo wlc
    lb_kind DR
    protocol TCP
    persistence_timeout 3600
    [...snip...]
- If your web servers are making use of name-based virtual hosts, and separate load balancer groups are required for each site, a virtualhost entry needs to be configured so that the test page is being processed by the correct site. This also goes in the virtual_server configuration in /etc/keepalived/conf.d/fwm-100.conf
virtual_server fwmark 100 {
    delay_loop 30
    lb_algo wlc
    lb_kind DR
    protocol TCP
    virtualhost www.example.com
    [...snip...]
- A "sorry server" can be configured to handle traffic in the event that all web servers are failing the health test. This server should have a similar network configuration as the real web servers, and should return some simple "sorry" page for any URL it receives. This configuration also goes in the virtual_server configuration in /etc/keepalived/conf.d/fwm-100.conf
virtual_server fwmark 100 {
    delay_loop 30
    lb_algo wlc
    lb_kind DR
    protocol TCP
    sorry_server 192.168.1.30 0
    [...snip...]
This sorry server functionality can be handled by the LVS servers themselves, as long as the content served is extremely lightweight. To do so, bind the load balancer IP to the loopback and suppress ARP just like on web1 and web2. Then, set the sorry server to
sorry_server 127.0.0.1 0
- Any changes to the ferm or keepalived configuration need to be propagated to both servers. The only difference should be the router_id in /etc/keepalived/keepalived.conf. A simple script should be enough to keep these in sync. Just be sure to reload both services on both servers after any change.
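A sync script along these lines might look like the following. It is a sketch under several assumptions that are not part of the setup described above: root SSH access from one load balancer to the other, a peer reachable by a name such as lvs2, and a keepalived init script that supports reload. The function name sync_lvs_config is made up for this example. It deliberately leaves /etc/keepalived/keepalived.conf alone, because the router_id differs per server.

```shell
# Push the shared ferm and keepalived configuration to the peer load
# balancer and reload both services there. Run it on the server where the
# change was made, after reloading ferm and keepalived locally.
# $1 is the peer's hostname or IP (hypothetical: lvs2).
sync_lvs_config() {
    peer="$1"
    scp /etc/ferm/ferm.conf "root@${peer}:/etc/ferm/ferm.conf" &&
    scp /etc/keepalived/conf.d/*.conf "root@${peer}:/etc/keepalived/conf.d/" &&
    ssh "root@${peer}" '/etc/init.d/ferm reload && /etc/init.d/keepalived reload'
}

# Usage (on lvs1, as root): sync_lvs_config lvs2
```

Each step is chained with &&, so the remote reload only happens if both copies succeeded.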