Bluhalo IT


How to quickly set up a load-balanced, high-availability Apache cluster

Posted in Apache, linux by Simon Green on April 29, 2009

I wanted to find a solution for load balancing a web cluster with high availability that is simple to maintain and expand. I have found my solution in HAProxy.

Scenario

Bluhalo Labs HAProxy Test Rig


This demo scenario uses the following environment:

  • Network Configuration
    • Network: 192.168.11.0
    • Subnet Mask: 255.255.255.0
    • Gateway: 192.168.11.254 (although this is irrelevant)
  • 1 x Shared IP for the Load Balancers
    • 192.168.11.40
  • 2 x Load Balancers
    • BHLabs1 – 192.168.11.30
    • BHLabs6 – 192.168.11.39
  • 4 x Web Servers
    • BHLabs2 – 192.168.11.35
    • BHLabs3 – 192.168.11.36
    • BHLabs4 – 192.168.11.37
    • BHLabs5 – 192.168.11.38

At the start of this setup all machines are running Ubuntu 8.04 Server from a standard install with openssh-server installed and the root password set. All setup commands are run as root or with sudo.

Web Server Configuration

Basic Config

As we are only doing this as a basic test, a very simple Apache config is all that is required.
This will install Apache 2 (pulled in as a dependency of the php5 package) and PHP5, which gives us some basic scripting, such as outputting the server name for testing, that you may wish to play with later.

# apt-get -y install php5

Next you need to create a check file for HAProxy on the load balancers to request. This file will be used to determine if the servers are up. The following creates a blank file called check.txt in the default DocumentRoot for Apache.

# touch /var/www/check.txt

Now stick your test index.html in that directory as well.

# echo "oh hi" > /var/www/index.html
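The two files above can also be created in one step. This is only a minimal sketch: the function name setup_docroot is made up for illustration, and writing the hostname into index.html is an assumption (not part of the original setup) that makes it easier to see which web server answered a request later on.

```shell
#!/bin/sh
# Hypothetical helper: prepare a document root for the test rig.
# First argument is the DocumentRoot (defaults to /var/www).
setup_docroot() {
  docroot="${1:-/var/www}"
  mkdir -p "$docroot"
  # Empty file that HAProxy will request as its health check.
  : > "$docroot/check.txt"
  # Test page; including the hostname is an assumption for easier testing.
  printf 'oh hi from %s\n' "$(uname -n)" > "$docroot/index.html"
}
```

Run setup_docroot as root on each web server, or pass a different directory as the first argument if your DocumentRoot is not /var/www.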

Log Modification

Filtering out HAProxy health checks

You don’t want hits to the check.txt file cluttering your Apache logs, so put an exclusion in your VirtualHost directive. Here’s an example:

<VirtualHost *>
  ServerAdmin server-alert@bluhalo.com
  DocumentRoot /var/www/
  ErrorLog /var/log/apache2/error.log
  LogLevel warn
  CustomLog /var/log/apache2/access.log combined env=!dontlog
  SetEnvIf Request_URI "^/check\.txt$" dontlog
</VirtualHost>
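One way to confirm the exclusion is working is to count the check.txt requests in the access log before and after reloading Apache; the number should stop growing. The sketch below assumes the combined log format and the log path used above, and the function name count_check_hits is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count logged requests for /check.txt.
# First argument is the access log (defaults to the path used above).
count_check_hits() {
  log="${1:-/var/log/apache2/access.log}"
  # Matches the request field, e.g. "HEAD /check.txt HTTP/1.0".
  # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
  grep -c '"[A-Z]* /check\.txt ' "$log" || true
}
```

Run count_check_hits a few seconds apart: if the SetEnvIf rule is active the two numbers should be identical.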

Modify the access logs

HAProxy sits between the user and the web servers, so by default the web servers will log the load balancer’s IP in their logs instead of the user’s. HAProxy adds the user’s IP to the request headers in the “X-Forwarded-For” field, so you need to modify the log configuration in your apache2.conf to take advantage of this:

# vi /etc/apache2/apache2.conf

Search for entries that start “LogFormat” …

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
LogFormat "%h %l %u %t \"%r\" %>s %b" common
LogFormat "%{Referer}i -> %U" referer
LogFormat "%{User-agent}i" agent

… and swap “%h” for “%{X-Forwarded-For}i” like:

LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %>s %b" common
LogFormat "%{Referer}i -> %U" referer
LogFormat "%{User-agent}i" agent
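If you have several web servers to change, the same swap can be scripted with sed instead of edited by hand. This is a sketch only: the function name swap_log_host is made up, and it assumes the LogFormat lines start with "%h " exactly as in the listing above.

```shell
#!/bin/sh
# Hypothetical helper: replace the leading %h in LogFormat lines
# with %{X-Forwarded-For}i. A backup is kept as <file>.bak.
swap_log_host() {
  conf="${1:-/etc/apache2/apache2.conf}"
  cp "$conf" "$conf.bak"
  # Only LogFormat lines whose format string begins with %h are touched.
  sed 's/^\(LogFormat "\)%h /\1%{X-Forwarded-For}i /' "$conf.bak" > "$conf"
}
```

Lines that do not begin with %h, such as the referer and agent formats, are left unchanged.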

Load Balancer Configuration

You need to install HAProxy and Heartbeat for this setup to work. HAProxy provides your load balancing functionality and Heartbeat provides your high-availability failover functionality.

# apt-get -y install haproxy heartbeat-2

Let’s start with HAProxy as that’s the easier one :)

HAProxy

Open up the HAProxy config file:

# vi /etc/haproxy.cfg

and replace the whole file on both servers with the following:

global
  log 127.0.0.1 local0
  log 127.0.0.1 local1 notice
  maxconn 4096
  #debug
  #quiet
  user haproxy
  group haproxy

defaults
  log global
  mode  http
  option  httplog
  option  dontlognull
  retries 3
  redispatch
  maxconn 2000
  contimeout  5000
  clitimeout  50000
  srvtimeout  50000

listen bhlabslb 192.168.11.40:80
  mode http
  stats enable
  stats auth admin:password
  balance roundrobin
  option httpclose
  option forwardfor
  option httpchk HEAD /check.txt HTTP/1.0
  server  inst1 192.168.11.35:80 cookie server01 check inter 2000 fall 3
  server  inst2 192.168.11.36:80 cookie server02 check inter 2000 fall 3
  server  inst3 192.168.11.37:80 cookie server03 check inter 2000 fall 3
  server  inst4 192.168.11.38:80 cookie server04 check inter 2000 fall 3
  capture cookie vgnvisitor= len 32

  rspidel ^Set-cookie:\ IP= # do not let this cookie tell our internal IP address
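As the pool grows, the server lines can be generated rather than typed. The following is a rough sketch: gen_servers is a hypothetical name, and the numbering of the inst and cookie labels is simply an assumption that each server gets the next number in sequence.

```shell
#!/bin/sh
# Hypothetical helper: print one HAProxy "server" line per IP address
# given on the command line, using the same check settings as above.
gen_servers() {
  n=1
  for ip in "$@"; do
    printf '  server  inst%d %s:80 cookie server%02d check inter 2000 fall 3\n' \
      "$n" "$ip" "$n"
    n=$((n + 1))
  done
}

# Example: gen_servers 192.168.11.35 192.168.11.36 192.168.11.37 192.168.11.38
```

Paste the output into the listen section. If your HAProxy build supports the -c check flag, running haproxy -c -f /etc/haproxy.cfg afterwards is a cheap way to catch syntax errors before restarting.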

You also have to allow the HAProxy service to start. Set the ENABLED value to 1 in /etc/default/haproxy:

# Set ENABLED to 1 if you want the init script to start haproxy.
ENABLED=1
# Add extra flags here.
#EXTRAOPTS="-de -m 16"
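Since this has to be done on both load balancers, it may be convenient to script it. A minimal sketch, assuming the file already contains an ENABLED= line as in the packaged default; the function name enable_haproxy is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: set ENABLED=1 in the HAProxy defaults file.
# A backup is kept as <file>.bak.
enable_haproxy() {
  defaults="${1:-/etc/default/haproxy}"
  cp "$defaults" "$defaults.bak"
  # Rewrite whatever value ENABLED currently has; other lines are untouched.
  sed 's/^ENABLED=.*/ENABLED=1/' "$defaults.bak" > "$defaults"
}
```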

Heartbeat

We will be using Heartbeat to pass the shared IP address (192.168.11.40) between our 2 load balancers if one goes down. For this to work, HAProxy needs to be able to bind to an address that doesn’t yet exist on the system, because the standby load balancer does not hold the shared IP. To allow this, add the following to /etc/sysctl.conf:

# Allow HAProxy shared IP
net.ipv4.ip_nonlocal_bind = 1

and then run:

# sysctl -p
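The same change can be scripted so that it is safe to run more than once. This is a sketch under the assumption that the setting is not already present in a different form (for example commented out with other spacing); allow_nonlocal_bind is a made-up name.

```shell
#!/bin/sh
# Hypothetical helper: append the nonlocal bind setting to a sysctl
# config file unless a line for that key already exists.
allow_nonlocal_bind() {
  conf="${1:-/etc/sysctl.conf}"
  if ! grep -q '^net\.ipv4\.ip_nonlocal_bind' "$conf" 2>/dev/null; then
    printf '# Allow HAProxy shared IP\nnet.ipv4.ip_nonlocal_bind = 1\n' >> "$conf"
  fi
}
```

After sysctl -p, running sysctl net.ipv4.ip_nonlocal_bind should report the value 1.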

Heartbeat requires 3 main configuration files which do not come with the install. First of all, the authkeys file. Do the following on both servers:

# vi /etc/ha.d/authkeys

Add the following content, making sure you replace MyPassword with a secure string. This needs to be the same on both servers:

auth 3
3 md5 MyPassword

This file MUST be accessible only by root or Heartbeat won’t start:

# chmod 600 /etc/ha.d/authkeys
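If you would rather not invent the secret yourself, it can be generated and written with the right permissions in one go. A minimal sketch: gen_secret and write_authkeys are hypothetical names, and reading 16 bytes from /dev/urandom is just one assumed way of producing a random string.

```shell
#!/bin/sh
# Hypothetical helper: print 32 random hex characters.
gen_secret() {
  od -An -N16 -tx1 /dev/urandom | tr -d ' \n'
}

# Hypothetical helper: write an authkeys file readable only by its owner.
# Usage: write_authkeys FILE SECRET
write_authkeys() {
  file="$1"
  secret="$2"
  # The subshell keeps the restrictive umask from leaking into the caller.
  ( umask 077; printf 'auth 3\n3 md5 %s\n' "$secret" > "$file" )
  chmod 600 "$file"
}

# Example: write_authkeys /etc/ha.d/authkeys "$(gen_secret)"
```

Generate the secret once and copy the resulting file to the second load balancer, since both must hold the same key.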

Next, create the haresources file on each server. Run:

# uname -n

to get the kernel’s take on the local hostname, and then insert this into:

# vi /etc/ha.d/haresources

in the following syntax:

BHLabs1 192.168.11.40

and

BHLabs6 192.168.11.40

Note the hostname changes but not the IP. The IP is the shared IP. Finally the main Heartbeat config file:

# vi /etc/ha.d/ha.cf

On the first server:

#
#       keepalive: how many seconds between heartbeats
#
keepalive 2
#
#       deadtime: seconds-to-declare-host-dead
#
deadtime 10
#
#       What UDP port to use for udp or ppp-udp communication?
#
udpport        694
bcast  eth0
mcast eth0 225.0.0.1 694 1 0
ucast eth0 192.168.11.39
#       What interfaces to heartbeat over?
udp     eth0
#
#       Facility to use for syslog()/logger (alternative to log/debugfile)
#
logfacility     local0
#
#       Tell what machines are in the cluster
#       node    nodename ...    -- must match uname -n
node    BHLabs1
node    BHLabs6

and on the second server:

#
#       keepalive: how many seconds between heartbeats
#
keepalive 2
#
#       deadtime: seconds-to-declare-host-dead
#
deadtime 10
#
#       What UDP port to use for udp or ppp-udp communication?
#
udpport        694
bcast  eth0
mcast eth0 225.0.0.1 694 1 0
ucast eth0 192.168.11.30
#       What interfaces to heartbeat over?
udp     eth0
#
#       Facility to use for syslog()/logger (alternative to log/debugfile)
#
logfacility     local0
#
#       Tell what machines are in the cluster
#       node    nodename ...    -- must match uname -n
node    BHLabs1
node    BHLabs6
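Because the node names must match uname -n, a quick check on each load balancer can save some head scratching. The following is only a sketch; node_listed is a hypothetical name and it does nothing beyond comparing the local hostname with the node lines in ha.cf.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given name (default: uname -n)
# appears on a "node" line of the Heartbeat config file.
# Usage: node_listed [FILE] [NAME]
node_listed() {
  cf="${1:-/etc/ha.d/ha.cf}"
  name="${2:-$(uname -n)}"
  awk -v n="$name" '
    $1 == "node" { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
    END { exit !found }
  ' "$cf"
}

# Example: node_listed || echo "hostname is missing from ha.cf"
```

The comparison is case sensitive, so a hostname reported in a different case than the one written in ha.cf will be flagged.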

Testing

Restart all the services. On the web servers run:

# apache2ctl restart

and on the load balancers:

# /etc/init.d/heartbeat start
# /etc/init.d/haproxy start

You should then be able to hit http://192.168.11.40 and see your test webpage!
You should also have a page full of stats from HAProxy to please the eyes at http://192.168.11.40/haproxy?stats. The stats page is served by the same listener as the site, on port 80, and asks for the admin:password credentials from the config. It can be turned off by removing the following 2 lines from haproxy.cfg, which you should do in a production environment:

 stats enable
 stats auth admin:password

Try the following:

  • Kill the heartbeat service on each load balancer in turn while running “watch ifconfig”. You should see the IP address move from server to server.
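  • Pull the plug on either of the load balancers while running a ping from your PC to 192.168.11.40. The ping should keep working once the surviving load balancer has taken over the address; with deadtime 10, expect a short gap while the failure is detected.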
  • Shut down Apache on any of the web servers. It should go red in the stats page after three failed checks (roughly 6 seconds with inter 2000 fall 3) and be removed from the load balanced group until it comes back up.
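To watch the round robin in action while you try these, it helps to tally which web server answered each request. The sketch below assumes each web server's index page has been given distinct content (for example its hostname) and that curl is installed on your PC; tally_responses is a made-up name.

```shell
#!/bin/sh
# Hypothetical helper: read one response per line on stdin and print
# how many times each distinct response was seen, most frequent first.
tally_responses() {
  sort | uniq -c | sort -rn
}

# Example (needs the cluster to be reachable):
#   for i in 1 2 3 4 5 6 7 8; do curl -s http://192.168.11.40/; done | tally_responses
```

With all four web servers up the counts should be roughly equal; after shutting one down, its line should disappear from the tally once HAProxy marks it as failed.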

Caveats and best practices

This setup is lacking in some important best practices:

  1. The heartbeat should run on its own network, usually a VLAN dedicated to it.
  2. The web servers should sit on a different network to the frontend on the load balancers. The load balancers would then have a second interface out to that network to connect to the web servers.

Later I will follow this up with a way to add MySQL to this configuration.
