Load Balancing FAQ
Generally, load
balancing is any method for evenly distributing processing or
service requests across devices in a network. We are talking about
server and network load balancing here.
Preface:
This page is heavily borrowed from other sites including LVS / LVSKB /
HAProxy / Loadbalancer.org (our main site) etc. I aim to give it more original content
and structure ASAP... honest :-). Mainly created because our primary
site doesn't get anywhere in the search results for 'load balancing',
which we think it should!
We will try to give a lot of links to
external sites and also our competitors so that this is not a
complete waste of your precious browsing time.
One of the common
problems with IT is the horrendous abuse of terminology by marketing
types using terms like ADC, ADN... and on and on...
We'd like to use this page to clear up a few terms using our own
perspective from 8 years as a load balancing appliance vendor.
If you think anything is missing (no really?), drop us an email (support@loadbalancer.org).
Layer-2 Load Balancing (bonding)
Layer-4 Load Balancing
Layer-7 Load Balancing (reverse proxy)
SSL Termination
Hardware SSL acceleration or offload
Persistence / Sticky / Affinity
Server health checking
DNS Load Balancing
Link Load Balancing
Load Balancing Optimization / Compression
WAN Load Balancing Optimization / Compression
Database Load Balancing
SIP Load Balancing
Computing Load Balancing
Free BSD stuff CARP, PF and hoststated
Load Balancing Appliance vendors
Layer-2 Load Balancing (bonding)
Layer-2 load balancing, also known as link aggregation, port
aggregation, EtherChannel or Gigabit EtherChannel port bundling, bonds
two or more links into a single, higher-bandwidth logical link. Aggregated
links also provide redundancy and fault tolerance if each of the
aggregated links follows a different physical path. Link aggregation
may be used to improve access to public networks by aggregating modem
links or digital lines. Link aggregation may also be used in the
enterprise network to build multi-gigabit backbone links between gigabit
ethernet switches. See also NIC teaming or Link Aggregation Control
Protocol (LACP).
The Linux kernel has the Linux bonding driver, which can aggregate
multiple links for higher throughput or fault tolerance.
Our Opinion: The Linux bonding driver works really well
in master/slave (active-backup) mode without any changes to your
infrastructure. If you have a trunk configured on your switches then
you can use full 802.3ad LACP.
Layer-4 Load Balancing
Layer-4 load balancing distributes requests to the servers at the
transport layer, using protocols such as TCP, UDP and SCTP. The load
balancer distributes network connections from clients, who know only a
single IP address for the service, to a set of servers that actually
perform the work. Because a connection-oriented transport must establish
the connection between client and server before any request content is
sent, the load balancer usually selects a server without looking at the
content of the request.
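To make the idea concrete, here is a minimal sketch in Python of connection-level dispatch. It is only an illustration, not how IPVS is implemented: the class name Layer4Balancer, the round-robin scheduler and the in-memory connection table are all our own assumptions. The point is that the server is chosen from the client address and port alone, never from the request content.

```python
import itertools


class Layer4Balancer:
    """Toy connection-level dispatcher (hypothetical, for illustration only)."""

    def __init__(self, servers):
        # Round-robin is just one possible scheduler; assumed here for simplicity.
        self._next_server = itertools.cycle(servers)
        # Connection table: (client_ip, client_port) -> chosen server.
        self._connections = {}

    def route(self, client_ip, client_port):
        """Return the server for this connection, choosing one on first sight."""
        key = (client_ip, client_port)
        if key not in self._connections:
            # New connection: pick a server without inspecting any payload.
            self._connections[key] = next(self._next_server)
        return self._connections[key]

    def close(self, client_ip, client_port):
        """Forget a finished connection so the table does not grow forever."""
        self._connections.pop((client_ip, client_port), None)
```

Every packet belonging to an existing connection goes back to the same server, while each new connection simply takes the next server in the rotation.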
IPVS / LVS is an implementation of layer-4 load balancing for the Linux
kernel, and has been ported to FreeBSD recently. Loadbalancer.org, Kemp
Technologies & Barracuda et al. use IPVS extensively in their hardware
load balancers.
Layer-4 load balancing can also be used to balance traffic across
multiple Internet access links, in order to increase Internet access
speed. See DSL load balancing for more
information. SmoothWall, FatPipe, Xrio et al. provide appliances to do
this.
Our Opinion: IPVS (aka LVS) is awesome: a fast, reliable open source
load balancing solution that combines best with Linux-HA (Heartbeat),
Keepalived or Ultramonkey / Ldirectord.
Layer-7 Load Balancing (reverse proxy)
Layer-7 load balancing, also known as application-level load
balancing, parses requests at the application layer and distributes
them to servers based on the type of content requested. This lets it
meet different quality of service requirements for different types of
content and improve overall cluster performance. The overhead of parsing
requests at the application layer is high, so its scalability is limited
compared to layer-4 load balancing.
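As a rough sketch of what "parsing the request" means, the Python below reads the HTTP request line and picks a server pool from the URL path. The pool names, the ROUTES table and the function names are hypothetical and only there for illustration; a real reverse proxy does far more (headers, cookies, keep-alive and so on).

```python
# Hypothetical routing table: path prefix -> name of a server pool.
ROUTES = [
    ("/images/", "static_pool"),
    ("/api/", "app_pool"),
]
DEFAULT_POOL = "web_pool"


def parse_request_path(raw_request: bytes) -> str:
    """Extract the path from the first line of an HTTP request."""
    request_line = raw_request.split(b"\r\n", 1)[0].decode("ascii")
    _method, path, _version = request_line.split(" ", 2)
    return path


def choose_pool(raw_request: bytes) -> str:
    """Pick a pool by looking inside the request, which layer 4 never does."""
    path = parse_request_path(raw_request)
    for prefix, pool in ROUTES:
        if path.startswith(prefix):
            return pool
    return DEFAULT_POOL
```

Note that the balancer has to accept the connection and read the request before it can choose a server, which is where the extra overhead comes from.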
KTCPVS
is an implementation of layer-7 load balancing for the Linux kernel.
With the appropriate modules, the Apache, Lighttpd and nginx web
servers can also provide layer-7 load balancing as a reverse proxy.
Lots of commercial vendors use Layer 7 load balancing for cookie
insertion etc. Barracuda do cookie insertion OK... Loadbalancer.org and
Kemp do a nice extra, which is Terminal Server RDP cookies... BUT for
real flexibility F5 and Citrix NetScaler dominate the Layer 7 load
balancing market. F5 like to call it ADC (Application Delivery
Controller) or ADN (Application Delivery Network)... we prefer the
honest term of proxy or reverse proxy, but that's not so sexy is it?
Our Opinion: KTCPVS doesn't seem as mature as HAProxy, and it looks
like the best features (kernel splicing etc.) are being integrated into
HAProxy as well. Exceliance and Loadbalancer.org are working with the
community to ensure RDP cookies, source IP persistence and keepalive are
integrated into the open source HAProxy solution so that it can give the
big boys a run for their money.
UPDATE: Hey, it's all finished and it's juicylicious in HAProxy 1.4.2...
SSL Termination
SSL Termination is the ability of a load balancer to establish a
secure tunnel with the client, in most cases replacing the requirement
for the web server to perform SSL. In order for the load balancer to
perform this function it must be configured with an SSL certificate,
either self generated or signed by a certificate authority.
SSL termination is often required for any Layer 7 trickery such as
cookie insertion, because otherwise the load balancer can't read the
encrypted payload of the packets. Layer 4 load balancing doesn't need to
read the packet contents and therefore doesn't require SSL Termination.
Our Opinion: SSL Termination puts a heavy processing
load on your load balancing appliance. Why not spread the SSL
termination load across your cluster for better scalability? Obviously
you have to use it if you want to use Layer 7 functionality on SSL
traffic. BTW: A basic Celeron CPU can do 700 TPS these days.
"Concerning the CPU intensive tasks (compression, SSL, ...),
I find it very important to explain that once the device is
saturated, it's the end and you will never scale anymore. Also,
explaining that a $100k device can see its performance divided
by 10 or 100 just to save some configuration on backend servers
is stupid." - Willy Tarreau (Author of HAProxy)
Hardware SSL acceleration or offload
Hardware SSL acceleration or offload means that a dedicated hardware
chipset handles the CPU-intensive work of SSL termination. Modern
hardware acceleration cards can handle 10,000+ TPS (terminations per
second).
Our Opinion: A term commonly abused by vendors, so check the TPS
rating! It's not as important as it used to be, as a quad core CPU can
do thousands of TPS (which is a lot). Also, are you sure you really want
to do all this on the load balancer? Why not use the cluster for it
instead?
Question: Does anyone still sell decent PCI-E hardware SSL accelerator cards?
Persistence / Sticky / Affinity
Persistence is a feature that is required by many web applications.
Once a user has interacted with a particular server all subsequent
requests are sent to the same server thus persisting to that particular
server. It is normally required when the session state is stored
locally to the web server as opposed to a database.
Source IP based persistence: This is a simple and fast method, and these
days you are very unlikely to get a client IP change during a
session, so it is perfectly safe to use. The old FUD about super or mega
proxies changing the client IP address is old news and hasn't been an
issue on the Internet for years. However, if you have several large
offices accessing a load balanced service through a large proxy (e.g. an
office with 1,000 users represented by a single public IP address), you
will get very uneven load balancing.
Cookie based:
Layer 7 devices can take advantage of setting a load balancer cookie
with the persistence information. This works well as long as the
clients accept cookies, but obviously can only work with HTTP traffic.
Some vendors (Loadbalancer.org, Kemp, F5 & Citrix) can also do RDP cookies.
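Both methods can be sketched in a few lines of Python. This is an illustration under our own assumptions, not any vendor's implementation: the server names, the SERVERID cookie name and the use of a hash (rather than a persistence table with timeouts) are all made up for the example.

```python
import hashlib

SERVERS = ["web1", "web2", "web3"]  # hypothetical real servers
COOKIE_NAME = "SERVERID"            # hypothetical persistence cookie name


def pick_by_source_ip(client_ip, servers=SERVERS):
    """Map a client IP to a server; the same IP always lands on the same server."""
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    index = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[index]


def pick_by_cookie(cookies, client_ip, servers=SERVERS):
    """Return (server, cookie_to_set).

    If the client already carries a valid persistence cookie, honour it and
    set nothing. Otherwise choose a server and ask for the cookie to be set.
    """
    server = cookies.get(COOKIE_NAME)
    if server in servers:
        return server, None
    server = pick_by_source_ip(client_ip, servers)
    return server, (COOKIE_NAME, server)
```

The source IP version also shows the big proxy problem: 1,000 users behind one public IP address all hash to the same server, whereas the cookie version can tell them apart once each browser holds its own cookie.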
But what you meant to say was 'Will the
load balancer magically fix a poorly designed application?'
No. If your application does not have a persistent backend storage
device (a database) then you only get increased performance; failover
will lose the session.
"THIS IS THE SAME FOR ALL LOAD
BALANCERS - NO PERSISTENCE IN THE APPLICATION = NO SESSION FAILOVER"
- Malcolm Turnbull - (Founder of Loadbalancer.org)
Our Opinion: There is nothing wrong with source IP
persistence. If you want cookies then use cookies, that's fine... HAProxy
now has RDP cookie capability, which is kewel...
Server health checking
Server health checking is the ability of the load balancer to run a
test against the servers to determine if they are providing service.
Ping: This is the simplest method,
however it is not very reliable as the server can be up whilst the web
service is down. Also, ICMP pings are often blocked by firewalls.
TCP connect:
This is a more sophisticated method which can check if a service is up
and running, such as a web service on port 80, i.e. try to open a
telnet connection to that port on the real server.
HTTP GET HEADER:
This will make a HTTP GET request to the web server and typically check
for a header response such as 200 OK.
HTTP GET CONTENTS:
(negotiate or regex) This will make a HTTP GET and check the actual
content body for a correct response. Can be useful to check a dynamic
web page that returns 'OK' only if some application health checks work
i.e. backend database query validates.
Custom check:
Custom health checks are often used for other services (FTP / RADIUS /
SIP etc.) and you can often design your own checks.
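The TCP connect and HTTP GET checks above are easy to sketch with the Python standard library. The function names, the 2 second timeout and the 'OK' marker text are our own choices for illustration; real load balancers add retries, check intervals and so on.

```python
import http.client
import socket
import urllib.error
import urllib.request


def tcp_check(host, port, timeout=2.0):
    """TCP connect check: healthy if the port accepts a connection."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def http_check(url, expected_text=None, timeout=2.0):
    """HTTP GET check.

    With expected_text=None this is the header style check (200 OK only).
    With expected_text set it is the contents style check: the body must
    also contain that text, e.g. 'OK' printed by an application test page.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            if response.status != 200:
                return False
            if expected_text is None:
                return True
            body = response.read().decode("utf-8", errors="replace")
            return expected_text in body
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, non-2xx status, malformed reply or bad URL.
        return False
```

For example, http_check("http://192.0.2.10/check.php", expected_text="OK") would only pass if that (made up) page both answers 200 and prints OK.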
Our Opinion: These kinds of
health checks are all fairly standard from load balancer vendors...
DNS Load Balancing
DNS load balancing distributes requests to different servers by
resolving the domain name to different server IP addresses. When a DNS
request comes to the DNS server to resolve the domain name, it gives out
one of the server IP addresses based on a scheduling strategy, such as
simple round-robin scheduling or geographical scheduling. This redirects
the request to one of the servers in a server group. Once the domain has
been resolved to one of the servers, subsequent requests from clients
using the same local caching DNS server are sent to that same server
until the specified time-to-live expires.
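Here is a small Python model of that behaviour, assuming simple round-robin scheduling. The class names, the 60 second TTL and the example addresses are invented for the sketch; it does not speak the real DNS protocol.

```python
import itertools
import time


class RoundRobinDNS:
    """Toy authoritative server: hands out the next address on every query."""

    def __init__(self, records, ttl=60):
        self._cycles = {name: itertools.cycle(addrs) for name, addrs in records.items()}
        self.ttl = ttl

    def resolve(self, name):
        return next(self._cycles[name]), self.ttl


class CachingResolver:
    """Toy local caching DNS server shared by a group of clients."""

    def __init__(self, upstream, clock=time.monotonic):
        self._upstream = upstream
        self._clock = clock
        self._cache = {}  # name -> (address, expiry time)

    def resolve(self, name):
        now = self._clock()
        cached = self._cache.get(name)
        if cached is not None and cached[1] > now:
            # Still within the TTL: every client gets the same server.
            return cached[0]
        address, ttl = self._upstream.resolve(name)
        self._cache[name] = (address, now + ttl)
        return address
```

All clients behind one CachingResolver stick to one server until the TTL runs out, which is why a short TTL gives a more even spread at the cost of more DNS queries.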
More information is on the DNS Load Balancing page. PowerDNS link
here?
Our Opinion: Much maligned, this can be a great way to properly load
balance a site... What do you think Google uses!? OK, they probably do a
bit of GSLB and health checking as well, but at the end of the day I'm
sure they use a GSLB based DNS solution with an LVS backend... but who
am I to guess?
Check out UltraDNS.com and AutoFailover.com...
Link Load Balancing
Link load balancing balances traffic among multiple links, from
different ISPs or from one ISP, for better scalability and availability
of Internet connectivity, and also for cost savings.
See Link Load Balancing for more
information. SmoothWall, FatPipe, Xrio et al provide appliances to do
this.
Our Opinion: Lots of people want this... and lots of people say that
it doesn't work very well... and lots of vendors say it's a nightmare to
try and support... We think the core problem is that it only really
works if your ISP supports the link balancing technology.
Load Balancing Optimization / Compression
Several vendors enable inline compression for optimization. As far as
load balancing appliances are concerned this means HTTP gzip
compression, which all web servers do anyway, so it's a bit daft...
F5 has a hardware card for this, but it costs a fortune.
Our Opinion: Never heard of anything so stupid. This also relates to the quote from Willy about CPU intensive tasks.
WAN Load Balancing Optimization / Compression
Several vendors enable inline compression for optimization with or
without link balancing, BlueCoat
and Riverbed come to mind....
mainly for corporate networks to save on bandwidth costs.
Our Opinion: Easy to fix at the application layer to be
honest, but a valuable tool in the corporate cost cutting department.
Database Load Balancing
Database load balancing balances database access requests among a
cluster of database servers, in order to achieve database scalability
and high availability.
Our Opinion: Really easy to do this with read only replicas... kind of
tricky for writeable databases (requires Oracle RAC / MySQL Cluster or
similar), or middleware like conexant or a MySQL load balancing proxy.
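The "easy" read only replica case can be sketched as a read/write splitter. This is a deliberately naive Python illustration: the class name and server names are hypothetical, and deciding by the first SQL keyword is our own simplification (it ignores transactions, SELECT ... FOR UPDATE and replication lag).

```python
import itertools


class ReadWriteSplitter:
    """Send plain SELECTs to replicas in rotation, everything else to the primary."""

    def __init__(self, primary, replicas):
        self._primary = primary
        self._replicas = itertools.cycle(replicas)

    def route(self, sql):
        words = sql.split()
        # Naive rule: only a statement starting with SELECT counts as a read.
        if words and words[0].upper() == "SELECT":
            return next(self._replicas)
        return self._primary
```

Reads scale out by adding replicas, but every write still lands on the single primary, which is why writeable databases need the heavier clustering options.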
SIP Load Balancing
SIP load balancing balances SIP related services, in order to achieve
performance scalability and high availability of those services.
Our Opinion: Pretty easy at Layer 4 as long as the servers support a
backend persistent data store... if they don't, you may need SIP caller
ID packet inspection from an F5, Zeus or Citrix type appliance.
Update: We've just been working with Horms to get SIP caller ID packet
inspection for LVS into the Linux kernel... ClusterScale appliances now
support SIP load balancing for the telco market with NEBS AC & DC options.
Other options: OpenSIPS is more than a SIP proxy/router, as it includes application-level functionality.
Computing Load Balancing
Computing load balancing splits a computational task across
many different nodes in the cluster, so that the whole cluster system
can provide increased performance. This kind of cluster system is also
known as a high-performance cluster (HPC), which is most commonly used
in scientific computing.
Beowulf type HPCs etc... there's loads of stuff in Linux Magazine about
this kind of specialist area...
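The split, compute and combine pattern looks roughly like this in Python. Threads stand in for cluster nodes here purely as an assumption to keep the sketch self-contained; a real HPC cluster would use something like MPI across machines, and the function names are ours.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Cut a list into n_chunks nearly equal, contiguous pieces."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def sum_of_squares(chunk):
    """The piece of work each 'node' does on its own share of the data."""
    return sum(x * x for x in chunk)


def cluster_sum_of_squares(numbers, n_nodes=4):
    """Split the task, run the pieces in parallel, then combine the results."""
    chunks = split_into_chunks(numbers, n_nodes)
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        return sum(pool.map(sum_of_squares, chunks))
```

The load balancing part is the chunking: if one node gets a much bigger share than the others, the whole job waits for it.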
Free BSD stuff CARP, PF and hoststated
I don't know an awful lot about the FreeBSD side of things so here are some links:
PF: Address Pools and Load Balancing
Redundant firewalls with OpenBSD, CARP and pfsync
Hoststated
Our Opinion: What is FreeBSD? Isn't that what F5 use? :-).
Load Balancing Appliance vendors
A10 Networks
F5
Citrix
Cisco
Brocade (aka. Foundry)
Zeus (UK)
ClusterScale
Loadbalancer.org (UK, USA & Canada)
Kemp
CoyotePoint
Barracuda
CAI
Radware (Alteon)
Exceliance (France)