The essential qualities of software architectures can be lost in a veritable soup of acronyms, cloud provider jargon, and over-engineering. I have recently been working up a simple architecture and routing scheme using HAProxy that is both flexible and scalable.
HAProxy feels like a Swiss army knife of web networking. It can do load balancing, TLS termination, rate limiting, caching, and circuit-breaking (among a laundry list of other functionality). In many cases where I would previously have used Nginx, I now reach for HAProxy instead (except, of course, for serving static files). I have used it in a few different projects and each time I dig into accomplishing some new feat I come away more impressed with it. It seems to be thoughtfully designed and very robust. Here I outline a few different uses I have had for it and some thoughts on the proposed architectures.
I have gone through a configuration in the past to reverse proxy Python with Nginx, and this is essentially the same idea: an application might be slow, so use multiple instances to serve requests. Imagine each instance of the application listens on a different port, say 8080, 8081, and 8082, while HAProxy serves internet-facing requests by listening on port 80 for HTTP. Below is the necessary configuration for HAProxy, from /etc/haproxy/haproxy.cfg.
global
    daemon
    maxconn 256

defaults
    mode http

frontend http-in
    bind *:80
    default_backend servers

backend servers
    balance roundrobin
    server app1 127.0.0.1:8080
    server app2 127.0.0.1:8081
    server app3 127.0.0.1:8082
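Two small additions I usually make, neither of which is in the configuration above: explicit timeouts in the defaults section (HAProxy warns at startup when they are missing) and active health checks on the servers so a dead instance is taken out of rotation. A sketch of both, with arbitrary starting values for the timeouts:

defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

backend servers
    balance roundrobin
    # "check" makes HAProxy probe each instance and stop sending it traffic if it goes down
    server app1 127.0.0.1:8080 check
    server app2 127.0.0.1:8081 check
    server app3 127.0.0.1:8082 check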
This is almost the sweet spot for me in terms of complexity. The one addition I think is worth making is the use of network namespaces. systemd services allow for settings like PrivateNetwork= to isolate a process from other processes and network interfaces; alternatively, you can create your own namespace with ip netns add my-new-namespace. HAProxy can flow network traffic directly into such a namespace, without the need for creating virtual interfaces or anything too arcane.
As an example, here I create a network namespace and launch a process inside of it. The server will not be reachable outside the new namespace and the isolated process will not be able to access the network.
# ip netns add my-new-namespace
# ip netns exec my-new-namespace ip link set dev lo up   # turn on the loopback device inside the namespace
# systemd-run -p PrivateNetwork=yes \
      -p NetworkNamespacePath=/var/run/netns/my-new-namespace \
      python -m http.server --directory /var/www
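To sanity-check the isolation (this assumes curl is installed on the host), the server should be unreachable from the default namespace but reachable from inside the new one; python -m http.server listens on port 8000 by default:

# curl --max-time 2 http://127.0.0.1:8000                     # fails: the server is invisible outside the namespace
# ip netns exec my-new-namespace curl http://127.0.0.1:8000   # succeeds: this command runs inside the namespace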
Configuring HAProxy to direct traffic into this namespace without spoiling the isolation is a cinch. If the above process is running on port 8000, a complete configuration looks like this:
global
    daemon
    maxconn 256

defaults
    mode http

frontend http-in
    bind *:80
    default_backend servers

backend servers
    server directory-server 127.0.0.1:8000 namespace my-new-namespace
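After editing haproxy.cfg it is worth validating the file and reloading the service before trusting it (this assumes HAProxy is managed by systemd as haproxy.service, as it is on most distributions):

# haproxy -c -f /etc/haproxy/haproxy.cfg    # check the configuration for errors
# systemctl reload haproxy                  # pick up the new configuration without dropping connections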
The next configuration adds a second, worker machine (Machine B) behind the public-facing proxy (Machine A), reachable over a private network. I tried this design and it was maybe an hour before it started to chafe; I wouldn't recommend this configuration. If you can, stick with one machine and keep things simple. If you absolutely need more than one machine then it is worth going for the maximally complex example given below. The issue I have with this organization is how limited your ability to network the worker machine (Machine B) is. Network namespacing is a huge boon to both the security and flexibility of applications, and configuring the worker to parity with Machine A is a real pain. Suddenly you are faced with port conflicts and shared interfaces; by adding a second machine you take on a lot more complexity. It also presents the problem of making more changes on Machine A to accommodate updates to Machine B; I would prefer to keep as much configuration local as possible.
I include this here because it is an obvious gradual step once I had a private VPN tunnel and database replication set up. If you imagine app3 and app4 on Machine B are running on ports 8080 and 8081, then the following configuration describes the setup. Purely for example purposes I dispatch to the appropriate "app" based on the requested path:
global
    daemon
    maxconn 256

defaults
    mode http

frontend http-in
    bind *:80
    acl is_app3 path_beg /app3
    acl is_app4 path_beg /app4
    use_backend app3 if is_app3
    use_backend app4 if is_app4
    default_backend default

backend app3
    server app3 10.0.0.2:8080

backend app4
    server app4 10.0.0.2:8081

backend default
    server default 127.0.0.1:8000 namespace my-new-namespace
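Note that the matched prefix is passed through to the backend unchanged, so app3 has to expect paths beginning with /app3 (recent HAProxy versions can strip it with http-request replace-path if that is a problem). Testing the dispatch from the public side looks something like this, where example.com stands in for Machine A and /status is just a placeholder path:

$ curl http://example.com/app3/status      # routed to app3 on Machine B, port 8080
$ curl http://example.com/app4/status      # routed to app4 on Machine B, port 8081
$ curl http://example.com/anything-else    # falls through to the namespaced default backend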
Finally there is the most complex configuration that I can imagine being broadly useful. While I tend to think scaling a single server vertically makes more sense than adding more machines, there are certainly cases where multiple machines are necessary. There is also something appealing about this architecture for how regular it is (probably not a convincing technical argument!).
The key addition in this design is to also use HAProxy on the machines behind the public-facing load balancer. By layering the load balancers/proxy servers you gain more flexibility in how the workers serve requests, partition networks, and segment traffic. The ability to route traffic into a namespace is limited to namespaces on the same host as the HAProxy server. This would limit the feature's usefulness if it weren't so easy to insert another proxy per machine. Here I route any request for, e.g., example.com/remote/some/resource to Machine B, which does whatever routing is necessary (including its own namespacing here).
# Machine B haproxy.cfg
global
    daemon
    maxconn 256

defaults
    mode http

frontend http-in
    bind 10.0.0.2:80
    default_backend default

backend default
    server default 127.0.0.1:8000 namespace machine-B-namespace
# Machine A haproxy.cfg
global
    daemon
    maxconn 256

defaults
    mode http

frontend http-in
    bind *:80
    acl is_remote path_beg /remote
    use_backend remote if is_remote
    default_backend default

backend remote
    server remote 10.0.0.2:80

backend default
    server default 127.0.0.1:8000 namespace my-new-namespace
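One refinement worth considering that is not in the configuration above: with two proxies in the path, Machine B only ever sees Machine A's address as the client. HAProxy's option forwardfor adds an X-Forwarded-For header carrying the original client IP; it can sit in the defaults section on Machine A (and on Machine B too, if the application behind it wants the header):

defaults
    mode http
    # pass the original client address along to the next hop
    option forwardfor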
As a more concrete example, consider the following scenario. You have two classes of customers accessing a web application, "typical" and "deluxe", where "deluxe" pays you more money for faster processing or for GDPR compliance or … whatever you can imagine. Users log in and should have their requests serviced by the appropriate server resource.
Rather than draw out an admittedly contrived example by throwing a full-blown authentication server like Keycloak at the problem, I am going to use HAProxy's built-in basic authentication capabilities and trust you believe me when I say the idea is the same. Users log in and the proxy server picks out some piece of information; here it is the basic authentication username, but it could be a JWT scope value or, really, anything that cannot be forged by the user. Requests are routed to different servers based on that value.
global
    daemon
    maxconn 256

defaults
    mode http

frontend http-in
    bind *:80
    http-request auth unless { http_auth(mycredentials) }
    use_backend %[http_auth_group(mycredentials),map(/etc/haproxy/maps/deluxe.map,default)]

backend deluxe
    server remote 10.0.0.2:80

backend default
    server default 127.0.0.1:8000

userlist mycredentials
    user vip insecure-password ipaygoodmoneyforthis
    user john insecure-password somePassword
The auth unless bit ensures that users are presented with an HTTP basic authentication prompt unless they have already logged in. Once logged in, http_auth_group(mycredentials) will contain the username. The call to map(…) will search the provided map file for a corresponding value. A map file might look like this:
vip deluxe
If the user making the request happens to be the user "vip" then the request will route to the backend called "deluxe", which happens to be at 10.0.0.2 here. Any other user will get the "default" backend. Lest you think this approach is too brittle or static, map files can be updated at runtime. This approach also works surprisingly well for legacy applications. Maybe the application doesn't support multi-tenancy, or runs with a dedicated database configured per instance; just direct traffic based on a map lookup (keyed on the login, or the client IP address, or a subdomain, or …) and move on with your life.
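Updating a map at runtime goes through HAProxy's runtime API. The sketch below assumes a stats socket has been enabled in the global section (for example stats socket /var/run/haproxy.sock mode 600 level admin, which is not part of the configuration above) and that socat is installed:

# add "john" to the deluxe tier without reloading HAProxy
echo "add map /etc/haproxy/maps/deluxe.map john deluxe" | socat stdio /var/run/haproxy.sock

# inspect the live contents of the map
echo "show map /etc/haproxy/maps/deluxe.map" | socat stdio /var/run/haproxy.sock

Runtime changes are not written back to the map file on disk, so anything meant to survive a restart needs to be added to the file as well.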
One thing I really like about this approach is that it works great on a single machine. You could easily create multiple network namespaces on a single machine and route into them based on user. You could launch a second instance of a service and give it a higher CPU quota, or preferential IO capabilities. Multiple machines are not necessary for granular controls!
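As a sketch of that idea, assuming two namespaces named deluxe-ns and default-ns (hypothetical names, created the same way as my-new-namespace above), each running its own copy of the application on port 8000, the backends from the previous example collapse onto one machine:

backend deluxe
    server deluxe-app 127.0.0.1:8000 namespace deluxe-ns

backend default
    server default-app 127.0.0.1:8000 namespace default-ns

The deluxe instance could then be started with something like systemd-run -p CPUQuota=200% … to give it preferential CPU treatment, no second machine involved.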
There are seemingly endless possibilities for introspecting traffic and routing it with HAProxy. It is fun to read about implementing rate limits for API access but I also think, for most uses, applications never reach such a level of complexity. What if instead of endlessly workshopping operations and architectures we wrote great software? How much better would things be if we wrote some tests or profiled our code instead of rushing to implement whatever we heard about at the most recent conference?
I can imagine scenarios where this architecture would not work, but they are not exactly problems with the design. They are the sort of problems you would love to have: "Oh no! We have millions of paying customers who are using the product all the time". I won't speculate about how you might solve such a problem because it is entirely dependent on the specifics of a given scenario and application; there isn't a one-size-fits-all solution. I will say the "stack load balancers on top of each other" approach works surprisingly well: move applications or the database off the primary load balancer and onto a worker machine behind it and you buy yourself more headroom. I might read that Litestream is getting live read replication and think "that sounds awesome!". But then, doesn't it seem silly to rush for a solution to an unstated problem? Maybe database reads are never the problem; better to check first. Maybe I never outgrow a single machine.
One thing I still need practice with before I can more forcefully argue for an architecture like this is managing zero-downtime scenarios and blue-green deployments. Both of these are well-trodden ground with HAProxy so it isn't as though there is much to invent, just more close reading of the documentation.