HAProxy will be the first web server the HTTP client talks to.
It will handle TLS/SSL termination, decide whether the request is accepted or rejected, and finally pass the request to the upstream server or servers.
It is acting as a proxy, with load-balancing and fault-tolerance capabilities.
A frontend is an entry point. HAProxy listens on the TCP or HTTP layer, on one or more `IP:PORT` pairs, with options regarding logs, timeouts, supported protocols and much more.
Many frontends can coexist in an HAProxy instance, to accommodate different use cases. For example: a traditional website in one, and an API for third parties in another, each with its own settings and configuration.
HAProxy is able to parse and process the full TCP or HTTP request. It exposes an internal API to change the request or the response, and to decide whether and how to deliver them upstream, in a very optimized and reliable manner.
A backend is an exit point. There must be one for each group of final web servers. If there is only one server in a backend, it will handle every request; if there are several, HAProxy will balance the requests according to a load-balancing algorithm.
That's where load-balancing comes in. And when we add some logic to deal with misbehaving or missing servers, it becomes fault-tolerant.
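As a rough sketch, with made-up names, addresses and certificate path, a minimal frontend/backend pair could look like this:

```
# Hypothetical minimal setup: one entry point, one group of final servers.
frontend fe_main
    bind :80
    bind :443 ssl crt /etc/haproxy/certs/example.com.pem
    mode http
    default_backend be_app

backend be_app
    mode http
    balance roundrobin
    server app1 192.0.2.11:80 check
    server app2 192.0.2.12:80 check
```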
Varnish is going to cache the result of some requests. Later on, a client making the same request might get the result from the cache, if it is present and fresh enough.
With its default configuration, Varnish already has a lot of good practice in place, and just a few adjustments are usually necessary to handle most situations.
Every step of the request/response processing is done by functions that can be customized by something that looks a bit like inheritance in programming. At startup, everything is validated and compiled into optimized code.
Like HAProxy, Varnish parses the whole request/response inside these functions, so we can decide whether the request/response needs to be modified (like adding or removing headers), whether a response can be served from the cache, whether Varnish needs to talk to the final server, and whether the response can be stored in the cache for future use.
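As an illustration only (the URL pattern and header below are invented, and this is not the exact configuration of our setup), a tiny fragment of Varnish's configuration language (VCL) overriding two of these functions looks like this. Whatever our code does not decide falls through to the built-in VCL, which is the inheritance-like behaviour mentioned above:

```
vcl 4.1;

# Hypothetical final server; in our setup this points back to HAProxy.
backend default {
    .host = "127.0.0.1";
    .port = "8081";
}

sub vcl_recv {
    # Never cache the (made-up) /admin area: go straight to the backend.
    if (req.url ~ "^/admin") {
        return (pass);
    }
    # No return here, so the built-in vcl_recv still runs afterwards.
}

sub vcl_backend_response {
    # Drop a header before the response may be stored in the cache.
    unset beresp.http.X-Powered-By;
}
```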
Varnish stores its content, the objects, in memory to be really fast. If you have a lot of traffic, give it enough RAM to keep a lot of content available.
In our case, after accepting the request from the HTTP client, HAProxy will pass it to Varnish, via a dedicated backend.
We have chosen to place HAProxy and Varnish on the same server, and to connect them with unix sockets, but that is an implementation detail, and we could have placed them on separate servers, communicating through regular TCP sockets.
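As a sketch, with an assumed socket path, the HAProxy backend handing requests to the local Varnish can be as simple as:

```
# Hypothetical backend passing every request to Varnish on the same machine.
backend be_varnish
    mode http
    server varnish unix@/run/varnish/varnish.sock check
```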
Once the request is passed to Varnish, it will parse it too and decide whether it can serve a response from the cache or must pass it to the final server.
In the most common Varnish setup, the request is passed directly to the final server. Varnish is even capable of some rather basic load-balancing. But since we have HAProxy at hand, and it is more capable in this area, we have decided to pass the request back to HAProxy.
To prevent loops, and to bypass the processing that has already been done, the request re-enters HAProxy through another frontend. This one is much simpler, and its responsibility is basically to choose which backend to use for each request. If we manage sites and apps on more than one set of servers, we have to create as many backends as we have sets of servers.
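A sketch of that second, simpler frontend (socket path and backend name are placeholders) could be:

```
# Hypothetical re-entry point, used only by the local Varnish, never exposed publicly.
frontend fe_from_varnish
    bind unix@/run/haproxy/from-varnish.sock
    mode http
    # One use_backend rule per set of final servers (see the ACL example below).
    default_backend be_default_site
```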
In the end, the request is passed to a server in a backend, be it a static web site, a dynamic application programmed with a framework, or the ingress of a big Kubernetes cluster.
The response will eventually travel the same way backwards: through HAProxy, then Varnish (which will decide whether or not to store it in the cache), then HAProxy again, back to the original HTTP client.
Even if we have only one frontend to manage all the requests with the same settings, we can have many groups of web servers, for different web apps or websites. In our case, we have many clients on one HAProxy, with one frontend but many backends, one for each client and their own servers.
HAProxy does not have a concept of VirtualHost as Apache or Nginx do, but we can use some lower-level conditionals to achieve a similar result.
We can write (at least) an ACL to detect whether the Host header belongs to one website or another, and then use this ACL to trigger some processing and choose the backend to use.
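For example, with invented hostnames and backend names, the Host-based routing in that second frontend could look like:

```
# Hypothetical ACLs matching the Host header, one per website.
acl host_site_a hdr(host) -i www.site-a.example
acl host_site_b hdr(host) -i www.site-b.example

# Route each request to the backend of the matching website.
use_backend be_site_a if host_site_a
use_backend be_site_b if host_site_b
```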
At the TCP level, when HAProxy talks to Varnish, then Varnish to HAProxy, and finally HAProxy to the final web server, they are all seen as regular HTTP clients.
Without any modification, each element reports the IP of the previous element as the client IP, and the final server thinks that HAProxy is the only client accessing it. That's bad for IP-based filtering, logging…
At the HTTP level, we've had the `X-Forwarded-For` header for a long time. If you look for its presence and you know how to parse it, you can use the correct value in your web server or application. It's cumbersome and error-prone, and at the TCP level it is invisible.
The PROXY protocol is a simple extension of the TCP protocol. It adds the same kind of header, but at the TCP level. The downside is that both parties must support the PROXY protocol and be configured to use it. The upside is that it is completely transparent after that: there is nothing to do at the application level.
The PROXY protocol was designed by Willy Tarreau, creator of HAProxy, in 2010. It has since been adopted by many products like Varnish, Apache, Nginx, Postfix, Dovecot…
In our HAProxy setup, we use it in the backend to Varnish, in the frontend used by Varnish to pass the request back to HAProxy, and possibly in the backend to the final web servers.
With HAProxy, on the listening side we use the `accept-proxy` option in the frontend section, and on the emitting side we use the `send-proxy-v2` option on the `server` lines in the backend section.
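Concretely, reusing the placeholder socket paths from the sketches above, that gives something like:

```
# Emitting side: HAProxy announces the real client address to Varnish.
backend be_varnish
    mode http
    server varnish unix@/run/varnish/varnish.sock send-proxy-v2

# Listening side: the re-entry frontend accepts the PROXY protocol header from Varnish.
frontend fe_from_varnish
    bind unix@/run/haproxy/from-varnish.sock accept-proxy
    mode http
```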
Even though it's a valuable optimization, the PROXY protocol is not supported by every tool, especially some very low-level ones, like a hand-forged HTTP request over a telnet connection.
Note: if you use Apache, I encourage you to take a look at the "ForensicLog" directive, provided by the mod_log_forensic module. It adds a special log where you can find a complete trace of each request with all its headers. It is especially useful to see what arrives at the end of the chain. I don't know if something similar exists for Nginx or other web servers.
Let's remember that the HTTP standard has normalized a list of headers. For example: `Host`, `Set-Cookie`, `Cache-Control`…
But it also normalized a way to use custom, non-standard HTTP headers with an `X-` prefix. Some have almost become standard and are used very frequently.
That last header is important when the request goes to an application that enforces HTTPS. If it receives a request from HAProxy on a clear-text HTTP connection, it might trigger a redirect. But many frameworks detect the `X-Forwarded-Proto: https` header and understand that the external part of the request was encrypted.
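In HAProxy, a common way to set these headers (a sketch, not necessarily the exact configuration used here) is:

```
frontend fe_main
    bind :443 ssl crt /etc/haproxy/certs/example.com.pem
    mode http
    # Append the real client IP to X-Forwarded-For.
    option forwardfor
    # Tell the application whether the external leg of the request was encrypted.
    http-request set-header X-Forwarded-Proto https if { ssl_fc }
    http-request set-header X-Forwarded-Proto http unless { ssl_fc }
    default_backend be_varnish
```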
It is very useful to mark an incoming request with a unique identifier that can be transmitted from proxy to proxy, all the way to the final server. It can even be sent back by the application and be traceable all the way back to the original client.
:warning: It's probably not a good idea to do that in production, but it can be very useful to see on the client side exactly what happened inside HAProxy.
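For reference, HAProxy can generate such an identifier by itself; the format string below is a commonly used one, and the header name is just a common choice:

```
frontend fe_main
    # Build a unique ID for each request and add it as a header
    # on the request forwarded upstream.
    unique-id-format %{+X}o\ %ci:%cp_%fi:%fp_%Ts_%rt:%pid
    unique-id-header X-Unique-ID
```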
When a website or web application relies on multiple web servers to handle the requests, we can use HAProxy to do the load-balancing.
When possible, we prefer the `roundrobin` algorithm. It's the simplest.
But when the application has trouble with persisted data like sessions or stored files, we can set up an active-backup configuration: HAProxy passes all the requests to the same server until it fails, then moves to the next one.
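Both behaviours are easy to express; here is a sketch with made-up backend names and addresses:

```
# Round-robin: requests are spread across all healthy servers.
backend be_app_roundrobin
    balance roundrobin
    server app1 192.0.2.21:80 check
    server app2 192.0.2.22:80 check

# Active-backup: app2 only receives traffic when app1 is down.
backend be_app_failover
    server app1 192.0.2.21:80 check
    server app2 192.0.2.22:80 check backup
```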
To prevent our HAProxy+Varnish server from turning into a "Single Point of Failure", we have installed two of them, in two different networks and datacenters, and we use a round-robin DNS.
If a server is not available, we need to change the DNS zone to disable the faulty server. It takes time to detect, change and propagate. But we use virtual servers on redundant hardware. Over the last several years, this has proven to be very reliable. And it is also very easy to change and adapt.
It would definitely be possible to have an "active-standby" setup, with a virtual IP (with keepalived/vrrp), and have an automatic failover if the main server fails.
Or we could go even further with an "active-active" setup: two "layer 4" HAProxy servers in front of two or more "layer 7" HAProxy servers.
Those options (and some more that I didn't cover) allow for a much faster recovery in case of an incident, but they are much more complex.
For example, if a website is in maintenance mode (detailed further down), we can bypass it for a list of IP addresses. Very useful to check on your deployment before opening it to the general public (see the sketch at the end of this section).
For some local monitoring tools (like Munin) or for ACME challenges, it is useful to proxy the requests directly to a local web server (like Apache or Nginx), instead of going through Varnish or other servers.
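A sketch of this bypass, with assumed paths and port, using path-based ACLs in the public frontend:

```
frontend fe_main
    # Hypothetical rules: ACME challenges and Munin go straight to a local
    # web server, bypassing Varnish.
    acl is_acme  path_beg /.well-known/acme-challenge/
    acl is_munin path_beg /munin
    use_backend be_local_www if is_acme || is_munin
    default_backend be_varnish

backend be_local_www
    mode http
    server local 127.0.0.1:8080
```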
For a global "maintenance mode", we use a special backend that doesn't specify any server, which will trigger 503 errors. If we have defined a custom error page for this backend, it will be displayed
For a "maintenance mode" per site, we also need a custom backend with the same principle as before, and we need to use the domain ACL to restict to this site only.