If you’re used to configuring f5 LTM load balancers, you’re probably used to the idea that you normally set two health checks for each VIP you have. The first is at the node level, often just an ICMP ping, which checks that the host itself is up. If that health check fails and the node is deemed to be down, it is implied that any ports currently referenced on that same host should be marked as down immediately. Then there’s a service-level health check, where you check the actual service you’re load balancing to (e.g. an HTTP GET, a TCP check, a DNS test, or some other custom check) to make sure that the server is responding properly.
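To make the relationship between the two checks concrete, here’s a minimal Python sketch. It is not f5 or A10 configuration; the Node class, its field names and the addresses are all made up for illustration. It only models the rule described above: a port is usable when both the node-level check and that port’s own service-level check are passing.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Node:
    """Hypothetical model of one host behind a VIP."""
    address: str
    # Result of the node-level check (for example an ICMP ping).
    node_up: bool = True
    # Result of the service-level check for each port (HTTP GET, TCP, ...).
    service_up: Dict[int, bool] = field(default_factory=dict)

    def port_available(self, port: int) -> bool:
        # A port is only usable if the host is up AND its own check passes.
        # Ports with no recorded service check are treated as unavailable.
        return self.node_up and self.service_up.get(port, False)


web1 = Node("192.0.2.10", service_up={80: True, 443: True})
print([web1.port_available(p) for p in (80, 443)])  # [True, True]

web1.node_up = False  # the node-level check fails
print([web1.port_available(p) for p in (80, 443)])  # [False, False]
```

Note that the per-port results are left untouched when the node goes down; the node state simply overrides them, which is why every port on the host drops out at once.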
When you configure an A10 it feels like you have the same two checks, but there’s actually one that many people don’t even notice.
In a previous post I whined that the A10 load balancers by default only need to see a single successful health check from a service in order to mark it as up. I argued that that’s not a good idea, especially if a service is bouncing up and down. The good news was that you could set a global default value for this so that all your new health checks will be created with a higher value (I suggest 5–10).
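Here’s a rough Python sketch of why that threshold matters. This is not A10 code or syntax; the ServiceState class and the up_threshold parameter are names I’ve invented for the example, and I’ve assumed for simplicity that a single failed check marks the service down again.

```python
from typing import List


class ServiceState:
    """Hypothetical tracker for one service's health state."""

    def __init__(self, up_threshold: int = 1):
        # Consecutive successful checks needed before marking the service up.
        self.up_threshold = up_threshold
        self.successes = 0
        self.is_up = False

    def record(self, check_passed: bool) -> bool:
        if check_passed:
            self.successes += 1
            if self.successes >= self.up_threshold:
                self.is_up = True
        else:
            # Simplifying assumption: one failure marks the service down
            # and resets the success counter.
            self.successes = 0
            self.is_up = False
        return self.is_up


def history(results: List[bool], up_threshold: int) -> List[bool]:
    """Return the up/down state after each check result in turn."""
    state = ServiceState(up_threshold)
    return [state.record(r) for r in results]


# A service bouncing up and down before it finally settles.
flapping = [True, False, True, True, False, True, True, True, True, True]

print(sum(history(flapping, 1)))  # 8 check intervals spent marked up
print(sum(history(flapping, 5)))  # 1 check interval spent marked up
```

With a threshold of 1 the bouncing service is put back into rotation after every lucky check; with a threshold of 5 it only comes back once it has genuinely settled.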
The bad news is that this might have some side effects, which is what I’m sharing today.
A quick post today, about how the A10 load balancer handles the recovery of a server based on health checks. Kind of a warning actually.
Whether you use A10, f5 or some other load balancer, you’re probably used to the idea of health monitors, or “health checks”. The load balancer periodically performs some kind of connectivity test to the servers behind a given VIP (virtual IP). If a certain number of health checks are unsuccessful, the service on that particular server is marked unavailable and is no longer used to handle requests to that VIP.
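As a rough illustration of that loop, here’s a Python sketch of a pool that drops a member after a set number of failed checks in a row. The Pool class, the max_failures name and the addresses are hypothetical and not taken from any vendor; tcp_check uses the standard library’s socket.create_connection as one possible connectivity test. Counting consecutive failures (rather than failures in total) and restoring a member after a single success are my assumptions, and a real load balancer would run the poll on a timer.

```python
import socket
from typing import Callable, Dict, List, Tuple

Member = Tuple[str, int]  # (host, port)


def tcp_check(host: str, port: int, timeout: float = 1.0) -> bool:
    """Service-level check: can we open a TCP connection to the port?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class Pool:
    """Hypothetical set of servers behind one VIP."""

    def __init__(self, members: List[Member],
                 check: Callable[[str, int], bool] = tcp_check,
                 max_failures: int = 3):
        self.check = check
        self.max_failures = max_failures
        # Consecutive failed checks seen for each member.
        self.failures: Dict[Member, int] = {m: 0 for m in members}

    def poll(self) -> List[Member]:
        """Run one round of checks; return the members still in rotation."""
        for host, port in self.failures:
            if self.check(host, port):
                self.failures[(host, port)] = 0  # assumption: instant recovery
            else:
                self.failures[(host, port)] += 1
        return [m for m, n in self.failures.items() if n < self.max_failures]


# Demonstration with a fake check so that no network access is needed.
broken = {("192.0.2.11", 80)}
pool = Pool([("192.0.2.10", 80), ("192.0.2.11", 80)],
            check=lambda host, port: (host, port) not in broken)

for _ in range(3):
    in_rotation = pool.poll()
print(in_rotation)  # [('192.0.2.10', 80)]
```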
Sounds simple enough, but things are rarely straightforward – and this is no exception.
Finally this week I got myself in gear and decided it was time to have a play with Ansible. I found an installation guide on the Ansible documentation site and tried to follow it.
Here follows the fun I had; hopefully I can fill in a few gaps for anybody else going down this path.