Cloudflare health checks aren’t just about knowing if your origin is up or down; they’re a proactive way to understand its actual performance and availability from Cloudflare’s perspective, letting you react before your users do.
Let’s watch this in action. Imagine a simple web server running at 192.0.2.10 on port 80, and we want to check whether it’s serving a basic HTML page.
```json
{
  "name": "My Origin Health Check",
  "protocol": "HTTP",
  "port": 80,
  "request_path": "/",
  "interval": 60,
  "timeout": 5,
  "unhealthy_threshold": 3,
  "healthy_threshold": 2,
  "method": "GET",
  "headers": {
    "Host": "example.com"
  },
  "check_zones": [
    "example.com"
  ],
  "origins": [
    "192.0.2.10"
  ]
}
```
This configuration tells Cloudflare to:
- Send a GET request to / on port 80.
- Set the Host header to example.com (crucial for virtual hosting).
- Check every 60 seconds (interval).
- Treat a single check as failed if the origin does not respond within 5 seconds (timeout).
- Declare an origin unhealthy after 3 consecutive failures (unhealthy_threshold).
- Declare an origin healthy after 2 consecutive successes (healthy_threshold).
- Apply these checks to the example.com zone (check_zones).
- Target 192.0.2.10 as the origin IP.
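To make the mapping from configuration to request concrete, here is a minimal Python sketch that turns the example config into one probe description per origin. The `Probe` class and `build_probes` function are hypothetical names introduced for illustration, and the field names simply follow the example above; this is not Cloudflare's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Probe:
    """Everything needed to send one health check request (hypothetical type)."""
    method: str
    url: str
    headers: dict
    timeout: float


def build_probes(config: dict) -> list:
    """Build one Probe per origin from a config shaped like the example above."""
    scheme = config["protocol"].lower()  # "HTTP" -> "http"
    return [
        Probe(
            method=config.get("method", "GET"),
            # The request goes to the origin IP; the Host header carries the site name.
            url=f"{scheme}://{origin}:{config['port']}{config['request_path']}",
            headers=dict(config.get("headers", {})),
            timeout=float(config["timeout"]),
        )
        for origin in config["origins"]
    ]


example_config = {
    "protocol": "HTTP",
    "port": 80,
    "request_path": "/",
    "timeout": 5,
    "method": "GET",
    "headers": {"Host": "example.com"},
    "origins": ["192.0.2.10"],
}

for probe in build_probes(example_config):
    print(probe.method, probe.url, probe.headers, probe.timeout)
```

Note that the URL targets the IP address while the Host header names the site, which is why the header matters for virtual hosting.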
The core problem health checks solve is the disconnect between your origin server and the global edge. Your server might be running fine, but network issues between Cloudflare’s nearest data center and your origin can make it appear unavailable to your users. Health checks bridge this gap by simulating user traffic from the edge, giving you a real-time, geographically diverse view of your origin’s health.
Internally, Cloudflare data centers globally execute these checks. When a health check fails, Cloudflare marks that origin as unhealthy for that specific data center. If an origin is marked unhealthy across a sufficient number of data centers, Cloudflare will stop sending traffic to it and instead route requests to a healthy origin (if you have a load balancing pool configured) or show a Cloudflare error page.
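The idea of combining per-data-center results can be sketched as a simple quorum rule. The text above does not say what "a sufficient number" is, so the 50% default below is purely an assumption for illustration, and `origin_is_healthy` is a hypothetical helper rather than Cloudflare's actual logic.

```python
def origin_is_healthy(results_by_dc: dict, min_healthy_fraction: float = 0.5) -> bool:
    """Return True if enough data centers currently see the origin as healthy.

    results_by_dc maps a data center name to its latest verdict (True = healthy).
    min_healthy_fraction is an assumed quorum, not a documented Cloudflare value.
    """
    if not results_by_dc:
        raise ValueError("no data center results to aggregate")
    healthy = sum(1 for ok in results_by_dc.values() if ok)
    return healthy / len(results_by_dc) >= min_healthy_fraction


# One data center failing does not take the origin out of rotation...
print(origin_is_healthy({"ams": True, "sjc": False, "nrt": True}))
# ...but a majority failing does, under the assumed 50% quorum.
print(origin_is_healthy({"ams": False, "sjc": False, "nrt": True}))
```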
The request_path is more than just a URL; it’s a specific endpoint Cloudflare will hit. For example, checking / might be fine, but if your application’s health is actually determined by a dedicated /healthz endpoint that performs deeper checks, you should use that. The method is also key – GET is common for simple checks, but POST might be necessary if your health endpoint requires it.
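As a rough illustration of what a deeper /healthz endpoint might compute, the sketch below runs a set of named dependency checks and reports 200 only if all of them pass. The `healthz` function, the check names, and the JSON response shape are all assumptions made for this example; wiring it into a real web framework is left out.

```python
import json
from typing import Callable, Dict, Tuple


def healthz(checks: Dict[str, Callable[[], bool]]) -> Tuple[int, str]:
    """Run each dependency check and return (status_code, json_body)."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            # A check that raises is treated as a failed dependency.
            results[name] = False
    status = 200 if all(results.values()) else 503
    body = json.dumps({"status": "ok" if status == 200 else "degraded", "checks": results})
    return status, body


# Hypothetical checks; real ones would ping a database, a cache, and so on.
status, body = healthz({"database": lambda: True, "cache": lambda: False})
print(status, body)
```

Returning a non-2xx status when a dependency is down lets a status-code-based health check detect the problem without parsing the body.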
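The headers are critical for origins relying on Host headers for routing. Without the correct Host header, even if the IP is correct, your origin might return an error, which fails the check although the real site is fine, or a default page, which can pass the check without ever touching the real site. Either way, the result no longer reflects what your users see.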
You can also use more sophisticated checks. For instance, if your application returns specific content on a healthy response, you can configure an expected_body in the health check. Cloudflare will then not only check for a 2xx status code but also verify that the response body contains the specified string. This adds another layer of validation, ensuring not just that the server is responding, but that it’s responding with the correct content.
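The pass/fail rule described here is small enough to write down. The sketch below assumes a plain substring match on the body, as the text describes; `check_passes` is a hypothetical helper name.

```python
from typing import Optional


def check_passes(status_code: int, body: str, expected_body: Optional[str] = None) -> bool:
    """A check passes on a 2xx status and, if configured, a body containing expected_body."""
    if not 200 <= status_code < 300:
        return False
    if expected_body is None:
        return True  # No body validation configured; the status code alone decides.
    return expected_body in body


print(check_passes(200, "<h1>Welcome</h1>", expected_body="Welcome"))
print(check_passes(200, "<h1>Maintenance</h1>", expected_body="Welcome"))
print(check_passes(503, "Welcome", expected_body="Welcome"))
```

The second call shows the case this option exists for: the server answers with a 2xx status, but the content is wrong.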
When an origin is marked unhealthy, Cloudflare doesn’t just abandon it. It continues to probe the origin at the configured interval. Once the healthy_threshold is met again (e.g., 2 consecutive successful checks), the origin is automatically re-added to the pool of healthy origins. This dynamic failover and failback is what makes health checks so powerful for maintaining high availability.
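The failover and failback behavior amounts to a small state machine driven by consecutive results. The following sketch uses the threshold values from the example config; the `OriginHealth` class is hypothetical and assumes that any result agreeing with the current state resets the streak.

```python
class OriginHealth:
    """Track one origin's state from a stream of check results."""

    def __init__(self, unhealthy_threshold: int = 3, healthy_threshold: int = 2):
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy_threshold = healthy_threshold
        self.healthy = True
        self._streak = 0  # consecutive results that contradict the current state

    def record(self, success: bool) -> bool:
        """Record one check result and return the origin's state afterwards."""
        if success == self.healthy:
            self._streak = 0  # result agrees with current state, so the streak is broken
            return self.healthy
        self._streak += 1
        needed = self.healthy_threshold if success else self.unhealthy_threshold
        if self._streak >= needed:
            self.healthy = success
            self._streak = 0
        return self.healthy


tracker = OriginHealth()
for result in [False, False, False, True, True]:
    print(result, "->", "healthy" if tracker.record(result) else "unhealthy")
```

With these thresholds, the origin flips to unhealthy on the third consecutive failure and back to healthy on the second consecutive success.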
Many users overlook the timeout value. If your origin is slow to respond, even if it’s eventually successful, a low timeout can cause frequent false failures. Conversely, a very high timeout might mask underlying performance issues, making your origin appear healthy when it’s actually sluggish. Tuning this value to match your origin’s expected response time is crucial.
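One way to reason about a timeout value is to replay latencies you have already measured against candidate timeouts and see how many checks would have failed. The sketch below assumes you have such samples in seconds; the numbers and the `false_failure_rate` helper are made up for illustration.

```python
def false_failure_rate(latencies_s: list, timeout_s: float) -> float:
    """Fraction of otherwise successful responses that would exceed the timeout."""
    if not latencies_s:
        raise ValueError("need at least one latency sample")
    too_slow = sum(1 for latency in latencies_s if latency > timeout_s)
    return too_slow / len(latencies_s)


# Hypothetical response times, in seconds, for successful requests.
samples = [0.2, 0.4, 0.9, 1.5, 6.0]

for timeout in (1, 5, 10):
    print(f"timeout={timeout}s -> {false_failure_rate(samples, timeout):.0%} of checks would fail")
```

A timeout that fails a large share of successful responses is too tight, while one that never fails anything, even for multi-second responses, may be hiding a slow origin.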
The next step is often configuring origin load balancing to leverage these health checks, allowing Cloudflare to automatically distribute traffic across multiple healthy origins.