The most surprising thing about managing Consul resources with Terraform is how much of its power comes from treating Consul itself as a data source for Terraform, rather than just a target for configuration.
Let’s see this in action. Imagine you have a service registered in Consul, maybe my-app-v1, and you want to deploy a new version, my-app-v2, but only after my-app-v1 is no longer healthy. Terraform can’t directly tell you my-app-v1’s health state before it tries to deploy my-app-v2. But Consul can.
Here’s a Terraform snippet that leverages Consul’s health checks:
```hcl
# main.tf

provider "consul" {
  address = "127.0.0.1:8500"
}

# Health-filtered lookup of the old service. With passing = true, 'results'
# contains only instances whose checks are all passing. (The plain
# consul_service data source returns catalog entries and has no health count.)
data "consul_service_health" "old_app" {
  name    = "my-app-v1"
  passing = true
}

resource "null_resource" "wait_for_old_app_to_die" {
  # This null_resource doesn't do anything itself, but a change in its
  # triggers makes Terraform replace it on the next plan/apply.
  triggers = {
    # Number of passing instances of the old service, as read from Consul
    # during this run. Terraform does not poll Consul; it compares this value
    # with the one stored in state each time you run plan or apply.
    service_health = tostring(length(data.consul_service_health.old_app.results))
  }

  # This provisioner runs when the resource is created, which includes every
  # replacement caused by a change in 'triggers'.
  provisioner "local-exec" {
    when        = create
    interpreter = ["bash", "-c"]
    command     = "echo 'Old app health changed. Current passing: ${length(data.consul_service_health.old_app.results)}'"
    # Note: this only reports the count. It does not block until the count
    # reaches 0; that condition has to be enforced separately.
  }
}

# This is a placeholder for your new application deployment.
# In a real scenario, this would be a Kubernetes deployment, EC2 instances, etc.
resource "null_resource" "new_app_deployment" {
  # depends_on orders this resource after the null_resource above.
  # It controls sequencing only; it does not by itself verify that the
  # old app has stopped passing.
  depends_on = [null_resource.wait_for_old_app_to_die]

  provisioner "local-exec" {
    when        = create
    interpreter = ["bash", "-c"]
    command     = "echo 'Deploying my-app-v2 now that my-app-v1 is gone.'"
  }
}

# Example of registering a service in Consul using Terraform
resource "consul_service" "my_new_app" {
  name       = "my-app-v2"
  node       = "app-node-1" # hypothetical name; must match a node already in the Consul catalog
  port       = 8080
  service_id = "my-app-v2-instance-1"

  check {
    check_id = "my-app-v2-health"
    name     = "HTTP Health Check"
    http     = "http://localhost:8080/health"
    interval = "10s"
    timeout  = "1s"
  }
}
```
In this example, the old_app data source queries Consul for the my-app-v1 service. Terraform doesn’t manage this service’s lifecycle; it only reads its current state from Consul. The null_resource then records that state in its triggers block as the number of passing instances. Terraform does not watch Consul continuously. It reads the data source once per plan or apply and compares the result with the value stored in state. When the count has changed since the last run, Terraform replaces the null_resource and its provisioner runs again, reporting the new count. The depends_on on the deployment resource ensures it is ordered after that step. To proceed only when the count has dropped to zero (or below some threshold you define), that condition has to be checked explicitly, either in the configuration or by the pipeline that invokes Terraform.
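One way to make the "only when zero instances are passing" rule explicit is a precondition on the deployment resource. The following is a minimal sketch, not the article's original configuration. It assumes Terraform 1.2 or later (for precondition) and a Consul provider release that includes the consul_service_health data source; the resource names old_app_passing and gated_new_app_deployment are hypothetical.

```hcl
# Assumption: consul_service_health is available in your provider version.
# With passing = true, 'results' lists only instances whose checks all pass.
data "consul_service_health" "old_app_passing" {
  name    = "my-app-v1"
  passing = true
}

# Hypothetical stand-in for the real deployment of my-app-v2.
resource "null_resource" "gated_new_app_deployment" {
  lifecycle {
    # The run stops with this error while my-app-v1 still has passing instances.
    precondition {
      condition     = length(data.consul_service_health.old_app_passing.results) == 0
      error_message = "my-app-v1 still has passing instances; not deploying my-app-v2 yet."
    }
  }

  provisioner "local-exec" {
    interpreter = ["bash", "-c"]
    command     = "echo 'Deploying my-app-v2 now that my-app-v1 is gone.'"
  }
}
```

With this in place, a run made while the old service is still healthy fails instead of deploying, and a later run succeeds once Consul reports no passing instances. Terraform still does not wait or poll; something (a person or a pipeline) has to run it again.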
This pattern of using Consul as a dynamic data source for Terraform is incredibly powerful for blue-green deployments, canary releases, or even just ensuring a service is fully registered and healthy before its dependencies are updated.
The core problem this solves is bridging the gap between declarative infrastructure (Terraform) and dynamic service state (Consul). Terraform excels at defining what should exist and how it should be configured. Consul excels at tracking the real-time health and availability of services. By treating Consul services as data sources, Terraform can react to the actual state of your services, not just their desired configuration.
Let’s break down the mental model:
- Consul as the Source of Truth for Service State: Consul holds the definitive, real-time status of your services: which are running, where they are, and crucially, if they are healthy.
- Terraform as the Orchestrator: Terraform uses this real-time data to make decisions about deploying new infrastructure or updating existing resources. It reads Consul’s state via data sources.
- Data Sources as Connectors: The consul_service and consul_key_prefix data sources are your primary tools. They allow Terraform to query Consul for specific information without trying to manage that information’s lifecycle.
- Resource Dependencies and Triggers: You use depends_on and triggers within Terraform resources to enforce sequencing based on Consul’s state. For example, a new deployment resource might depend on a null_resource that waits for a specific health check from a Consul data source to pass or fail.
- Managing Consul Resources Directly: You can also use Terraform to manage Consul itself: creating services, health checks, KV pairs, ACLs, and more. This is where you define the desired state of Consul’s configuration.
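The split between reading from Consul and managing Consul can be sketched with the KV store. This is an illustrative fragment only: the paths myapp/config/ and myapp/release/active and the resource names are hypothetical, and it assumes the provider is already configured.

```hcl
# Read side: a data source. Terraform only queries these keys and never
# creates, changes, or deletes them.
data "consul_key_prefix" "app_config" {
  path_prefix = "myapp/config/" # hypothetical prefix
}

# With no 'subkey' blocks declared, the 'subkeys' attribute is a map of
# every key found under the prefix.
output "app_config_keys" {
  value = keys(data.consul_key_prefix.app_config.subkeys)
}

# Write side: a managed resource. Terraform owns this key's lifecycle.
resource "consul_keys" "release_marker" {
  key {
    path  = "myapp/release/active" # hypothetical key
    value = "my-app-v2"
  }
}
```

The data block and the resource block look similar, but only the resource appears in plans as something to create, update, or destroy.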
When Terraform reads a Consul data source, the provider makes an API call to your Consul agent (or server), receives JSON describing the service, and parses it into attributes that the rest of the configuration can reference. The two service lookups differ in what they return. The consul_service data source reads the catalog and exposes a service list with fields such as node_address and port. Health information comes from the consul_service_health data source: with passing = true, its results list holds only instances whose checks are all passing, so the length of that list is the count of healthy instances as reported by Consul.
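To see what the parsed attributes look like in practice, the sketch below reads the catalog entry for my-app-v1 and turns it into a list of host:port strings. The names old_app_catalog and old_app_endpoints are hypothetical, and the provider is assumed to be configured elsewhere.

```hcl
# Catalog lookup: one entry per registered instance of the service.
data "consul_service" "old_app_catalog" {
  name = "my-app-v1"
}

# Each element of 'service' carries fields parsed from Consul's response,
# including the node's address and the service port.
output "old_app_endpoints" {
  value = [
    for s in data.consul_service.old_app_catalog.service :
    "${s.node_address}:${s.port}"
  ]
}
```

Running terraform plan with this fragment shows the resolved endpoints without Terraform taking ownership of the service registration.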
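The real magic is how Terraform’s plan and apply cycles interact with this. If the number of passing instances of my-app-v1 changes between Terraform runs, the value in the triggers block of null_resource.wait_for_old_app_to_die no longer matches the one stored in state, so Terraform plans to replace that resource and its provisioner runs again on apply. This comparison happens only when you run plan or apply; nothing watches Consul in between. Note that depends_on only guarantees ordering: new_app_deployment is created after wait_for_old_app_to_die on the first apply, and it is not re-created on later runs just because the resource it depends on was replaced. To make the deployment contingent on a condition such as zero passing instances, or to re-run it when the gate changes, that behavior has to be declared explicitly.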
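If you do want the deployment step to run again whenever the gate resource is replaced, one option is the replace_triggered_by lifecycle argument. This is a sketch under assumptions, not part of the original example: it requires Terraform 1.2 or later, and the variable passing_count is a hypothetical stand-in for a count derived from a Consul data source.

```hcl
# Hypothetical stand-in for a value read from Consul during the run.
variable "passing_count" {
  type    = number
  default = 0
}

resource "null_resource" "health_gate" {
  triggers = {
    passing_count = tostring(var.passing_count)
  }
}

resource "null_resource" "redeploy_on_gate_change" {
  lifecycle {
    # Replace this resource whenever health_gate is updated or replaced.
    # The reference also orders this resource after health_gate.
    replace_triggered_by = [null_resource.health_gate]
  }

  provisioner "local-exec" {
    interpreter = ["bash", "-c"]
    command     = "echo 'Gate changed; re-running deployment step.'"
  }
}
```

Whether re-running a deployment on every change in the health count is desirable depends on the workflow; for many setups a one-time gate is enough.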
A subtle but critical point is when this data is read. When you run terraform plan or terraform apply, Terraform first refreshes the state of your managed resources by querying the actual infrastructure. Data sources like consul_service are read during the same run, so the plan is built from what Consul reported at that moment, before Terraform decides what to create, update, or destroy. The value is a snapshot: if you apply a saved plan file later, it reflects Consul’s state at plan time, not apply time. This per-run read is what lets the triggers mechanism pick up changes between runs.
The next logical step after managing service deployments based on health is to automate the registration and deregistration of those services in Consul using Terraform itself, ensuring consistency between your infrastructure code and your service discovery.