Implement Hot-Warm-Cold Architecture in Elasticsearch (2026)

Elasticsearch’s hot-warm-cold architecture is less about speed and more about cost-effectiveness and efficient data lifecycle management.

Let’s see it in action. Imagine you have logs from your web servers. The most recent logs (hot) are accessed frequently for real-time debugging and analysis. Older logs (warm) are still queried, but less often, perhaps for daily or weekly trend analysis. The oldest logs (cold) are rarely accessed, mostly for compliance or historical review.

Here’s a simplified Elasticsearch cluster setup that embodies this:

// PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.low": "85%",
    "cluster.routing.allocation.disk.watermark.high": "90%",
    "cluster.routing.allocation.disk.watermark.flood_stage": "95%"
  }
}

// Index Lifecycle Management (ILM) policy example (request body for PUT _ilm/policy/<policy_name>)
{
  "policy": {
    "phases": {
      "hot": {
        "min_age": "0ms",
        "actions": {
          "set_priority": {
            "priority": 100
          },
          "rollover": {
            "max_age": "1d",
            "max_docs": 1000000,
            "max_size": "50gb"
          }
        }
      },
      "warm": {
        "min_age": "1d",
        "actions": {
          "set_priority": {
            "priority": 50
          },
          "shrink": {
            "number_of_shards": 1
          },
          "forcemerge": {
            "max_num_segments": 1
          },
          "readonly": {}
        }
      },
      "cold": {
        "min_age": "7d",
        "actions": {
          "set_priority": {
            "priority": 0
          },
          "freeze": {},
          "searchable_snapshot": {
            "snapshot_repository": "my_cold_storage_repo"
          }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}

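# Node roles (one data tier per hardware profile)
# In elasticsearch.yml:
# data_content is included here because system indices and indices outside data streams are allocated to the content tier
node.roles: [ "master", "data_hot", "data_content" ]
# On another node:
node.roles: [ "data_warm" ]
# On yet another node:
node.roles: [ "data_cold" ]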

The core problem this solves is the increasing cost and decreasing performance of storing massive amounts of historical data. As indices grow and data ages, disk space comes at a premium, query performance degrades, and hardware sized for "hot" performance becomes overkill for data that is rarely touched. Hot-warm-cold allows you to use different hardware profiles for different data ages.

The fundamental mechanism is Index Lifecycle Management (ILM). You define a policy that dictates how indices transition through phases: hot, warm, cold, and delete. Each phase has specific actions. In the hot phase, the current write index lives on fast, expensive SSDs. When it meets any one of the rollover criteria, such as age (e.g., max_age: "1d"), document count (max_docs), or size (e.g., max_size: "50gb"), ILM rolls over to a new write index. The old index then waits until the next phase’s min_age, which is measured from the rollover time, before ILM applies that phase’s actions to it.
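The rollover and phase rules above can be modeled in a few lines. This is a minimal sketch of the decision logic only, not how Elasticsearch implements it: the thresholds are copied from the example policy, and the names IndexStats, should_roll_over, and phase_for are hypothetical helpers introduced for illustration.

```python
from dataclasses import dataclass

GB = 1024 ** 3


@dataclass
class IndexStats:
    age_days: float   # time since the index was created
    docs: int         # documents in the primary shards
    size_bytes: int   # total size of the primary shards


def should_roll_over(stats, max_age_days=1, max_docs=1_000_000,
                     max_size_bytes=50 * GB):
    """Rollover triggers as soon as ANY one condition is met."""
    return (
        stats.age_days >= max_age_days
        or stats.docs >= max_docs
        or stats.size_bytes >= max_size_bytes
    )


# (phase, min_age in days), taken from the example policy.
PHASES = [("hot", 0), ("warm", 1), ("cold", 7), ("delete", 30)]


def phase_for(days_since_rollover):
    """Return the latest phase whose min_age has already elapsed."""
    current = PHASES[0][0]
    for name, min_age_days in PHASES:
        if days_since_rollover >= min_age_days:
            current = name
    return current
```

For example, an index that is only a few hours old but already holds 1,000,000 documents rolls over on the max_docs condition alone, and phase_for(3) returns "warm" because three days have passed since rollover but the cold phase’s seven have not. A real cluster evaluates these conditions periodically rather than continuously, so transitions lag the thresholds slightly.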

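In the warm phase, indices are moved to less expensive hardware. Actions like shrink reduce the number of primary shards (e.g., number_of_shards: 1), which cuts per-shard overhead; the target count must be a factor of the original shard count. forcemerge consolidates each shard’s segments (here down to a single segment), which speeds up queries and is safe because the index no longer receives writes. readonly, the ILM action that makes an index read-only, blocks further writes. ILM runs these actions in a fixed order (readonly, then shrink, then forcemerge) regardless of the order in which they appear in the policy JSON.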
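Shrinking only works when the target shard count is a factor of the source count, which is why a target of 1 is always safe. A small sketch of that rule, where valid_shrink_targets is a hypothetical helper and not part of any Elasticsearch client:

```python
def valid_shrink_targets(source_shards):
    """List the smaller primary shard counts an index can be shrunk to.

    The shrink operation requires the target count to be a factor of
    the source count, so each target shard absorbs a whole number of
    source shards.
    """
    if source_shards < 1:
        raise ValueError("an index has at least one primary shard")
    return [n for n in range(1, source_shards) if source_shards % n == 0]
```

An index with 8 primaries can shrink to 1, 2, or 4 shards, while one with 5 primaries can only shrink to 1.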

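The cold phase is where storage costs plummet. On 7.x clusters, indices could be frozen, which unloads them from RAM but keeps them searchable; frozen indices were deprecated in 7.14, and the ILM freeze action does nothing from 8.0 onward. Even more powerfully, searchable_snapshot allows you to mount indices from a snapshot stored in cheap object storage (like S3 or GCS) without needing to restore them as regular indices. In the cold phase the mounted index is still cached in full on the cold nodes’ local disks, but it needs no replicas because the snapshot serves as the backup copy, which roughly halves the disk those indices consume compared with keeping one replica.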

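Finally, the delete phase automatically purges data that’s no longer needed, preventing indefinite storage costs. Node roles (data_hot, data_warm, data_cold) are crucial for directing indices to the appropriate hardware profiles. As an index moves between phases, ILM updates its index.routing.allocation.include._tier_preference setting (through the implicit migrate action, or when it mounts a searchable snapshot), and shard allocation then moves the shards to nodes holding the matching role. The disk watermarks are system-wide safety nets rather than part of tiering: above low, a node receives no new shards; above high, Elasticsearch relocates shards away from it; above flood_stage, every index with a shard on that node gets a read-only block that rejects writes but still allows deletes.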
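The three watermark levels can be read as a simple ladder. The sketch below only illustrates that ladder using the percentages from the cluster settings example; disk_state and WATERMARK_EFFECTS are hypothetical names, and the real allocator also considers factors such as shard sizes and in-flight relocations.

```python
def disk_state(used_pct, low=85.0, high=90.0, flood_stage=95.0):
    """Classify a node's disk usage against the three watermarks."""
    if used_pct > flood_stage:
        return "flood_stage"
    if used_pct > high:
        return "high"
    if used_pct > low:
        return "low"
    return "ok"


# What each state means for the node, summarized from the text above.
WATERMARK_EFFECTS = {
    "ok": "node accepts new shards",
    "low": "no new shards are allocated to the node",
    "high": "shards are relocated away from the node",
    "flood_stage": "indices with a shard on the node get a read-only block",
}
```

A node at 87% disk usage is in the "low" state and stops receiving new shards, while one at 92% has crossed the high watermark and starts shedding shards.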

What most people don’t realize is that freezing an index never saved disk space; what it reduced was the memory footprint. When an index was frozen, the data structures it would normally keep in the JVM heap (such as per-segment metadata) were unloaded and rebuilt on demand for each search. A 7.x cluster with many frozen indices could therefore hold a much larger total data volume while still having ample heap available for active, hot indices and search operations. It was a direct trade-off: higher search latency on frozen indices in exchange for large memory savings. Later improvements in heap usage removed most of that benefit, which is why frozen indices were deprecated in 7.14 and why current clusters rely on searchable snapshots for the cold phase instead.

The next logical step after mastering ILM is understanding how to integrate external object storage for your cold phase snapshots.