Elasticsearch can lock itself into read-only mode when disk space gets too low, and that’s usually because the default disk watermarks are too conservative for your cluster’s growth.
The system has three thresholds: the low watermark, the high watermark, and the flood stage. When a node’s disk usage crosses the low watermark (85% by default), Elasticsearch stops allocating new shards to that node. If usage keeps climbing and crosses the high watermark (90% by default), Elasticsearch starts relocating shards away from the node to keep it from filling up completely. If usage still doesn’t decrease and reaches the flood-stage watermark (95% by default), Elasticsearch puts a read-only block (`index.blocks.read_only_allow_delete`) on every index that has a shard on that node. This stops writes to those indices and can be a real pain to recover from.
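As a rough illustration of how the thresholds stack, the sketch below classifies a disk-usage percentage against the documented defaults of 85% (low), 90% (high), and 95% (flood stage). The function name `watermark_state` is made up for this example, and it only handles whole-number percentages.

```shell
# Hypothetical helper: classify a disk-usage percentage against the watermarks.
# Arguments: used% [low%] [high%] [flood%]; all whole numbers without the % sign.
# Defaults mirror the Elasticsearch defaults (85 / 90 / 95).
watermark_state() {
  used="$1"; low="${2:-85}"; high="${3:-90}"; flood="${4:-95}"
  if [ "$used" -gt "$flood" ]; then
    echo "flood_stage"   # affected indices get a read-only block
  elif [ "$used" -gt "$high" ]; then
    echo "high"          # shards are relocated away from the node
  elif [ "$used" -gt "$low" ]; then
    echo "low"           # no new shards are allocated to the node
  else
    echo "ok"
  fi
}

# Usage: watermark_state 92
```

Passing the thresholds as optional arguments lets the same check follow a cluster whose watermarks have been tuned.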
Here’s how to tune these settings.
Common Causes for Hitting Watermarks
- Actual Disk Full: The most straightforward reason is that the disk is genuinely running out of space.
  - Diagnosis: SSH into the affected node and run `df -h`. Check the output for the filesystem where Elasticsearch data is stored.
  - Fix: Free up space by deleting old indices, snapshots, or other unnecessary files. If this is a recurring issue, you’ll need to increase the disk size or add more nodes.
  - Why it works: This directly addresses the root cause of the disk being full.
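The `df` check can be scripted so that only filesystems near a watermark are printed. This is a minimal sketch, assuming POSIX-format output (`df -P`) and mount points without spaces. The function name `fs_over_threshold` is hypothetical.

```shell
# Hypothetical helper: print "mountpoint usage" for filesystems at or above
# a threshold. Reads `df -P` output on stdin; $1 is a whole-number percentage.
fs_over_threshold() {
  awk -v limit="$1" '
    NR > 1 {                      # skip the header line
      pct = $5                    # Capacity column, e.g. "91%"
      sub(/%/, "", pct)
      if (pct + 0 >= limit + 0) print $6, $5
    }'
}

# Usage: df -P | fs_over_threshold 85
```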
- Conservative Default Watermarks: Elasticsearch’s default watermarks are 85% for `low` and 90% for `high`. For systems with rapid data ingestion or large shard sizes, these can be hit very quickly.
  - Diagnosis: Check your current cluster settings using `curl -X GET "localhost:9200/_cluster/settings?pretty"`. Look for `cluster.routing.allocation.disk.watermark.low` and `cluster.routing.allocation.disk.watermark.high`. If neither appears, the defaults are in effect; add `include_defaults=true` to the query string to see them.
  - Fix: Increase the watermarks. For example, to set the low watermark to 90% and the high watermark to 95%:

        curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
        {
          "persistent": {
            "cluster.routing.allocation.disk.watermark.low": "90%",
            "cluster.routing.allocation.disk.watermark.high": "95%"
          }
        }
        '

    The high watermark cannot exceed the flood stage (`cluster.routing.allocation.disk.watermark.flood_stage`, 95% by default), so 95% is the ceiling unless you raise the flood stage as well.
  - Why it works: By raising the thresholds, you give your cluster more headroom before it starts aggressively relocating shards or preventing new allocations, allowing it to manage disk space more gracefully with your current data growth rate.
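Elasticsearch rejects watermark updates that break the ordering low ≤ high ≤ flood stage, so it can help to check the values before sending them. The sketch below builds the request body for the cluster settings API and refuses out-of-order values. The function name `watermark_body` is an assumption made for this example, and it accepts whole-number percentages only.

```shell
# Hypothetical helper: print a _cluster/settings body for the three watermarks.
# Arguments: low% high% [flood%], whole numbers without the % sign.
# Returns non-zero if the ordering low <= high <= flood_stage is violated.
watermark_body() {
  low="$1"; high="$2"; flood="${3:-95}"
  if [ "$low" -gt "$high" ] || [ "$high" -gt "$flood" ]; then
    echo "error: need low <= high <= flood_stage" >&2
    return 1
  fi
  printf '{"persistent":{"cluster.routing.allocation.disk.watermark.low":"%s%%","cluster.routing.allocation.disk.watermark.high":"%s%%","cluster.routing.allocation.disk.watermark.flood_stage":"%s%%"}}\n' \
    "$low" "$high" "$flood"
}

# Usage:
#   curl -X PUT "localhost:9200/_cluster/settings" \
#     -H 'Content-Type: application/json' -d "$(watermark_body 90 95 97)"
```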
- Uneven Shard Distribution: A single node might accumulate too many shards, causing its disk to fill up faster than others, even if the overall cluster disk usage is low.
  - Diagnosis: Use the Cat Shards API: `curl -X GET "localhost:9200/_cat/shards?v&h=index,shard,prirep,state,node,store"`. Sort or group by `node` to see which nodes have the most data.
  - Fix: Ensure your indexing strategy distributes shards evenly. This might involve adjusting shard counts per index or enabling shard balancing features if available in your version. If a node is consistently overloaded, consider moving it to a larger disk or rebalancing shards manually with the Cluster Reroute API (`POST _cluster/reroute`).
  - Why it works: Distributing shards more evenly prevents any single node from becoming a bottleneck and hitting its disk limits prematurely.
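Grouping the Cat Shards output by node is easy to do with `awk`. The sketch below sums shard store sizes per node, assuming the API is called with `h=node,store&bytes=b` so that sizes arrive as plain byte counts. The function name `bytes_per_node` is hypothetical. Lines for unassigned or relocating shards do not have exactly two fields and are skipped.

```shell
# Hypothetical helper: total shard bytes per node, largest first.
# Reads "node store_bytes" lines on stdin.
bytes_per_node() {
  awk 'NF == 2 { total[$1] += $2 }
       END { for (n in total) printf "%s %.0f\n", n, total[n] }' |
    sort -k2,2nr
}

# Usage:
#   curl -s "localhost:9200/_cat/shards?h=node,store&bytes=b" | bytes_per_node
```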
- Large Shard Sizes: If your individual shards are very large (e.g., tens or hundreds of GB), even a moderate number of shards can consume significant disk space on a node.
  - Diagnosis: Again, use `curl -X GET "localhost:9200/_cat/shards?v&h=index,shard,prirep,state,node,store"`. Examine the `store` column for individual shards.
  - Fix: Split the data across more, smaller shards. You can increase the number of primary shards per index, or, for large time-series data, use finer-grained time-based indices (e.g., daily or hourly indices instead of monthly). Either way you end up with more, smaller shards, which are easier for Elasticsearch to manage and relocate.
  - Why it works: Smaller shards are more portable and less impactful if a relocation is needed. It also prevents a single large shard from consuming a disproportionate amount of disk space on any given node.
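To spot oversized shards without reading the whole table, the output can be filtered by size. This is a minimal sketch, assuming the columns `index,shard,prirep,store` and `bytes=b`. The function name `large_shards` is made up, and the 50 GiB figure in the usage line is only an example limit, not a hard rule.

```shell
# Hypothetical helper: print shards whose store size exceeds a byte limit.
# Reads "index shard prirep store_bytes" lines on stdin; $1 is the limit in bytes.
large_shards() {
  awk -v limit="$1" 'NF == 4 && $4 + 0 > limit + 0 { print $1, $2, $3, $4 }'
}

# Usage (53687091200 bytes = 50 GiB):
#   curl -s "localhost:9200/_cat/shards?h=index,shard,prirep,store&bytes=b" \
#     | large_shards 53687091200
```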
- Stale Indices/Data Not Being Deleted: Over time, old indices that are no longer needed can accumulate, consuming disk space.
  - Diagnosis: Use the Cat Indices API: `curl -X GET "localhost:9200/_cat/indices?v&h=index,creation.date.string,health,docs.count,store.size"`. Look for old indices that are no longer accessed.
  - Fix: Implement an index lifecycle management (ILM) policy to automatically delete or move old indices to cheaper storage. Manually delete unneeded indices: `curl -X DELETE "localhost:9200/my-old-index-2022.01.*"`. On Elasticsearch 8.x, wildcard deletes are rejected by default because `action.destructive_requires_name` is `true`, so list the index names explicitly or relax that setting first.
  - Why it works: Regularly cleaning up old data directly frees up disk space, preventing it from contributing to watermark issues.
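Finding deletion candidates by age can be sketched with the `creation.date` column, which reports the creation time in epoch milliseconds. The function name `indices_older_than` is hypothetical. The current time is passed in as an argument so the filter stays easy to test. Review the resulting list before deleting anything.

```shell
# Hypothetical helper: print indices created more than N days ago.
# Reads "index creation_date_millis" lines on stdin.
# $1 = current time in epoch seconds, $2 = age limit in days.
indices_older_than() {
  awk -v now="$1" -v days="$2" \
    'NF == 2 && (now - $2 / 1000) > days * 86400 { print $1 }'
}

# Usage:
#   curl -s "localhost:9200/_cat/indices?h=index,creation.date" \
#     | indices_older_than "$(date +%s)" 90
```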
- Corrupted Shard Data: In rare cases, a shard’s data might become corrupted, reporting an incorrect size or causing allocation issues.
  - Diagnosis: Check the Elasticsearch logs on the affected node for any shard-related errors or corruption warnings. The `_cat/shards` API might also show shards in an `UNASSIGNED` state or with unusual sizes.
  - Fix: If a shard is corrupted and a healthy replica exists, Elasticsearch can recover from the replica. Otherwise, restore the affected index from a snapshot. As a last resort, when neither is available, the `elasticsearch-shard remove-corrupted-data` tool can drop the corrupted parts of the shard on a stopped node, at the cost of losing the documents they contained. A force merge does not repair corruption.
  - Why it works: Removing or repairing corrupted data resolves the incorrect disk usage reporting and allows for proper shard management.
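A quick way to list only the unassigned shards together with the reason Elasticsearch recorded is to filter the Cat Shards output on the `state` column. This is a sketch, assuming the columns `index,shard,prirep,state,unassigned.reason`. The function name `unassigned_shards` is made up. For more detail on a single shard, the Cluster Allocation Explain API (`_cluster/allocation/explain`) is the usual next step.

```shell
# Hypothetical helper: print unassigned shards and their recorded reason.
# Reads "index shard prirep state [unassigned.reason]" lines on stdin.
unassigned_shards() {
  awk '$4 == "UNASSIGNED" { print $1, $2, $3, $5 }'
}

# Usage:
#   curl -s "localhost:9200/_cat/shards?h=index,shard,prirep,state,unassigned.reason" \
#     | unassigned_shards
```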
- Disk Watermark Configuration on Individual Nodes: While cluster-wide settings are common, the watermark settings can also appear in an individual node’s `elasticsearch.yml`. If they are set incorrectly on a specific node, it could cause issues.
  - Diagnosis: Check node-specific settings: `curl -X GET "localhost:9200/_nodes/settings?flat_settings=true&pretty"`. Look for `cluster.routing.allocation.disk.watermark.low` and `cluster.routing.allocation.disk.watermark.high` under the `settings` section for individual nodes.
  - Fix: If node-specific watermarks are present and misconfigured, remove them from that node’s `elasticsearch.yml` and restart the node to fall back to cluster-level settings, or adjust them accordingly. Values set through the cluster settings API (`persistent` or `transient`) take precedence over `elasticsearch.yml`, so you can also override them there.
  - Why it works: This ensures that node-specific overrides don’t interfere with the intended cluster-wide disk management strategy.
After adjusting your watermarks, a cluster that was previously in read-only mode should accept writes again once disk usage on the affected nodes is back below the high watermark. Check the logs and retry a write to confirm.
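If indices stay read-only after disk usage has dropped, the block can be cleared by hand. Elasticsearch 7.4 and later release it automatically once usage falls below the high watermark; older versions do not. The sketch below resets `index.blocks.read_only_allow_delete` through the index settings API. The function name `clear_readonly_block` and the `ES_URL` variable are assumptions made for this example.

```shell
# Base URL is an assumption; override ES_URL for your cluster.
ES_URL="${ES_URL:-http://localhost:9200}"

# Hypothetical helper: remove the flood-stage write block from an index
# pattern. $1 is the index name or pattern (defaults to all indices).
clear_readonly_block() {
  target="${1:-_all}"
  curl -sS -X PUT "$ES_URL/$target/_settings" \
    -H 'Content-Type: application/json' \
    -d '{"index.blocks.read_only_allow_delete": null}'
}

# Usage: clear_readonly_block "logs-*"
```

Setting the value to `null` restores the default instead of pinning it to `false`, so a later flood-stage event can still apply the block.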