Fix Elasticsearch Shards Growing Too Large and Causing OOM (2026)

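Elasticsearch raises a ClusterBlockException when the filesystem holding a data node's data exceeds the cluster.routing.allocation.disk.watermark.flood_stage threshold (95% by default). At that point Elasticsearch places a read_only_allow_delete block on every index that has a shard on the affected node, so new writes are rejected. Shard allocation to the node is restricted earlier, once it passes the low and high watermarks (85% and 90% by default), which is why unassigned shards often accompany the error.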

Common Causes and Fixes

Too many unassigned shards due to failed nodes or network issues:
- Diagnosis: Check the cluster health: GET _cluster/health?pretty. Look for a high number of unassigned_shards. Then, check the logs of the master node and affected data nodes for errors indicating node communication failures or disk issues.
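- Fix: If nodes are down, bring them back online and resolve any network issues between them. Once the nodes have rejoined the cluster, Elasticsearch reallocates their shards automatically. If a shard stays unassigned, GET _cluster/allocation/explain reports the reason. A shard that has exhausted its allocation attempts (index.allocation.max_retries, 5 by default) is not retried on its own; after fixing the underlying cause, run POST _cluster/reroute?retry_failed=true to retry it. Note that retry_failed does not recover shards whose only copies were on a permanently lost node. Those need a snapshot restore or, as a last resort, the reroute commands allocate_stale_primary or allocate_empty_primary, both of which require accept_data_loss.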
- Why it works: Elasticsearch's shard allocation is designed to be resilient. When a node leaves the cluster, the master marks its shards as unassigned, promotes replicas where they exist, and reallocates the missing copies to the remaining nodes. After repeated failures the allocator stops trying a shard; retry_failed resets that state so the shard is evaluated again.
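As a rough illustration of the diagnosis step, the sketch below tallies unassigned shards by index and reason. It assumes the rows come from GET _cat/shards?format=json&h=index,shard,prirep,state,unassigned.reason; the function name and the sample rows are hypothetical, and fetching the rows from a real cluster is left out.

```python
from collections import Counter


def summarize_unassigned(rows):
    """Count unassigned shards per (index, reason) from _cat/shards JSON rows."""
    counts = Counter()
    for row in rows:
        if row.get("state") == "UNASSIGNED":
            # Assigned shards carry no reason; fall back to a placeholder if missing.
            reason = row.get("unassigned.reason") or "UNKNOWN"
            counts[(row["index"], reason)] += 1
    return counts


# Hypothetical sample rows, shaped like the _cat/shards JSON output.
sample_rows = [
    {"index": "logs-1", "shard": "0", "prirep": "p", "state": "STARTED",
     "unassigned.reason": None},
    {"index": "logs-1", "shard": "0", "prirep": "r", "state": "UNASSIGNED",
     "unassigned.reason": "NODE_LEFT"},
    {"index": "logs-1", "shard": "1", "prirep": "r", "state": "UNASSIGNED",
     "unassigned.reason": "NODE_LEFT"},
    {"index": "logs-2", "shard": "0", "prirep": "p", "state": "UNASSIGNED",
     "unassigned.reason": "ALLOCATION_FAILED"},
]

for (index, reason), count in sorted(summarize_unassigned(sample_rows).items()):
    print(f"{index}: {count} unassigned ({reason})")
```

Shards reported as ALLOCATION_FAILED are the ones retry_failed is meant for; NODE_LEFT shards normally recover on their own once the node returns.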
Disk space exhaustion on data nodes:
- Diagnosis: Check disk usage on all data nodes. Use df -h on Linux/macOS or check Disk Management on Windows. Also check Elasticsearch's own view of disk usage: GET _cat/allocation?v. Look for nodes whose disk.percent is near or above the watermarks, and compare disk.indices with disk.used to see how much of the space is taken by Elasticsearch data.
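- Fix: Free up disk space by deleting old indices (e.g., DELETE /my-old-index-*), moving data to larger disks, or archiving data. For example, with daily indices, DELETE /my-index-2023.10.* removes every index from October 2023. Wildcard deletes are rejected while action.destructive_requires_name is true (the default since 8.0), so either list the indices by name or change that setting first. For production, configure index lifecycle management (ILM) to delete or move old indices automatically, for instance 30 days after creation or rollover. Since 7.4 the read_only_allow_delete block is removed automatically once disk usage drops below the high watermark; on older versions clear it with PUT _all/_settings { "index.blocks.read_only_allow_delete": null }.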
- Why it works: The flood-stage block is a safeguard that stops writes before the disk fills up completely. Once disk usage falls back below the watermarks, the block is lifted, writes resume, and shards can be allocated to the node again.
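To make the watermark check concrete, here is a minimal sketch that classifies nodes from the output of GET _cat/allocation?format=json. It assumes the default percentage watermarks (85, 90, 95); clusters configured with absolute byte values would need a different comparison. The function name and sample rows are hypothetical.

```python
def classify_disk_usage(rows, low=85.0, high=90.0, flood=95.0):
    """Map node name -> watermark status from _cat/allocation JSON rows."""
    status = {}
    for row in rows:
        percent = row.get("disk.percent")
        if percent is None:
            # The UNASSIGNED pseudo-row has no disk statistics.
            continue
        used = float(percent)  # _cat JSON values are strings
        if used >= flood:
            status[row["node"]] = "flood_stage"
        elif used >= high:
            status[row["node"]] = "high"
        elif used >= low:
            status[row["node"]] = "low"
        else:
            status[row["node"]] = "ok"
    return status


# Hypothetical sample rows, shaped like the _cat/allocation JSON output.
sample_allocation = [
    {"node": "data-1", "shards": "120", "disk.percent": "96"},
    {"node": "data-2", "shards": "118", "disk.percent": "91"},
    {"node": "data-3", "shards": "40", "disk.percent": "62"},
    {"node": "UNASSIGNED", "shards": "7", "disk.percent": None},
]

for node, state in classify_disk_usage(sample_allocation).items():
    print(f"{node}: {state}")
```

Nodes reported as flood_stage are the ones whose indices carry the write block; nodes at high are already having shards moved away from them.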
Large indices with high document counts and no shard routing:
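- Diagnosis: Identify large indices using GET _cat/indices?v&s=docs.count:desc and look for indices with billions of documents. Then check per-shard sizes: GET _cat/shards?v&h=index,shard,prirep,state,docs,store&s=store:desc (in _cat/shards the columns are named docs and store, not docs.count and store.size). Elastic's general guidance is to keep shards roughly between 10 GB and 50 GB; shards far above that range are slow to recover and relocate.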
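- Fix: Reindex the data into a new index with more primary shards, or split the existing index with the _split API. For a reindex, first create the destination index with the desired index.number_of_shards, then copy the data: POST _reindex { "source": { "index": "old-large-index" }, "dest": { "index": "new-index-with-more-shards" } }. If the destination does not exist, _reindex creates it with default settings (one primary shard since 7.0), which defeats the purpose. Both approaches need enough free disk space to hold a second copy of the data until the old index is deleted, and _split additionally requires the source index to be blocked for writes (index.blocks.write: true).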
- Why it works: Distributing data across more primary shards allows for better parallel processing, more even disk utilization, and prevents any single shard from becoming too large to manage.
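The shard count for the new index can be estimated from the total primary store size. The sketch below is one way to do that arithmetic; the 30 GiB target is an assumption picked from within the commonly cited 10 to 50 GB range, and the function names are hypothetical.

```python
GIB = 1024 ** 3


def primary_shards_needed(primary_store_bytes, target_shard_gib=30):
    """Smallest primary shard count that keeps each shard at or under the target size."""
    target_bytes = target_shard_gib * GIB
    if primary_store_bytes <= 0:
        return 1
    # Integer ceiling division avoids floating-point rounding.
    return (primary_store_bytes + target_bytes - 1) // target_bytes


def create_index_body(primary_store_bytes, replicas=1, target_shard_gib=30):
    """Request body for PUT /<new-index>, sized for the data about to be reindexed."""
    return {
        "settings": {
            "index": {
                "number_of_shards": primary_shards_needed(
                    primary_store_bytes, target_shard_gib
                ),
                "number_of_replicas": replicas,
            }
        }
    }


# Example: an index whose primaries hold 1200 GiB in total.
print(create_index_body(1200 * GIB))
```

The resulting body would be sent when creating the destination index, before running _reindex. If the index is still growing, size for the expected final volume rather than the current one.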
Unoptimized mapping leading to excessive fielddata or doc values:
- Diagnosis: Examine index mappings: GET /my-index/_mapping. Look for text fields with fielddata: true (disabled by default because it loads the field's terms onto the heap) and for dynamic mappings that have created a very large number of fields. GET _cat/fielddata?v shows how much heap fielddata is using per node and field. High heap usage in Elasticsearch can also be a symptom.
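- Fix: Optimize the mappings. Keep text for fields that need full-text search, and use keyword for fields used for exact matches, sorting, or aggregations. For fields that are never searched, set index: false; for fields that are never sorted or aggregated on, set doc_values: false. If fielddata is enabled on a text field, aggregate on a keyword field backed by doc_values (enabled by default for most non-text types) instead. The type of an existing field cannot be changed in place, but a keyword sub-field (named raw here) can be added: PUT /my-index/_mapping { "properties": { "my_field": { "type": "text", "fields": { "raw": { "type": "keyword" } } } } }. Existing documents only populate my_field.raw after they are reindexed, for example with POST /my-index/_update_by_query.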
- Why it works: Inefficient mappings can lead to large amounts of data being loaded into memory (fielddata) or on-disk structures (doc_values), consuming resources and potentially leading to OOM errors or disk pressure.
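Scanning a large mapping by eye is error-prone, so a small script can help. The sketch below walks the response of GET /my-index/_mapping and lists text fields that have fielddata enabled. The function name and the sample mapping are hypothetical; the response shape (index name, then mappings, then properties) follows the typed-less mapping format used since 7.0.

```python
def find_fielddata_fields(properties, prefix=""):
    """Return dotted paths of text fields that have fielddata enabled."""
    hits = []
    for name, spec in properties.items():
        path = f"{prefix}{name}"
        if spec.get("type") == "text" and spec.get("fielddata") is True:
            hits.append(path)
        # Object fields keep children under "properties", multi-fields under "fields".
        for child_key in ("properties", "fields"):
            if child_key in spec:
                hits.extend(find_fielddata_fields(spec[child_key], prefix=f"{path}."))
    return hits


# Hypothetical response of GET /my-index/_mapping.
sample_response = {
    "my-index": {
        "mappings": {
            "properties": {
                "message": {"type": "text", "fielddata": True},
                "status": {"type": "keyword"},
                "user": {
                    "properties": {
                        "name": {
                            "type": "text",
                            "fielddata": True,
                            "fields": {"raw": {"type": "keyword"}},
                        },
                        "id": {"type": "long"},
                    }
                },
            }
        }
    }
}

for index_name, body in sample_response.items():
    fields = find_fielddata_fields(body["mappings"]["properties"])
    print(index_name, fields)
```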
Too many small indices:
- Diagnosis: Check the number of indices: GET _cat/indices?v. A very large number of indices (thousands) can strain the master node and increase overhead for operations.
- Fix: Consolidate small indices into larger ones with the _reindex API, and use ILM rollover so that new indices are created by size or age rather than one per day. For example, to consolidate the daily log indices for October 2023 into a single monthly index: POST _reindex { "source": { "index": "logstash-2023.10.*" }, "dest": { "index": "consolidated-logs-2023.10" } }. Delete the daily indices only after verifying that the document counts match.
- Why it works: Every index and shard carries fixed overhead in cluster state, heap, and file handles. Consolidating reduces the total number of indices and shards, which lightens cluster management operations and can improve search performance.
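When there are many months of daily indices, the reindex requests can be generated rather than written by hand. This sketch groups daily index names into one _reindex body per month. It assumes names of the form logstash-YYYY.MM.DD as in the example above; the function name and the destination prefix are hypothetical.

```python
import re
from collections import defaultdict


def monthly_reindex_plans(index_names, source_prefix="logstash-",
                          dest_prefix="consolidated-logs-"):
    """Group daily indices by month and build one _reindex body per month."""
    pattern = re.compile(re.escape(source_prefix) + r"(\d{4})\.(\d{2})\.(\d{2})")
    groups = defaultdict(list)
    for name in index_names:
        match = pattern.fullmatch(name)
        if match is None:
            continue  # skip indices that do not follow the daily naming scheme
        year, month, _day = match.groups()
        groups[f"{dest_prefix}{year}.{month}"].append(name)
    return {
        dest: {"source": {"index": sorted(sources)}, "dest": {"index": dest}}
        for dest, sources in sorted(groups.items())
    }


# Hypothetical index names, e.g. taken from GET _cat/indices?format=json.
sample_names = [
    "logstash-2023.10.02",
    "logstash-2023.10.01",
    "logstash-2023.11.01",
    ".kibana_1",
]

for dest, body in monthly_reindex_plans(sample_names).items():
    print(dest, body)
```

Each generated body lists its source indices explicitly instead of using a wildcard, which makes it easier to review exactly what will be copied before posting it to _reindex.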
High indexing rate causing temporary disk space spikes:
- Diagnosis: Check indexing pressure with GET _cat/thread_pool/write?v. A rejected count that keeps rising between checks (the counter is cumulative since node start) means the node cannot keep up with the indexing load. Also check GET _nodes/stats/fs and GET _cat/indices?v&s=store.size:desc for indices that are growing rapidly.
- Fix: Increase refresh_interval on indices with high indexing rates. Raising it from the default 1s to 30s or 60s means fewer, larger segments are written, which reduces merge activity and disk I/O. Example: PUT /my-write-heavy-index/_settings { "index" : { "refresh_interval" : "30s" } }. The trade-off is that newly indexed documents can take up to that long to become searchable. Also make sure there is enough disk headroom for merges, which temporarily need space for both the old and the merged segments.
- Why it works: The refresh_interval controls how often newly indexed documents are made searchable, and each refresh writes a new segment. A longer interval produces fewer small segments, so there is less merging and less temporary disk usage during intense indexing periods.
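Because the rejected counter only ever grows, a single reading says little; comparing two readings taken some time apart is more useful. The sketch below does that for rows from GET _cat/thread_pool/write?format=json and also builds the settings body shown above. Function names and sample values are hypothetical.

```python
def rejection_deltas(before, after):
    """Per-node increase in write rejections between two _cat/thread_pool samples."""
    baseline = {row["node_name"]: int(row["rejected"]) for row in before}
    deltas = {}
    for row in after:
        node = row["node_name"]
        current = int(row["rejected"])  # _cat JSON values are strings
        delta = current - baseline.get(node, 0)
        # The counter resets when a node restarts; treat a drop as a fresh count.
        deltas[node] = delta if delta >= 0 else current
    return deltas


def refresh_interval_body(seconds):
    """Body for PUT /<index>/_settings that sets the refresh interval."""
    return {"index": {"refresh_interval": f"{seconds}s"}}


# Hypothetical samples taken a few minutes apart.
sample_before = [
    {"node_name": "data-1", "name": "write", "active": "8", "queue": "150", "rejected": "1200"},
    {"node_name": "data-2", "name": "write", "active": "2", "queue": "0", "rejected": "35"},
]
sample_after = [
    {"node_name": "data-1", "name": "write", "active": "8", "queue": "200", "rejected": "1950"},
    {"node_name": "data-2", "name": "write", "active": "1", "queue": "0", "rejected": "35"},
]

print(rejection_deltas(sample_before, sample_after))
print(refresh_interval_body(30))
```

A node with a large delta is the one under indexing pressure; indices writing to it are the first candidates for a longer refresh_interval.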

The next error you'll likely encounter is a CircuitBreakingException (HTTP 429, "Data too large") if memory usage becomes excessive, or a continued ClusterBlockException if disk space is not reclaimed.