An Elasticsearch circuit breaker exception means that an operation was estimated to need more JVM heap than a configured limit allows, so the circuit breaker rejected it rather than risk the node running out of memory and crashing.

Cause 1: Large or Complex Queries

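Diagnosis: Elasticsearch logs will show a circuit breaker exception ("Data too large") naming the fielddata or request breaker. GET _nodes/stats/breaker shows each breaker's limit, estimated size, and trip count, and GET _nodes/stats/jvm shows overall heap usage. A rising trip count together with high heap usage is the indicator.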
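As a rough illustration of reading breaker statistics, the sketch below scans an already-fetched GET _nodes/stats/breaker response (parsed into a dict) and lists breakers that have tripped or are close to their limit. The function name and the 80% threshold are assumptions made for this example, not Elasticsearch defaults.

```python
def breakers_under_pressure(stats, threshold=0.8):
    """List (node, breaker, used_fraction, tripped) for breakers that have
    tripped at least once or whose estimate is at/above `threshold` of the limit."""
    findings = []
    for node in stats.get("nodes", {}).values():
        for name, breaker in node.get("breakers", {}).items():
            limit = breaker.get("limit_size_in_bytes", 0)
            used = breaker.get("estimated_size_in_bytes", 0)
            fraction = used / limit if limit > 0 else 0.0
            tripped = breaker.get("tripped", 0)
            if tripped > 0 or fraction >= threshold:
                findings.append((node.get("name", "?"), name, round(fraction, 2), tripped))
    return findings


# Hand-written sample in the shape of a _nodes/stats/breaker response.
sample_breaker_stats = {
    "nodes": {
        "abc123": {
            "name": "node-1",
            "breakers": {
                "fielddata": {"limit_size_in_bytes": 400, "estimated_size_in_bytes": 100, "tripped": 3},
                "request": {"limit_size_in_bytes": 600, "estimated_size_in_bytes": 10, "tripped": 0},
                "parent": {"limit_size_in_bytes": 1000, "estimated_size_in_bytes": 900, "tripped": 0},
            },
        }
    }
}

for finding in breakers_under_pressure(sample_breaker_stats):
    print(finding)
```

In this sample the fielddata breaker is reported because it has tripped, and the parent breaker because it sits at 90% of its limit.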

Fix:

  1. For the fielddata breaker: Fielddata is disabled on text fields by default, so look for fields where it was explicitly enabled and turn it off in the index mapping:
    PUT /my_index/_mapping
    {
      "properties": {
        "my_text_field": {
          "type": "text",
          "fielddata": false
        }
      }
    }
    
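    This prevents Elasticsearch from loading field data into heap memory, which is often the culprit when sorting or aggregating on high-cardinality text fields. For sorting and aggregations, use a keyword field (or a keyword sub-field of the text field) instead, which relies on on-disk doc values rather than heap.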
  2. For the request breaker: Optimize your queries. This breaker tracks per-request data structures such as aggregation buckets, so reduce aggregations that produce very many buckets. Avoid fetching very large result sets in a single request: use the scroll API for bulk exports or search_after for live deep pagination. If you don’t need the full document, limit the fields returned with _source filtering:
    GET /my_index/_search
    {
      "_source": ["field1", "field2"],
      "query": {
        "match_all": {}
      }
    }
    
    This reduces the amount of data that needs to be processed and held in memory for the request.

Why it works: Fielddata and request payloads can consume significant heap. Disabling fielddata or optimizing queries reduces the memory footprint of these operations.
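To make the search_after suggestion concrete, here is a minimal sketch that builds the request body for each page from the sort values of the last hit on the previous page. It only constructs bodies and does not talk to a cluster; the field names timestamp and id and the helper name next_page_body are hypothetical.

```python
import copy


def next_page_body(base_body, last_hit=None):
    """Return a search body for the next page.

    search_after needs a deterministic sort, so the base body must define one.
    `last_hit` is the final hit of the previous page (None for the first page).
    """
    if "sort" not in base_body:
        raise ValueError("search_after requires an explicit sort in the request body")
    body = copy.deepcopy(base_body)  # never mutate the caller's template
    if last_hit is not None:
        body["search_after"] = last_hit["sort"]
    return body


# Hypothetical query template: small pages, trimmed _source, stable sort with a tiebreaker.
base = {
    "size": 1000,
    "_source": ["field1", "field2"],
    "query": {"match_all": {}},
    "sort": [{"timestamp": "asc"}, {"id": "asc"}],
}

first_page = next_page_body(base)
# Pretend this was the last hit returned for the first page.
last_hit = {"_id": "doc-42", "sort": [1700000000000, "doc-42"]}
second_page = next_page_body(base, last_hit)
print(second_page["search_after"])
```

Each page is an independent, bounded request, so no single search has to hold a very large result set in memory.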

Cause 2: Excessive Shard Size

Diagnosis: Check shard sizes using GET _cat/shards?v. Large shards (over 50GB is a common threshold) can lead to increased memory pressure, especially during rebalancing or recovery.
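A small sketch of how the shard listing could be filtered automatically. It assumes the rows come from GET _cat/shards?format=json&bytes=b, so that store is a byte count encoded as a string; the function name and the 50GB default mirror the guideline above and are illustrative only.

```python
def oversized_shards(cat_shards, max_bytes=50 * 1024**3):
    """Return (index, shard, prirep, size_bytes) for shards larger than max_bytes,
    largest first. Rows without a store size (e.g. unassigned shards) are skipped."""
    result = []
    for row in cat_shards:
        store = row.get("store")
        if store is None:
            continue
        size = int(store)
        if size > max_bytes:
            result.append((row["index"], row["shard"], row["prirep"], size))
    return sorted(result, key=lambda r: r[3], reverse=True)


GIB = 1024**3
# Hand-written sample rows in the shape of the _cat/shards JSON output.
sample_shards = [
    {"index": "logs-2024", "shard": "0", "prirep": "p", "state": "STARTED", "store": str(80 * GIB), "node": "node-1"},
    {"index": "logs-2024", "shard": "1", "prirep": "p", "state": "STARTED", "store": str(20 * GIB), "node": "node-2"},
    {"index": "metrics", "shard": "0", "prirep": "r", "state": "UNASSIGNED", "store": None, "node": None},
]

for index, shard, prirep, size in oversized_shards(sample_shards):
    print(f"{index}[{shard}] ({prirep}) is {size / GIB:.0f} GiB")
```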

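Fix: Increase the number of primary shards for your index. This distributes the data across more, smaller shards, reducing the memory needed per shard. Because number_of_shards is a static setting that cannot be changed on an existing index, create a new index with the desired shard count (my_index_v2 is a placeholder name) and reindex into it:

PUT /my_index_v2
{
  "settings": {
    "number_of_shards": 6,
    "number_of_replicas": 1
  }
}

POST /_reindex
{
  "source": { "index": "my_index" },
  "dest": { "index": "my_index_v2" }
}

(Note: For indices created in the future, set number_of_shards in the index template instead. The _split API can be an alternative to reindexing when the new shard count is a multiple of the old one.)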

Why it works: Smaller shards require less memory to manage and process, reducing the likelihood of hitting circuit breaker limits during operations that involve many shards.
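The arithmetic behind choosing a shard count can be sketched as follows. The 30GB target shard size is an assumption chosen for illustration; pick a target that fits your own workload.

```python
import math


def primary_shards_for(total_primary_bytes, target_shard_bytes=30 * 1024**3):
    """Smallest number of primary shards that keeps each shard at or below
    the target size, assuming data is spread evenly across shards."""
    if total_primary_bytes <= 0:
        return 1
    return max(1, math.ceil(total_primary_bytes / target_shard_bytes))


GIB = 1024**3
print(primary_shards_for(180 * GIB))  # 180 GiB of primary data at 30 GiB per shard
```

With 180 GiB of primary data and a 30 GiB target, this yields six primary shards.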

Cause 3: Insufficient JVM Heap Size

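Diagnosis: Monitor JVM heap usage via GET _nodes/stats/jvm. If heap_used_percent stays close to 100% even after garbage collection, the heap is too small. Also check the configured heap size: the -Xms and -Xmx values in jvm.options or in the ES_JAVA_OPTS environment variable (very old versions used ES_HEAP_SIZE instead).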
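As an illustration, the following sketch picks out nodes with high heap usage from an already-fetched GET _nodes/stats/jvm response. The 85% threshold and the function name are assumptions for this example.

```python
def nodes_with_high_heap(stats, threshold_percent=85):
    """Return sorted (node_name, heap_used_percent) pairs at or above the threshold."""
    flagged = []
    for node in stats.get("nodes", {}).values():
        used = node["jvm"]["mem"]["heap_used_percent"]
        if used >= threshold_percent:
            flagged.append((node.get("name", "?"), used))
    return sorted(flagged)


# Hand-written sample in the shape of a _nodes/stats/jvm response.
sample_jvm_stats = {
    "nodes": {
        "abc123": {"name": "node-1", "jvm": {"mem": {"heap_used_percent": 92}}},
        "def456": {"name": "node-2", "jvm": {"mem": {"heap_used_percent": 40}}},
    }
}

print(nodes_with_high_heap(sample_jvm_stats))
```

A single reading can be misleading because heap usage drops after each garbage collection, so sample it repeatedly before drawing conclusions.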

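Fix: Increase the JVM heap size. For production, the usual recommendation is to set it to no more than 50% of your system RAM, leaving the rest for the filesystem cache, and to stay below roughly 30-32GB so the JVM can keep using compressed object pointers. Set the minimum and maximum to the same value in jvm.options (usually config/jvm.options, or on recent versions a custom file ending in .options under config/jvm.options.d/):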

-Xms4g
-Xmx4g

(This example sets heap to 4GB. Restart Elasticsearch for changes to take effect.)

Why it works: A larger heap provides more memory for Elasticsearch to operate, reducing the chance of operations exceeding available memory and triggering circuit breakers.
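The sizing rule can be expressed as a short calculation. This sketch takes half of system RAM and caps it at 31GB, an assumed value within the 30-32GB range mentioned above; the helper names are hypothetical.

```python
def recommended_heap_gb(system_ram_gb, cap_gb=31):
    """Half of system RAM, capped at `cap_gb`, and never below 1 GB."""
    return max(1, min(system_ram_gb // 2, cap_gb))


def jvm_options_lines(heap_gb):
    """Matching -Xms/-Xmx lines so the heap does not resize at runtime."""
    return [f"-Xms{heap_gb}g", f"-Xmx{heap_gb}g"]


for ram in (8, 64, 128):
    heap = recommended_heap_gb(ram)
    print(ram, "GB RAM ->", " ".join(jvm_options_lines(heap)))
```

For an 8GB machine this produces the same -Xms4g and -Xmx4g lines as the example above.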

Cause 4: Too Many Open File Descriptors

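Diagnosis: Check the number of open file descriptors for the Elasticsearch process. GET _nodes/stats/process reports open_file_descriptors and max_file_descriptors for each node, or you can count them directly on Linux (lsof -p also lists memory-mapped files and so tends to overcount):

sudo ls /proc/<elasticsearch_pid>/fd | wc -l

If this number is very high and approaching the process limit, it can indirectly lead to memory issues, as file handles consume resources.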
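The same check can be done from the cluster's own statistics. This sketch reads an already-fetched GET _nodes/stats/process response and flags nodes whose open descriptors are near their maximum; the 80% warning level and the function name are assumptions for this example.

```python
def fd_pressure(stats, warn_fraction=0.8):
    """Return (node_name, open_fds, max_fds, fraction) for nodes at or above
    warn_fraction of their file descriptor limit."""
    flagged = []
    for node in stats.get("nodes", {}).values():
        process = node.get("process", {})
        open_fds = process.get("open_file_descriptors", 0)
        max_fds = process.get("max_file_descriptors", 0)
        if max_fds <= 0:  # limit not reported on this platform
            continue
        fraction = open_fds / max_fds
        if fraction >= warn_fraction:
            flagged.append((node.get("name", "?"), open_fds, max_fds, round(fraction, 2)))
    return flagged


# Hand-written sample in the shape of a _nodes/stats/process response.
sample_process_stats = {
    "nodes": {
        "abc123": {"name": "node-1", "process": {"open_file_descriptors": 60000, "max_file_descriptors": 65536}},
        "def456": {"name": "node-2", "process": {"open_file_descriptors": 1200, "max_file_descriptors": 65536}},
    }
}

print(fd_pressure(sample_process_stats))
```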

Fix: Increase the nofile limit for the Elasticsearch user. Edit /etc/security/limits.conf:

elasticsearch  soft  nofile  65536
elasticsearch  hard  nofile  65536

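Restart Elasticsearch. On systemd-managed installs (such as the DEB and RPM packages), limits.conf is not applied to the service; set LimitNOFILE in a systemd override for the Elasticsearch service instead.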

Why it works: Elasticsearch uses file descriptors for segments, network connections, and other resources. A sufficient limit prevents resource exhaustion that could indirectly impact memory management.

Cause 5: Slow Disk I/O

Diagnosis: Monitor disk I/O performance. High I/O wait times can cause operations to block, accumulating in memory and eventually tripping circuit breakers. Use iostat -xz 1 or cloud provider monitoring.

Fix: Upgrade to faster storage (e.g., SSDs) or optimize your I/O patterns by reducing the number of shards per node or improving indexing strategies. For example, if you’re indexing very rapidly, consider using a dedicated indexing node.

Why it works: Faster disk I/O allows Elasticsearch to flush data and retrieve it more quickly, reducing the time data spends in memory waiting for disk operations.

Cause 6: Too Many Indices or Shards Per Node

Diagnosis: Check the number of indices and shards per node with GET _cat/indices?v and GET _cat/shards?v (GET _cat/allocation?v gives a per-node shard count directly). The default cluster.max_shards_per_node limit is 1000, and a much lower count is better for performance.
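Counting shards per node from the shard listing can be sketched as below. It assumes rows from GET _cat/shards?format=json and counts only shards in the STARTED state, since relocating or unassigned rows do not map cleanly to a single node; the function name is hypothetical.

```python
from collections import Counter


def started_shards_per_node(cat_shards):
    """Count STARTED shards (primaries and replicas) on each node."""
    counts = Counter(
        row["node"]
        for row in cat_shards
        if row.get("state") == "STARTED" and row.get("node")
    )
    return dict(counts)


# Hand-written sample rows in the shape of the _cat/shards JSON output.
sample_rows = [
    {"index": "logs-a", "shard": "0", "prirep": "p", "state": "STARTED", "node": "node-1"},
    {"index": "logs-a", "shard": "0", "prirep": "r", "state": "STARTED", "node": "node-2"},
    {"index": "logs-b", "shard": "0", "prirep": "p", "state": "STARTED", "node": "node-1"},
    {"index": "logs-b", "shard": "0", "prirep": "r", "state": "UNASSIGNED", "node": None},
]

print(started_shards_per_node(sample_rows))
```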

Fix: Consolidate indices. If you have many time-based indices that are no longer actively written to, consider using Index Lifecycle Management (ILM) to shrink or force-merge them and to delete old data. For example, you can use the _shrink API to reduce the shard count of older indices. The source index must first be made read-only and have a copy of every shard on a single node.

POST /my_old_index/_shrink/my_shrunk_index
{
  "settings": {
    "index": {
      "number_of_shards": 1,
      "codec": "best_compression"
    }
  }
}

Why it works: Each shard consumes memory for its state and data structures. Reducing the number of shards per node lowers the overall memory overhead of managing these shards.
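When planning a shrink, note that the target shard count must be a factor of the source index's primary shard count. The helper below (a hypothetical name) lists the valid targets for a given source count.

```python
def valid_shrink_targets(source_shards):
    """Shard counts smaller than the source that divide it evenly."""
    return [n for n in range(1, source_shards) if source_shards % n == 0]


print(valid_shrink_targets(6))
print(valid_shrink_targets(8))
```

An index with a prime number of primary shards can therefore only be shrunk to a single shard.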

The next error you might encounter is OutOfMemoryError: Java heap space if the JVM heap is too small even after addressing circuit breaker issues.
