The `_cat/indices` API is failing because the Elasticsearch cluster cannot find the index you’re asking about, or the index is in a state where it’s not yet visible.
Here’s what’s actually broken: a shard for the requested index is either missing or inaccessible to the coordinating node, which prevents the `_cat/indices` API from gathering the metadata it needs. This usually points to a problem with shard allocation, disk space, or network connectivity between nodes.
Common Causes and Fixes:
- Index Not Yet Created or Typo in Name:
  - Diagnosis: Double-check the exact spelling of your index name. Use `GET _cat/indices` to list all available indices and verify.
  - Fix: Ensure you’re using the correct index name. If the index hasn’t been created yet, run the creation command, e.g., `PUT my-new-index`.
  - Why it works: The `_cat/indices` API queries for specific index metadata. If the index doesn’t exist, the metadata can’t be found.
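The name check above can be partly automated. The following is a minimal sketch, assuming you have already fetched `GET _cat/indices?format=json` and parsed it into a list of dicts. The function name `suggest_index_names` and the sample rows are hypothetical and only serve as illustration.

```python
import difflib


def suggest_index_names(requested, cat_indices_rows, limit=3):
    """Return existing index names that closely resemble `requested`.

    `cat_indices_rows` is the parsed JSON from GET _cat/indices?format=json,
    i.e. a list of dicts that each carry an "index" key.
    """
    existing = [row["index"] for row in cat_indices_rows]
    if requested in existing:
        return [requested]  # exact match, nothing to correct
    # 0.6 is difflib's default similarity cutoff; tune it to taste.
    return difflib.get_close_matches(requested, existing, n=limit, cutoff=0.6)


# Made-up sample rows standing in for a real _cat/indices response.
rows = [{"index": "my-new-index"}, {"index": "logs-2024"}]
print(suggest_index_names("my-new-indx", rows))
```

An empty result suggests the index does not exist under any similar name and probably still needs to be created.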
- Disk Space Full on Data Nodes:
  - Diagnosis: Check disk usage on your data nodes. Run `GET _cat/allocation?v` and look for `disk.percent` values approaching 90% or higher. Also check the `shards` and `disk.indices` columns to see which nodes hold the most shards and the most index data.
  - Fix: Free up space on the affected data nodes by deleting old indices or increasing disk capacity. If you need to temporarily allow allocation despite nearly full disks, you can raise the `cluster.routing.allocation.disk.watermark.low` and `cluster.routing.allocation.disk.watermark.high` settings, but this is a temporary workaround. A more robust solution is to increase disk space or implement a lifecycle management policy.
  - Why it works: Elasticsearch stops allocating new shards to a node once it crosses the low watermark (85% by default), moves shards away from it above the high watermark (90%), and blocks writes to indices on it at the flood-stage watermark (95%). Shards that cannot be placed on any node stay `UNASSIGNED`, which protects the data already on disk.
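To make the disk check repeatable, here is a small sketch that flags nodes at or above a usage threshold. It assumes the input is the parsed JSON from `GET _cat/allocation?format=json`, where numeric values arrive as strings. The function name, the default threshold of 85 (chosen to mirror the default low watermark), and the sample rows are all illustrative assumptions.

```python
def nodes_over_threshold(allocation_rows, threshold_percent=85):
    """Return (node, percent) pairs at or above the threshold, fullest first."""
    flagged = []
    for row in allocation_rows:
        percent = row.get("disk.percent")
        if percent is None:
            # The UNASSIGNED pseudo-row carries no disk statistics.
            continue
        if int(percent) >= threshold_percent:
            flagged.append((row["node"], int(percent)))
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


# Made-up sample rows standing in for a real _cat/allocation response.
sample_allocation = [
    {"node": "data-1", "disk.percent": "92"},
    {"node": "data-2", "disk.percent": "41"},
    {"node": "UNASSIGNED", "disk.percent": None},
]
print(nodes_over_threshold(sample_allocation))
```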
- Shard Allocation Issues (Unassigned Shards):
  - Diagnosis: Use `GET _cat/shards?h=index,shard,prirep,state,unassigned.reason` and filter for the `UNASSIGNED` state. The `unassigned.reason` column tells you why. Common reasons include `ALLOCATION_FAILED` and `NODE_LEFT`. For more detail, `GET _cluster/allocation/explain` may report `no_valid_shard_copy` when no usable copy of a shard can be found.
  - Fix:
    - `NO_VALID_SHARD_COPY` / `ALLOCATION_FAILED`: Often due to disk issues (see the disk space cause above) or node instability. Restarting the node or restoring disk health can help. If a node is permanently offline, Elasticsearch has to rebuild its shards from the remaining replicas.
    - `NODE_LEFT`: If a node has left the cluster and hasn’t rejoined, Elasticsearch will try to reallocate its shards. Ensure the node is back online or that the cluster has enough capacity to handle the loss.
    - Explicitly enable allocation: If allocation was disabled for maintenance, re-enable it with `PUT _cluster/settings {"persistent": {"cluster.routing.allocation.enable": "all"}}`.
  - Why it works: Shards must be allocated to a node for the index to be considered active and visible. Unassigned shards mean the cluster is still trying, or is unable, to place them.
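When many shards are unassigned, grouping them by reason makes the pattern easier to see. The sketch below assumes the input is the parsed JSON from `GET _cat/shards?format=json&h=index,shard,prirep,state,unassigned.reason`. The function name and the sample rows are hypothetical.

```python
from collections import defaultdict


def unassigned_by_reason(shard_rows):
    """Group UNASSIGNED shards by their unassigned.reason value."""
    grouped = defaultdict(list)
    for row in shard_rows:
        if row.get("state") != "UNASSIGNED":
            continue  # only unassigned shards are of interest here
        reason = row.get("unassigned.reason") or "UNKNOWN"
        kind = "primary" if row.get("prirep") == "p" else "replica"
        grouped[reason].append(f'{row["index"]}[{row["shard"]}] {kind}')
    return dict(grouped)


# Made-up sample rows standing in for a real _cat/shards response.
sample_shards = [
    {"index": "logs", "shard": "0", "prirep": "p", "state": "STARTED",
     "unassigned.reason": None},
    {"index": "logs", "shard": "0", "prirep": "r", "state": "UNASSIGNED",
     "unassigned.reason": "NODE_LEFT"},
    {"index": "metrics", "shard": "1", "prirep": "p", "state": "UNASSIGNED",
     "unassigned.reason": "ALLOCATION_FAILED"},
]
for reason, shards in unassigned_by_reason(sample_shards).items():
    print(reason, shards)
```

Unassigned primaries deserve attention first, since an index with a missing primary is the one that turns the cluster `red`.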
- Network Partition or Node Unreachability:
  - Diagnosis: Check cluster health with `GET _cluster/health`. Look for a `status` of `red` (anything other than `green` or `yellow`) or a `number_of_nodes` lower than you expect. Examine logs on your Elasticsearch nodes for network-related errors (e.g., `connection refused`, `timed out`). Use `ping` or `traceroute` from one node to another to test connectivity.
  - Fix: Resolve network issues. This could involve firewall rules, DNS resolution problems, or physical network hardware. Ensure all nodes in the cluster can communicate with each other on the transport layer (default port 9300).
  - Why it works: Nodes need to communicate to share shard status and manage cluster state. If nodes can’t reach each other, shards might appear missing or be marked as unavailable.
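`ping` only proves that a host answers ICMP, not that the transport port accepts connections. As a rough complement, the sketch below attempts a plain TCP connection to the transport port. It assumes the default port 9300 mentioned above. The function name is hypothetical, and a successful connect says nothing about TLS or whether the peer actually belongs to your cluster.

```python
import socket


def transport_port_reachable(host, port=9300, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

Run from one node against another, for example `transport_port_reachable("es-node-2")` with a placeholder hostname, a `False` result points at firewall, DNS, or routing problems rather than at Elasticsearch itself.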
- Corrupted Index or Shard Data:
  - Diagnosis: Check `GET _cat/indices?v` and look for indices whose `health` is `red` or whose `status` is anything other than `open`. Examine Elasticsearch logs for exceptions related to shard corruption (e.g., `corrupt index`, `IOError`, `segment corruption`).
  - Fix: If corruption is detected, you’ll likely need to restore from a snapshot. If you don’t have a snapshot, you might have to delete the index and recreate it, accepting data loss. For minor corruption that prevents recovery, you might try to force shard recovery if possible, but this is advanced and risky.
  - Why it works: Corrupted shard files mean Elasticsearch cannot read or write data to that shard, making it unavailable for the index.
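A quick filter over the index listing narrows down which indices to look for in the logs. This sketch assumes the input is the parsed JSON from `GET _cat/indices?format=json`. The function name and sample rows are hypothetical, and a flagged index is only a candidate: `red` health has other causes than corruption.

```python
def suspect_indices(cat_indices_rows):
    """Return (index, health, status) for indices that are red or not open."""
    suspects = []
    for row in cat_indices_rows:
        health = row.get("health")
        status = row.get("status")
        if health == "red" or status != "open":
            suspects.append((row["index"], health, status))
    return suspects


# Made-up sample rows standing in for a real _cat/indices response.
sample_indices = [
    {"index": "orders", "health": "green", "status": "open"},
    {"index": "logs-old", "health": "red", "status": "open"},
    {"index": "archive", "health": "green", "status": "close"},
]
print(suspect_indices(sample_indices))
```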
- Master Node Instability or Unavailability:
  - Diagnosis: Check `GET _cluster/health`. If the `status` is `red` and `unassigned_shards` is high, it can indicate master issues. Look for logs on master-eligible nodes indicating they are not being elected or are losing quorum.
  - Fix: Ensure you have a stable quorum of master-eligible nodes. Restart master-eligible nodes one by one if necessary, ensuring they can re-establish a quorum. Check the `discovery.seed_hosts` configuration and, when bootstrapping a brand-new cluster, `cluster.initial_master_nodes`.
  - Why it works: The master node is responsible for managing cluster state, including shard allocation. If the master is unstable or unavailable, shard allocation decisions cannot be made, leading to unassigned shards and inaccessible indices.
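The quorum requirement is simple arithmetic: more than half of the master-eligible nodes must be able to reach each other. The sketch below works that out from the parsed JSON of `GET _cat/nodes?format=json&h=name,node.role,master`, where an `m` in `node.role` marks a master-eligible node and `*` in `master` marks the elected one. The function name, the `expected_master_eligible` parameter (how many master-eligible nodes you configured), and the sample rows are assumptions for illustration.

```python
def master_summary(cat_nodes_rows, expected_master_eligible):
    """Summarise master eligibility and quorum from _cat/nodes rows."""
    eligible = [
        row["name"] for row in cat_nodes_rows
        if "m" in (row.get("node.role") or "")
    ]
    elected = [row["name"] for row in cat_nodes_rows if row.get("master") == "*"]
    quorum_size = expected_master_eligible // 2 + 1  # strict majority
    return {
        "visible_master_eligible": eligible,
        "elected_master": elected[0] if elected else None,
        "quorum_size": quorum_size,
        "has_quorum": len(eligible) >= quorum_size,
    }


# Made-up sample: one of three configured master-eligible nodes is missing.
sample_nodes = [
    {"name": "es-m1", "node.role": "m", "master": "*"},
    {"name": "es-m2", "node.role": "m", "master": "-"},
    {"name": "es-d1", "node.role": "di", "master": "-"},
]
print(master_summary(sample_nodes, expected_master_eligible=3))
```

With three configured master-eligible nodes the cluster tolerates the loss of one. Losing a second leaves it unable to elect a master until a node returns.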
After resolving these issues, you might still encounter `IndexNotFoundException` if the index was truly deleted or if you’re querying a different cluster.