Elasticsearch’s Cross-Cluster Search (CCS) lets you query data spread across multiple Elasticsearch clusters as if they were one, but the real magic is how it lets you aggregate that data without pulling it all into a single node.
Let’s say you have two clusters: cluster-us in Virginia and cluster-eu in Frankfurt, both ingesting logs from different regions. You want to see the top 10 most frequent errors across both regions in the last hour.
Here’s how you’d set it up and query it:
First, configure cluster-us to know about cluster-eu. On cluster-us, edit elasticsearch.yml (typically in /etc/elasticsearch/) and add the following settings:
    cluster:
      remote:
        cluster-eu:
          seeds:
            - eu-node-1:9300
            - eu-node-2:9300
Replace eu-node-1:9300 and eu-node-2:9300 with the actual transport addresses of nodes in cluster-eu. The 9300 is the default transport port. After saving, restart the Elasticsearch service on cluster-us.
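If a restart is inconvenient, remote clusters can also be registered at runtime through the cluster settings API (PUT _cluster/settings). The sketch below only builds the request body for that call. The helper name remote_cluster_settings is hypothetical, and the seed addresses are the same placeholders used above.

```python
import json


def remote_cluster_settings(alias, seeds):
    """Build a body for PUT _cluster/settings that registers one remote cluster.

    `alias` is the name later used as the index prefix in searches
    (for example "cluster-eu"); `seeds` are transport addresses (host:port)
    of nodes in the remote cluster.
    """
    if not seeds:
        raise ValueError("at least one seed address is required")
    return {
        "persistent": {
            "cluster": {
                "remote": {
                    alias: {"seeds": list(seeds)}
                }
            }
        }
    }


# Placeholder addresses; replace with real transport addresses.
body = remote_cluster_settings("cluster-eu", ["eu-node-1:9300", "eu-node-2:9300"])
print(json.dumps(body, indent=2))
```

Sending the printed JSON as the body of PUT _cluster/settings on cluster-us should have the same effect as the elasticsearch.yml entry, without a restart.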
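Now, from cluster-us, you can query data on cluster-eu. Remote indices are addressed as <cluster alias>:<index>, so cluster-eu:logs-* matches the log indices on cluster-eu. Indices on the local cluster take no prefix, because cluster-us is not registered as a remote of itself. To query both clusters, combine the two in one comma-separated target: logs-*,cluster-eu:logs-*. The logs-* pattern assumes your log indices are named like logs-2023-10-27.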
To get the top 10 errors globally, you’d run this search from cluster-us:
    GET logs-*,cluster-eu:logs-*/_search
    {
      "size": 0,
      "query": {
        "range": {
          "@timestamp": {
            "gte": "now-1h",
            "lte": "now"
          }
        }
      },
      "aggs": {
        "all_errors": {
          "terms": {
            "field": "message.keyword",
            "size": 10
          }
        }
      }
    }
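For scripted use, the same request can be assembled programmatically. The following is a minimal sketch that builds the request path and body with the standard library only. The function name top_terms_request and its parameters are illustrative, not part of any Elasticsearch client. Local indices are written without a prefix and remote ones as alias:pattern.

```python
import json


def top_terms_request(field, size=10, window="now-1h",
                      pattern="logs-*", remotes=("cluster-eu",)):
    """Return (path, body) for a cross-cluster top-N terms search."""
    # Local indices are unprefixed; remote ones are "<alias>:<pattern>".
    targets = [pattern] + [f"{alias}:{pattern}" for alias in remotes]
    path = "/" + ",".join(targets) + "/_search"
    body = {
        "size": 0,  # return no hits, only aggregation results
        "query": {"range": {"@timestamp": {"gte": window, "lte": "now"}}},
        "aggs": {"all_errors": {"terms": {"field": field, "size": size}}},
    }
    return path, body


path, body = top_terms_request("message.keyword")
print("GET", path)
print(json.dumps(body, indent=2))
```

The path and body can then be passed to whatever HTTP client you already use against cluster-us.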
When you execute this from cluster-us, Elasticsearch on cluster-us will:
- Parse the index pattern logs-*,cluster-eu:logs-*.
- Identify which indices belong to cluster-eu and which belong to cluster-us.
- Send the query to the coordinating node of each remote cluster (cluster-eu in this case).
- Execute the query locally for its own cluster (cluster-us).
- Collect the results from all involved clusters.
- Merge the partial aggregation results. Crucially, for aggregations like terms, the remote cluster performs the initial aggregation on its own data, and cluster-us then merges these partial results to produce the final, global top 10. This avoids transferring millions of individual log documents over the network.
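The merge step can be pictured with a small simulation. This is a simplified model, not Elasticsearch's actual reduce code: each cluster contributes a list of (term, doc_count) buckets, and the coordinator sums counts per term and keeps the top entries. The bucket data is made up for illustration.

```python
from collections import Counter


def merge_terms(partials, size):
    """Sum doc counts per term across clusters and keep the top `size` terms.

    `partials` maps a cluster alias to its list of (term, doc_count) buckets.
    Ties are broken alphabetically so the output is deterministic.
    """
    totals = Counter()
    for buckets in partials.values():
        for term, doc_count in buckets:
            totals[term] += doc_count
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:size]


# Hypothetical partial results from each cluster.
partials = {
    "cluster-us": [("timeout", 120), ("disk full", 40), ("auth failed", 15)],
    "cluster-eu": [("timeout", 80), ("auth failed", 60), ("bad gateway", 25)],
}
top = merge_terms(partials, size=3)
print(top)  # [('timeout', 200), ('auth failed', 75), ('disk full', 40)]
```

Only a handful of buckets per cluster cross the network here, regardless of how many log documents produced them.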
This distributed aggregation is the key. Without it, you’d have to manually pull data from each cluster into a central one, which would be slow, network-intensive, and likely hit memory limits on the central cluster. CCS handles the orchestration and merging of aggregation results efficiently.
The cluster.remote setting is only needed on the initiating cluster. The remote cluster doesn’t need to know about the initiating cluster. This simplifies setup; you only configure the direction you need to query.
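One subtlety often missed is that the size parameter in the terms aggregation applies to the final, merged result, not to each cluster's contribution. If cluster-us has 5 unique errors and cluster-eu has 8, and you ask for size: 10, you'll get at most 10 unique errors in the final output, with counts summed for any error that appears in both clusters. Each shard returns more candidates than size (shard_size, which defaults to size * 1.5 + 10), so the remote cluster may send back more than 10 terms; the final merge on the coordinating node caps the list at size. Because each shard reports only its top candidates, counts on high-cardinality fields can be approximate.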
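To see why it matters that each source returns more candidates than the final size, consider this toy model. It is a sketch under simplifying assumptions: each cluster is treated as a single source of term counts, the data is invented, and default_shard_size mirrors the documented default of size * 1.5 + 10 for the terms aggregation.

```python
from collections import Counter


def default_shard_size(size):
    """Documented default number of candidates each shard returns."""
    return int(size * 1.5 + 10)


def top_n(counts, n):
    """Return the n highest-count (term, count) pairs, ties broken by term."""
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


def merged_top(sources, size, candidates):
    """Merge each source's top-`candidates` list and return the global top `size`."""
    totals = Counter()
    for counts in sources:
        for term, count in top_n(counts, candidates):
            totals[term] += count
    return top_n(totals, size)


# Term "c" is third in each cluster but first overall (39 + 39 = 78).
us = {"a": 50, "b": 40, "c": 39}
eu = {"d": 50, "e": 40, "c": 39}

truncated = merged_top([us, eu], size=2, candidates=2)
generous = merged_top([us, eu], size=2, candidates=default_shard_size(2))
print(truncated)  # [('a', 50), ('d', 50)]
print(generous)   # [('c', 78), ('a', 50)]
```

When every source returns only size candidates, the globally most frequent term is lost; with the larger candidate list, the merge recovers it.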
The next step is often to use this cross-cluster query capability to build centralized dashboards in Kibana, where you can visualize aggregated data from multiple distinct Elasticsearch deployments.