Your Elasticsearch cluster is screaming for a reindex, but the thought of downtime makes you break out in a cold sweat.

The most surprising thing about reindexing Elasticsearch data without downtime is that it’s not a single operation, but a carefully orchestrated dance between your existing index and a new one, with a bit of clever alias manipulation.

Let’s see this in action. Imagine you have an index named my-data-v1 and you need to change its mapping or settings in a way Elasticsearch won’t apply in place, such as changing a field’s type or the number of primary shards. Instead of altering my-data-v1, you’ll create a brand new index, say my-data-v2, with the desired schema.

PUT /my-data-v2
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  },
  "mappings": {
    "properties": {
      "title": { "type": "text" },
      "content": { "type": "text" },
      "timestamp": { "type": "date" },
      "new_field": { "type": "keyword" }
    }
  }
}

Now, you need to copy data from my-data-v1 to my-data-v2. This is where the _reindex API shines. You can kick this off as a background task:

POST /_reindex?wait_for_completion=false
{
  "source": {
    "index": "my-data-v1"
  },
  "dest": {
    "index": "my-data-v2"
  }
}
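
Because the request returns immediately with a task ID, progress has to be polled. The sketch below estimates how far along a reindex is from the body returned by GET /_tasks/<task_id>. It assumes the usual response shape, with a top-level completed flag and a task.status object carrying total, created, updated and deleted counters. The function name reindex_progress and the sample response are hypothetical and exist only for illustration.

```python
def reindex_progress(task_response: dict) -> float:
    """Estimate the completed fraction (0.0 to 1.0) of a reindex task.

    `task_response` is the parsed JSON body of GET /_tasks/<task_id>.
    """
    if task_response.get("completed"):
        return 1.0
    status = task_response["task"]["status"]
    total = status.get("total", 0)
    if total == 0:
        return 0.0  # nothing has been counted yet
    # Operations finished so far: documents created, updated, or deleted.
    done = (
        status.get("created", 0)
        + status.get("updated", 0)
        + status.get("deleted", 0)
    )
    return min(done / total, 1.0)


# Hypothetical, trimmed response for a task that is still running.
sample = {
    "completed": False,
    "task": {
        "action": "indices:data/write/reindex",
        "status": {"total": 8000, "created": 2000, "updated": 0, "deleted": 0},
    },
}
print(f"{reindex_progress(sample):.0%}")  # prints 25%
```

In practice you would fetch the response with whatever HTTP client your application already uses and call the function on a timer until it returns 1.0.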

While this is happening, your application keeps reading from my-data-v1, but through an alias rather than the index name. The alias, let’s call it my-data, must already point to my-data-v1 before you start, so your application’s search queries look like this: GET /my-data/_search.

Once the _reindex task is complete (the _reindex call returns a task ID, and you can check its status with GET /_tasks/<task_id>), you atomically swap the alias. This is the magic moment where downtime is avoided.

POST /_aliases
{
  "actions": [
    { "remove": { "index": "my-data-v1", "alias": "my-data" } },
    { "add": { "index": "my-data-v2", "alias": "my-data" } }
  ]
}

Immediately after this alias switch, any new search requests hitting my-data will go to my-data-v2. Your application experiences zero interruption. You can then delete the old index my-data-v1 at your leisure.
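
If you script the switch rather than typing it into a console, it helps to build the request body in one place so the remove and add actions always travel together in a single request. The following is a minimal sketch; the helper name alias_swap_body is hypothetical, and sending the body to POST /_aliases is left to whichever HTTP client you use.

```python
import json


def alias_swap_body(alias: str, old_index: str, new_index: str) -> dict:
    """Build the POST /_aliases body that moves `alias` from old to new."""
    if old_index == new_index:
        raise ValueError("old and new index must differ")
    return {
        "actions": [
            # Both actions are sent in one request so the swap is atomic.
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


body = alias_swap_body("my-data", "my-data-v1", "my-data-v2")
print(json.dumps(body, indent=2))
```

Keeping both actions in one body is the whole point: two separate requests would leave a short window in which my-data points at no index.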

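The core problem this solves is that key index settings and mappings are fixed once an index is created. You can’t change number_of_shards in place or change the type of an existing field, for example turning a text field into a keyword. (Adding a brand-new field to an existing mapping is allowed, but documents that are already indexed won’t have a value for it.) Reindexing allows you to create a new index with the desired configuration and then seamlessly transition your application’s access to it.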

Internally, the _reindex API is essentially a scroll over the source index combined with bulk indexing into the destination. It fetches documents from the source in batches, transforms them if necessary (though in this basic scenario, we’re just copying), and indexes them into the destination index. It works from a snapshot of the source taken when the task starts, so documents written to my-data-v1 afterwards are not copied; if the index is still receiving writes, pause them or replay them against my-data-v2 before the switch. The wait_for_completion=false flag matters for long-running reindexes: the request returns immediately and the copy continues in the background.
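
To make the batch-oriented flow concrete, here is a small in-memory model of the read, transform, write loop. It is a conceptual sketch only, not how Elasticsearch implements _reindex: plain Python lists stand in for the source and destination indices, and the names batches and copy_in_batches, along with the batch size of 1000, are illustrative assumptions.

```python
from itertools import islice
from typing import Callable, Dict, Iterable, Iterator, List, Optional


def batches(docs: Iterable[Dict], size: int) -> Iterator[List[Dict]]:
    """Yield consecutive chunks of at most `size` documents."""
    it = iter(docs)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def copy_in_batches(
    source: Iterable[Dict],
    dest: List[Dict],
    size: int = 1000,
    transform: Optional[Callable[[Dict], Dict]] = None,
) -> int:
    """Copy documents batch by batch; return the number of batches written."""
    written = 0
    for batch in batches(source, size):
        if transform is not None:
            # Transform a copy so the source documents stay untouched.
            batch = [transform(dict(doc)) for doc in batch]
        dest.extend(batch)  # stands in for one bulk request
        written += 1
    return written


source_docs = [{"title": f"Doc {i}"} for i in range(2500)]
dest_docs: List[Dict] = []
print(copy_in_batches(source_docs, dest_docs))  # prints 3
```

The 2,500 sample documents are written in three batches (1000, 1000, 500), which mirrors why a large reindex shows up as many batches in the task status rather than one giant request.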

A common pitfall is forgetting to update all your applications or services to use the alias instead of the direct index name. If any part of your system is still querying my-data-v1 directly, the alias switch won’t help it: it will keep reading stale data from the old index and start failing once that index is deleted. Always ensure your aliases are the single source of truth for your application’s access patterns.

The _reindex API supports transformations, allowing you to modify documents on the fly as they are copied. For example, you could add a new field based on existing data or change data types.

POST /_reindex?wait_for_completion=false
{
  "source": {
    "index": "my-data-v1"
  },
  "dest": {
    "index": "my-data-v2"
  },
  "script": {
    "lang": "painless",
    "source": "if (ctx._source.title != null) { ctx._source.new_field = ctx._source.title.toLowerCase() + '_processed'; }"
  }
}

This allows for much more complex schema evolution and data migration scenarios without needing to reindex, then reprocess, then reindex again.

After successfully reindexing and switching your alias, what remains is the stale, old index. Before you delete it, be certain that all data has been migrated and that no process still references it by name. Once you are, a simple DELETE /my-data-v1 is the final step.
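
One cheap sanity check before deleting is to compare document counts. The sketch below takes the parsed bodies of GET /my-data-v1/_count and GET /my-data-v2/_count, assuming the usual shape with a count field and a _shards summary, and reports whether the new index holds at least as many documents as the old one. The function name migration_looks_complete and the sample responses are hypothetical. A matching count is a necessary signal, not proof that every document is correct, and it is not the right test if your reindex script intentionally drops documents.

```python
def migration_looks_complete(old_count: dict, new_count: dict) -> bool:
    """Compare two _count responses before deleting the old index."""
    for resp in (old_count, new_count):
        # A count is only trustworthy if every shard answered.
        if resp["_shards"]["failed"] > 0:
            return False
    # The new index may already hold extra documents written after
    # the alias switch, so accept "greater than or equal".
    return new_count["count"] >= old_count["count"]


# Hypothetical _count responses for the two indices.
old = {"count": 2500, "_shards": {"total": 3, "successful": 3, "failed": 0}}
new = {"count": 2500, "_shards": {"total": 3, "successful": 3, "failed": 0}}
print(migration_looks_complete(old, new))  # prints True
```

Run the check after the destination index has been refreshed, since documents that were indexed only moments ago may not be counted yet.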
