How to copy an Elasticsearch index from the production to a local Docker container

Written by JoliCode / Original link on Sep. 22, 2020

I faced an issue with Elasticsearch last week, and in order to reproduce it, I wanted to have the full index on my development machine.

To do that, I have some options:

Our setup

In production

In development

The reindex API

I will use the reindex API.

This API allows us to copy an index to another index. What is cool is that it also allows us to copy data from a remote cluster.

The syntax is something similar to:

POST _reindex
{
  "source": {
    "remote": {
      "host": "http://otherhost:9200",
      "username": "user",
      "password": "pass"
    },
    "index": "my-index-000001"
  },
  "dest": {
    "index": "my-new-index-000001"
  }
}

We will use the development node to initiate the reindex, which means we will run the HTTP request on the Docker container.

How to expose the production to a local container

Since there are several security layers to pass through, we will use an SSH tunnel to expose the production cluster to the local container.

So we need to:

  1. open an SSH connection to the production cluster: keys HostName and User;
  2. go through the bastion: key ProxyJump;
  3. configure the bastion for the "proxy jump": second part of the config;
  4. bind port 9200 (on the production) to 9201 (on our host): key LocalForward;
  5. bind on 0.0.0.0 on our host, instead of 127.0.0.1, to allow our container to reach the tunnel: key GatewayPorts;
  6. disable TTY because it's not needed: key RequestTTY;
  7. display a nice message when opening the connection: key RemoteCommand;
  8. use a nice name for the connection: main key Host.

All the configurations together:

# .ssh/config
Host project-prod-tunnel-es
    # HostName: the private IP of an Elasticsearch node (placeholder value)
    HostName 10.0.0.1
    ProxyJump project-prod-bastion-1
    User debian
    LocalForward 9201 127.0.0.1:9200
    RequestTTY no
    GatewayPorts true
    RemoteCommand echo "Tunnel opened, you can now curl port 9201" && cat

# The bastion, reachable from the Internet (hostname is a placeholder value)
Host project-prod-bastion-1
    HostName bastion.example.com
    User debian

Note: Our Elasticsearch nodes are not listening on the network interface, but only on the local IP. That's why the LocalForward targets 127.0.0.1 and not the node's public IP.

WARNING: When you use LocalForward this way, you are opening a big security hole into your production cluster: every computer on your network, every container on your computer, and every application will be able to reach production. This risk should be treated very carefully!
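One way to limit that exposure is to bind the forward to the Docker bridge gateway IP only, so that local containers can reach the tunnel but other machines on the network cannot. This is a sketch, not part of the original setup, and assumes the default bridge network:

```
# .ssh/config (alternative): bind only on the Docker bridge gateway,
# so machines on the LAN cannot reach the tunnel. GatewayPorts is not needed.
LocalForward 172.17.0.1:9201 127.0.0.1:9200
```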

You will also need to have your SSH key installed on your servers. And to open the tunnel, you must run the following command:

ssh project-prod-tunnel-es
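Before configuring Elasticsearch, it is worth checking that the tunnel actually forwards traffic. A quick sketch from the host (the port matches the LocalForward above; the exact JSON returned depends on your cluster):

```shell
# From the host: the forwarded port should answer with the cluster info JSON
curl http://127.0.0.1:9201
```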

How to start the reindex

Configuration of the local node

The remote reindex server must be "whitelisted" in your Elasticsearch configuration. The remote host will be your Docker host IP, which is usually the container's gateway. You can find it with the following command:

docker inspect -f '{{range .NetworkSettings.Networks}}{{.Gateway}}{{end}}' <container_id>

Once you get the IP, you must allow it in the configuration:

# /etc/elasticsearch/elasticsearch.yml
# 172.17.0.1 is the usual Docker bridge gateway, replace it with the IP found above
reindex.remote.whitelist: "172.17.0.1:9201"

Don't forget to rebuild & up the container again.
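For example, with docker-compose (the service and container names below are assumptions), followed by a quick check that the container can reach the tunnel through the gateway:

```shell
# Rebuild and restart the Elasticsearch container so it picks up the whitelist
docker-compose up -d --build elasticsearch

# From inside the container, the production cluster should now answer
# (172.17.0.1 is the default bridge gateway, adjust to the IP found above)
docker-compose exec elasticsearch curl http://172.17.0.1:9201
```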

Start the reindex

Once you are done, you can execute the following HTTP request to start the task:

POST _reindex?wait_for_completion=false
{
  "source": {
    "index": "my_index_to_debug",
    "remote": {
      "host": "http://172.17.0.1:9201"
    },
    "size": 10
  },
  "dest": {
    "index": "my_index_to_debug"
  }
}
This request will return a task ID. In my case: fLDgREJ0S46ETKfmPnRtHw:7330
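Because we passed wait_for_completion=false, the reindex runs as a background task. You can follow its progress with the task API, using the ID returned above:

```
GET _tasks/fLDgREJ0S46ETKfmPnRtHw:7330
```

The response contains a status object with the number of documents copied so far, and "completed": true once the copy is finished.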



With very little configuration (one line in the Dockerfile and a few lines in our .ssh/config), we managed to call the production cluster from a development container.

This blog post was about Elasticsearch, but it would be exactly the same for other TCP services, like PostgreSQL, RabbitMQ, or Redis.

And don't forget to close your tunnel once you’re done!

