Method to wait for healthy cluster, that can be called before ES connection exists

This code:

  es = Elasticsearch(["elasticsearch:9200"])
  es.cluster.health(wait_for_status='yellow')

Fails with:

index_test_data_1  | Traceback (most recent call last):
index_test_data_1  |   File "indexer.py", line 74, in <module>
index_test_data_1  |     main()
index_test_data_1  |   File "indexer.py", line 55, in main
index_test_data_1  |     es = init_elasticsearch()
index_test_data_1  |   File "indexer.py", line 34, in init_elasticsearch
index_test_data_1  |     es.cluster.health(wait_for_status='yellow')
index_test_data_1  |   File "/usr/local/lib/python2.7/site-packages/elasticsearch/client/utils.py", line 76, in _wrapped
index_test_data_1  |     return func(*args, params=params, **kwargs)
index_test_data_1  |   File "/usr/local/lib/python2.7/site-packages/elasticsearch/client/cluster.py", line 33, in health
index_test_data_1  |     'health', index), params=params)
index_test_data_1  |   File "/usr/local/lib/python2.7/site-packages/elasticsearch/transport.py", line 314, in perform_request
index_test_data_1  |     status, headers_response, data = connection.perform_request(method, url, params, body, headers=headers, ignore=ignore, timeout=timeout)
index_test_data_1  |   File "/usr/local/lib/python2.7/site-packages/elasticsearch/connection/http_urllib3.py", line 175, in perform_request
index_test_data_1  |     raise ConnectionError('N/A', str(e), e)
index_test_data_1  | elasticsearch.exceptions.ConnectionError: ConnectionError(<urllib3.connection.HTTPConnection object at 0x7fa33e09f110>: Failed to establish a new connection: [Errno 111] Connection refused) caused by: NewConnectionError(<urllib3.connection.HTTPConnection object at 0x7fa33e09f110>: Failed to establish a new connection: [Errno 111] Connection refused)

I use docker-compose to start index_test_data (Python script) and elasticsearch at the same time. The ConnectionError is because nothing is listening on port 9200 yet.

I would like a method like this:

es = Elasticsearch(["elasticsearch:9200"])
es.wait_for_status('yellow')

That will swallow the ConnectionErrors and block until Elasticsearch is up.

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 7 (2 by maintainers)

Top GitHub Comments

24 reactions
fxdgear commented, Apr 25, 2018

Thanks for the issue @melissachang, and I’m sorry to hear you’re having some troubles.

This is something other people have come across as well (please see https://github.com/elastic/elasticsearch-py/issues/715). But building this into the client is not something I feel would add any value. If someone configures their ES instances incorrectly, we’ll end up swallowing errors and nothing will get reported. (Even if we built a timeout into it, we would have to wait until the timeout expires before the application reports that the cluster is not reachable.)

I feel that this is something that should be solved in the application code that implements the Elasticsearch client and not something that should be built into the client itself.

The standard pattern for connecting to any service in docker is to build a “wait” into the application code that waits for a service to start before continuing. Please reference the Docker documentation on controlling startup order.

I have in the past created this bash script which will wait for elasticsearch before starting the command in your container:

#!/bin/bash

set -e

host="$1"
shift
# Drop the optional "--" separator between the host and the command.
if [ "$1" = "--" ]; then
    shift
fi

# First wait for ES to accept connections and answer with HTTP 200...
# (curl prints 000 as the status code while the connection is refused)
until [ "$(curl --write-out '%{http_code}' --silent --output /dev/null "$host")" = "200" ]; do
    >&2 echo "Elasticsearch is unavailable - sleeping"
    sleep 1
done

# ...next wait for the ES status to turn green.
# tr strips whitespace (otherwise we'll have "green ")
until [ "$(curl -fsSL "$host/_cat/health?h=status" | tr -d '[:space:]')" = "green" ]; do
    >&2 echo "Elasticsearch is not green yet - sleeping"
    sleep 1
done

>&2 echo "Elasticsearch is up"
# "$@" keeps each argument of the wrapped command intact.
exec "$@"

You can use this script in your Dockerfile with:

CMD ["/code/wait-for-elasticsearch.sh", "http://elasticsearch:9200", "--", "binary", "command", "sub-command"]

or similarly you can use this script in your docker-compose.yml by overriding the command in a similar fashion.
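For readers who would rather keep the wait inside the Python application, as the comment above suggests, here is a minimal sketch of a retry loop with a deadline. The helper name `wait_until_ready` and its parameters are hypothetical and not part of elasticsearch-py; it only assumes that the readiness check raises an exception while the service is down.

```python
import time


def wait_until_ready(check, timeout=60.0, interval=1.0, retry_on=(Exception,)):
    """Call check() until it stops raising, then return its result.

    Hypothetical helper, not part of elasticsearch-py. Exceptions listed in
    retry_on are swallowed until `timeout` seconds have passed; after that a
    TimeoutError carrying the last error is raised, so a misconfigured host
    is still reported instead of being silently ignored.
    """
    deadline = time.monotonic() + timeout
    attempts = 0
    while True:
        attempts += 1
        try:
            return check()
        except retry_on as exc:
            if time.monotonic() >= deadline:
                raise TimeoutError(
                    "service not ready after %d attempts: %r" % (attempts, exc)
                )
            time.sleep(interval)
```

With the client from the question, this could be called as `wait_until_ready(lambda: es.cluster.health(wait_for_status='yellow'), retry_on=(elasticsearch.exceptions.ConnectionError,))`, using the exception class shown in the traceback. Restricting `retry_on` to connection errors keeps other failures visible immediately.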

7 reactions
Ocramius commented, Feb 11, 2020

Just contributing back here, since I needed a script to monitor elasticsearch startup, but with a timeout:

#!/usr/bin/env php
<?php

declare(strict_types=1);

namespace WaitForElasticsearch;

use InvalidArgumentException;
use UnexpectedValueException;
use function curl_close;
use function curl_errno;
use function curl_error;
use function curl_exec;
use function curl_getinfo;
use function curl_init;
use function curl_setopt;
use function error_log;
use function getenv;
use function is_string;
use function microtime;
use function sprintf;
use function usleep;
use const CURLINFO_HTTP_CODE;
use const CURLOPT_HEADER;
use const CURLOPT_RETURNTRANSFER;
use const CURLOPT_TIMEOUT_MS;
use const CURLOPT_URL;

// Note: this is a dependency-less file that only relies on ext-curl to function. We do not want any dependencies in
//       here, since the system may not yet be in functional state at this stage.
(static function () : void {
    $elasticsearch = getenv('ELASTICSEARCH_URL');

    if (! is_string($elasticsearch)) {
        throw new InvalidArgumentException('Missing "ELASTICSEARCH_URL" environment variable');
    }

    $timeLimit     = (float) (getenv('ELASTICSEARCH_WAIT_TIMEOUT_SECONDS') ?: 60.0);
    $retryInterval = (int) ((((float) getenv('ELASTICSEARCH_RETRY_INTERVAL_SECONDS')) ?: 0.5) * 1000000);
    $start         = microtime(true);
    $elapsedTime   = static function () use ($start) : float {
        return microtime(true) - $start;
    };
    $remainingTime = static function () use ($elapsedTime, $timeLimit) : float {
        return $timeLimit - $elapsedTime();
    };

    while ($remainingTime() > 0) {
        $curl = curl_init();

        // @see https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-health.html
        curl_setopt($curl, CURLOPT_HEADER, 0);
        curl_setopt($curl, CURLOPT_TIMEOUT_MS, 500);
        curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
        curl_setopt(
            $curl,
            CURLOPT_URL,
            $elasticsearch . sprintf('/_cluster/health?wait_for_status=yellow&timeout=%ds', (int) $timeLimit)
        );

        $response = curl_exec($curl);

        $errorCode    = curl_errno($curl);
        $errorMessage = curl_error($curl);
        $statusCode   = curl_getinfo($curl, CURLINFO_HTTP_CODE);

        curl_close($curl);

        if ($errorCode === 0) {
            /** @noinspection ForgottenDebugOutputInspection */
            error_log(sprintf('ElasticSearch connection succeeded after %.2f seconds', $elapsedTime()));

            if ($statusCode === 200) {
                /** @noinspection ForgottenDebugOutputInspection */
                error_log(sprintf(
                    'ElasticSearch status is (at least) yellow after %.2f seconds with response: %s',
                    $elapsedTime(),
                    $response
                ));

                return;
            }

            /** @noinspection ForgottenDebugOutputInspection */
            error_log(sprintf(
                'ElasticSearch status is pending after %.2f seconds with response code %d',
                $elapsedTime(),
                $statusCode
            ));
        }

        if ($errorCode !== 0) {
            /** @noinspection ForgottenDebugOutputInspection */
            error_log(sprintf(
                'Failed to contact ElasticSearch: curl error "%s", code %d, retrying for another %.2f seconds',
                $errorMessage,
                $errorCode,
                $remainingTime()
            ));
        }

        usleep($retryInterval);
    }

    throw new UnexpectedValueException(sprintf('Failed to connect to Elasticsearch after %.2f seconds', $elapsedTime()));
})();

Feel free to grab, butcher or burn it with fire 😃
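The same polling idea can be sketched in Python with only the standard library, which may be convenient when the waiting process is the Python indexer from the question. The function names `health_url` and `wait_for_cluster` are hypothetical; the sketch assumes, as the PHP script does, that a 200 response from `/_cluster/health?wait_for_status=...` means the requested status was reached, and it additionally checks the `timed_out` field of the response body.

```python
import json
import time
import urllib.error
import urllib.request


def health_url(base_url, status="yellow", server_timeout_s=1):
    """Build the cluster health URL that asks ES to wait for `status`."""
    return "%s/_cluster/health?wait_for_status=%s&timeout=%ds" % (
        base_url.rstrip("/"), status, server_timeout_s)


def wait_for_cluster(base_url, status="yellow", time_limit=60.0, retry_interval=0.5):
    """Poll the health endpoint until it reports `status` or time runs out.

    Hypothetical helper. Returns the parsed health document on success and
    raises TimeoutError (with the last error seen) once time_limit expires.
    """
    deadline = time.monotonic() + time_limit
    url = health_url(base_url, status)
    last_error = None
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=2.0) as response:
                body = json.loads(response.read().decode("utf-8"))
                if response.status == 200 and not body.get("timed_out"):
                    return body
                last_error = "status not reached yet: %r" % (body,)
        except (urllib.error.URLError, OSError) as exc:
            # Connection refused, socket timeouts and non-2xx answers
            # (urllib raises HTTPError for those) all end up here.
            last_error = exc
        time.sleep(retry_interval)
    raise TimeoutError(
        "Elasticsearch not ready after %.1f seconds: %r" % (time_limit, last_error)
    )
```

A call such as `wait_for_cluster("http://elasticsearch:9200")` before constructing the client would block for at most `time_limit` seconds and then fail loudly, which addresses the concern about silently swallowed errors.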
