Transport response handler not found
CrateDB version: 4.1.2
Environment description:
- CentOS 7 (latest), OpenJDK 11.0.7
- Data nodes: 8 CPU, 64 GB RAM
- Node makeup: 48 data, 3 master, 2 ingest, 2 query
- The 48 data nodes span 2 availability zones (24 per zone)
Problem description: Our cluster health occasionally gets stuck in yellow, and we have to restart crate on the affected nodes for the health to return to green. We have a Nagios check that runs an ALTER CLUSTER statement, which usually resolves the problem; however, some cases require manual intervention.
We typically see shards stay unassigned until we run ALTER CLUSTER REROUTE RETRY FAILED. Some logs from a related issue #9748:
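The automated remediation described above can be sketched roughly as follows. This is not the reporters' actual Nagios plugin; it is a minimal illustration assuming CrateDB's HTTP `_sql` endpoint (default port 4200). The host, port, and threshold logic are hypothetical placeholders.

```python
# Sketch of a Nagios-style check that issues the retry statement when
# shards are stuck unassigned. Assumes CrateDB's HTTP `_sql` endpoint;
# the URL below is a hypothetical placeholder.
import json
import urllib.request

CRATE_URL = "http://localhost:4200/_sql"  # hypothetical endpoint location


def build_sql_request(stmt: str) -> urllib.request.Request:
    """Wrap a SQL statement in the JSON body the `_sql` endpoint expects."""
    body = json.dumps({"stmt": stmt}).encode("utf-8")
    return urllib.request.Request(
        CRATE_URL, data=body, headers={"Content-Type": "application/json"}
    )


def needs_retry(unassigned_shards: int) -> bool:
    """Retry allocation only when shards are actually stuck unassigned."""
    return unassigned_shards > 0


def remediate(unassigned_shards: int) -> None:
    """Run the retry statement mentioned in the report, if warranted."""
    if needs_retry(unassigned_shards):
        req = build_sql_request("ALTER CLUSTER REROUTE RETRY FAILED")
        with urllib.request.urlopen(req) as resp:
            print("retry submitted, HTTP status:", resp.status)
```

In practice the unassigned-shard count would come from a query such as one against the sys.shards table; as the report notes, this retry resolves most but not all of the stuck-yellow cases.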
shard has exceeded the maximum number of retries [20] on failed allocation attempts - manually execute 'alter cluster....' [unassigned_info[[reason=ALLOCATION_FAILED], at ..... failed to create shard, failure IOException[failed to obtain in-memory shard lock]...
[WARN ][o.e.i.c.IndicesClusterStateService] [hostname][[namespace..partitioned.tablename.someuuid][1]] marking and sending shard failed due to [failed to create shard] java.io.IOException: failed to obtain in-memory shard lock
at org.elasticsearch.index.IndexService.createShard(IndexService.java:358)
at org.elasticsearch.indices.IndicesService.createShard(IndicesService.java:440)
at org.elasticsearch.indices.IndicesService.createShard(IndicesService.java:112)
at org.elasticsearch.indices.cluster.IndicesClusterStateService.createShard(IndicesClusterStateService.java:551)
...
[INFO ][o.e.i.s.TransportNodesListShardStoreMetaData] [hostname][namespace..partitioned.tablename.someuuid][1]: failed to obtain shard lock
org.elasticsearch.env.ShardLockObtainFailedException: [namespace..partitioned.tablename.someuuid][1]: obtaining shard lock timed out after 5000ms, previous lock details: [shard creation] trying to lock for [read metadata snapshot]
at org.elasticsearch.env.NodeEnvironment$InternalShardLock.acquire(NodeEnvironment.java:748)
at org.elasticsearch.env.NodeEnvironment.shardLock(NodeEnvironment.java:663)
at org.elasticsearch.index.Store.readMetadataSnapshot(Store.java:443)
....
AFTER running the retry command, shards get stuck in the RELOCATING state, with the following log message emitted at a very fast rate:
[WARN ][o.e.t.TransportService] [node] Transport response handler not found of id [9285317]
Issue Analytics
- Created: 3 years ago
- Reactions: 1
- Comments: 17 (8 by maintainers)
Top GitHub Comments
We’ve finally found the issue behind the "Transport handler not found ..." log entries; see https://github.com/crate/crate/pull/10797. Thank you for reporting, it was indeed an issue.

@seut I will get it to you via my colleague @rene-stiams.