Intermittent SSL socket connection failure

Neo4j Java driver version: 1.0.4
Neo4j server version: 3.0.3

We are using a 3-node HA cluster setup in AWS with 2 ELBs; one for read operations pointing to the slave nodes, and one for write operations pointing to the master node. The ELBs are configured to use the Neo4j management end-points for health checks and to fail over when one of the nodes goes down and the ‘master’ moves. The ELBs are also configured to pass SSL traffic to the back-end servers, so SSL termination is done on the Neo4j server instances.  Our application code has Neo4j Driver object instances for read and write operations that connect to the corresponding ELB instance using the BOLT protocol and requiring encryption.

The problem we are having is periodic failure by the Neo4j Driver to establish an SSL connection.  It seems that after some period of inactivity, a request to read something from the graph results in a failure to establish an SSL connection.  Issuing the same request again succeeds.

Here is the relevant stack trace:

org.neo4j.driver.v1.exceptions.ClientException: Failed to establish SSL socket connection.
    at org.neo4j.driver.internal.connector.socket.TLSSocketChannel.unwrap(TLSSocketChannel.java:179)
    at org.neo4j.driver.internal.connector.socket.TLSSocketChannel.read(TLSSocketChannel.java:374)
    at org.neo4j.driver.internal.connector.socket.BufferingChunkedInput.readNextPacket(BufferingChunkedInput.java:408)
    at org.neo4j.driver.internal.connector.socket.BufferingChunkedInput.readChunkSize(BufferingChunkedInput.java:344)
    at org.neo4j.driver.internal.connector.socket.BufferingChunkedInput.read(BufferingChunkedInput.java:246)
    at org.neo4j.driver.internal.connector.socket.BufferingChunkedInput.fillScratchBuffer(BufferingChunkedInput.java:215)
    at org.neo4j.driver.internal.connector.socket.BufferingChunkedInput.readByte(BufferingChunkedInput.java:109)
    at org.neo4j.driver.internal.packstream.PackStream$Unpacker.unpackStructHeader(PackStream.java:441)
    at org.neo4j.driver.internal.messaging.PackStreamMessageFormatV1$Reader.read(PackStreamMessageFormatV1.java:397)
    at org.neo4j.driver.internal.connector.socket.SocketClient.receiveOne(SocketClient.java:130)
    at org.neo4j.driver.internal.connector.socket.SocketClient.receiveAll(SocketClient.java:124)
    at org.neo4j.driver.internal.connector.socket.SocketConnection.receiveAll(SocketConnection.java:121)
    at org.neo4j.driver.internal.connector.socket.SocketConnection.sync(SocketConnection.java:100)
    at org.neo4j.driver.internal.connector.ConcurrencyGuardingConnection.sync(ConcurrencyGuardingConnection.java:122)
    at org.neo4j.driver.internal.pool.PooledConnection.sync(PooledConnection.java:144)
    at org.neo4j.driver.internal.InternalSession.close(InternalSession.java:130)

Here are relevant code snippets:

Driver neo4jReadDriver = GraphDatabase.driver(serverURI,
            AuthTokens.basic(username, password),
            Config.build()
                    .withEncryptionLevel(Config.EncryptionLevel.REQUIRED)
                    .toConfig());

private StatementResult run(Driver neo4jDriver, String statementTemplate, Map<String, Object> statementParameters) {
    try (Session neo4jSession = neo4jDriver.session()) {
        return neo4jSession.run(statementTemplate, statementParameters);
    }
}

String cypherStatement = "<cypher>";
HashMap<String, Object> params = new HashMap<>();
StatementResult result = run(neo4jReadDriver, cypherStatement, params);

The SSL connection failure happens at the end of the ‘try’ block when the session is closed. An immediate re-try of the same call succeeds.

Are there any recommended configuration settings for using the Neo4j driver when deploying into AWS behind ELBs? Have the Neo4j drivers been tested in HA configurations using AWS and ELBs?

Issue Analytics

  • State: closed
  • Created 7 years ago
  • Comments: 15 (5 by maintainers)

Top GitHub Comments

lutovich commented, May 2, 2017 (2 reactions)

Hello,

I just want to update this old ticket with references to a couple of new APIs which might be helpful here:

  1. Connection liveness check timeout configuration setting: https://github.com/neo4j/neo4j-java-driver/blob/1.2.0/driver/src/main/java/org/neo4j/driver/v1/Config.java#L291-L317. When set to a value lower than the load balancer's idle connection timeout, this should force the driver to re-acquire the connection.
  2. Transaction function APIs that allow retries with exponential backoff. More info can be found in the docs: https://neo4j.com/docs/developer-manual/current/drivers/sessions-transactions/#driver-transactions-transaction-functions.
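
To illustrate the idea behind the liveness check in item 1, here is a minimal, driver-independent sketch. It is not the Neo4j driver's implementation: `IdleAwarePool`, its constructor parameters, and the `isAlive` probe are hypothetical names invented for this example. The point is only that a pooled connection which has sat idle for at least the configured timeout is tested before reuse and replaced if the test fails, which is why the timeout needs to be shorter than the load balancer's idle timeout.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Hypothetical idle-aware pool; a sketch, not the Neo4j driver's code. */
final class IdleAwarePool<C> {
    /** A pooled connection plus the time it was last returned to the pool. */
    private static final class Entry<C> {
        final C conn;
        final long lastUsedMillis;

        Entry(C conn, long lastUsedMillis) {
            this.conn = conn;
            this.lastUsedMillis = lastUsedMillis;
        }
    }

    private final Deque<Entry<C>> idle = new ArrayDeque<>();
    private final Supplier<C> factory;      // opens a fresh connection
    private final Predicate<C> isAlive;     // assumed cheap round-trip probe
    private final long livenessCheckTimeoutMillis;

    IdleAwarePool(Supplier<C> factory, Predicate<C> isAlive, long livenessCheckTimeoutMillis) {
        this.factory = factory;
        this.isAlive = isAlive;
        this.livenessCheckTimeoutMillis = livenessCheckTimeoutMillis;
    }

    /** The clock value is passed in so the behaviour is easy to test. */
    synchronized C acquire(long nowMillis) {
        Entry<C> e;
        while ((e = idle.poll()) != null) {
            boolean idleTooLong = nowMillis - e.lastUsedMillis >= livenessCheckTimeoutMillis;
            // Recently used connections are trusted; stale ones must pass the probe.
            if (!idleTooLong || isAlive.test(e.conn)) {
                return e.conn;
            }
            // Stale and dead (e.g. silently dropped by a load balancer): discard it.
        }
        return factory.get();
    }

    synchronized void release(C conn, long nowMillis) {
        idle.push(new Entry<>(conn, nowMillis));
    }
}
```

With a load balancer idle timeout of, say, 60 seconds, a timeout below that value means any connection the balancer may already have dropped is probed before a query is sent on it.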

Hope this helps.

edstover commented, Nov 7, 2016 (1 reaction)

@tavolate I posted my retry logic above. In my case, I encapsulated this code in a base class used by all of my Neo4j data access classes so that all Cypher queries are executed with retry capability.
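
The retry logic referred to here is not reproduced on this page. As a rough illustration of the general approach only, the following is a minimal sketch of a retry helper with exponential backoff. `RetrySketch`, `withRetry`, and its parameters are hypothetical names, not the commenter's code and not part of the Neo4j driver.

```java
import java.util.concurrent.Callable;

/** Hypothetical retry helper; a sketch under the assumptions stated above. */
final class RetrySketch {
    private RetrySketch() {}

    /**
     * Runs the work, retrying on failure up to maxAttempts times in total.
     * The wait between attempts starts at initialDelayMillis and doubles each time.
     */
    static <T> T withRetry(Callable<T> work, int maxAttempts, long initialDelayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return work.call();
            } catch (Exception e) {
                // Real code would likely retry only on connection failures,
                // not on every exception type.
                if (attempt >= maxAttempts) {
                    throw e; // out of attempts: surface the last failure
                }
                Thread.sleep(delay);
                delay *= 2; // exponential backoff
            }
        }
    }
}
```

Applied to the snippet in the question, the call might look like `RetrySketch.withRetry(() -> run(neo4jReadDriver, cypherStatement, params), 3, 100)`, which matches the observation that an immediate retry of the same call succeeds.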
