Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging third-party libraries. It collects links to all the places you might look while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

Connection reset exceptions while reading data

See original GitHub issue

Running Spark on Kubernetes with spark-snowflake 2.4.11-spark_2.4.

Seeing frequent “connection reset” stack traces when reading queried data. The query (identified by tag) shows as successful in the Snowflake query history, so this appears to happen after the query/unload, while the client downloads the result.

This Spark job runs several identical queries with differing parameters. Most succeed, but some die like this. Sometimes a retry works; sometimes it fails a few times in a row. Any ideas? This seems to be the root cause of these failures, but it’s not clear to me why it’s happening, or whether it’s a spark-snowflake, Snowflake, or AWS S3 issue.

2019-01-01 08:16:44 WARN  TaskSetManager:66 - Lost task 155.3 in stage 6340.0 (TID 454070, 100.121.128.10, executor 29): java.net.SocketException: Connection reset
	at java.net.SocketInputStream.read(SocketInputStream.java:210)
	at java.net.SocketInputStream.read(SocketInputStream.java:141)
	at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
	at sun.security.ssl.InputRecord.readV3Record(InputRecord.java:593)
	at sun.security.ssl.InputRecord.read(InputRecord.java:532)
	at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:983)
	at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:940)
	at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
	at net.snowflake.client.jdbc.internal.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
	at net.snowflake.client.jdbc.internal.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
	at net.snowflake.client.jdbc.internal.apache.http.impl.io.SessionInputBufferImpl.read(SessionInputBufferImpl.java:206)
	at net.snowflake.client.jdbc.internal.apache.http.impl.io.ContentLengthInputStream.read(ContentLengthInputStream.java:176)
	at net.snowflake.client.jdbc.internal.apache.http.conn.EofSensorInputStream.read(EofSensorInputStream.java:135)
	at net.snowflake.client.jdbc.internal.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:82)
	at net.snowflake.client.jdbc.internal.amazonaws.event.ProgressInputStream.read(ProgressInputStream.java:180)
	at net.snowflake.client.jdbc.internal.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:82)
	at net.snowflake.client.jdbc.internal.amazonaws.services.s3.internal.S3AbortableInputStream.read(S3AbortableInputStream.java:125)
	at net.snowflake.client.jdbc.internal.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:82)
	at net.snowflake.client.jdbc.internal.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:82)
	at net.snowflake.client.jdbc.internal.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:82)
	at net.snowflake.client.jdbc.internal.amazonaws.event.ProgressInputStream.read(ProgressInputStream.java:180)
	at java.security.DigestInputStream.read(DigestInputStream.java:161)
	at net.snowflake.client.jdbc.internal.amazonaws.services.s3.internal.DigestValidationInputStream.read(DigestValidationInputStream.java:59)
	at net.snowflake.client.jdbc.internal.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:82)
	at java.io.FilterInputStream.read(FilterInputStream.java:107)
	at javax.crypto.CipherInputStream.getMoreData(CipherInputStream.java:121)
	at javax.crypto.CipherInputStream.read(CipherInputStream.java:246)
	at java.util.zip.InflaterInputStream.fill(InflaterInputStream.java:238)
	at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:158)
	at java.util.zip.GZIPInputStream.read(GZIPInputStream.java:117)
	at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:122)
	at net.snowflake.spark.snowflake.io.SFRecordReader.readChar(SFRecordReader.scala:167)
	at net.snowflake.spark.snowflake.io.SFRecordReader.next(SFRecordReader.scala:124)
	at net.snowflake.spark.snowflake.io.SFRecordReader.next(SFRecordReader.scala:32)
	at scala.collection.Iterator$$anon$11.next(Iterator.scala:410)
	at scala.collection.Iterator$$anon$11.next(Iterator.scala:410)
	at scala.collection.Iterator$$anon$11.next(Iterator.scala:410)
	at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:463)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
	at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:191)
	at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:62)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
	at org.apache.spark.scheduler.Task.run(Task.scala:121)
	at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:402)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:408)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
2019-01-01 08:16:44 ERROR TaskSetManager:70 - Task 155 in stage 6340.0 failed 4 times; aborting job
2019-01-01 08:16:44 INFO  TaskSchedulerImpl:54 - Removed TaskSet 6340.0, whose tasks have all completed, from pool
2019-01-01 08:16:44 INFO  TaskSchedulerImpl:54 - Cancelling stage 6340
2019-01-01 08:16:44 INFO  TaskSchedulerImpl:54 - Killing all running tasks in stage 6340: Stage cancelled
2019-01-01 08:16:44 INFO  DAGScheduler:54 - ShuffleMapStage 6340 (map at sorted_event_arrays.scala:17) failed in 1982.776 s due to Job aborted due to stage failure: Task 155 in stage 6340.0 failed 4 times, most recent failure: Lost task 155.3 in stage 6340.0 (TID 454070, 100.121.128.10, executor 29): java.net.SocketException: Connection reset
	at java.net.SocketInputStream.read(SocketInputStream.java:210)
..
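
For reference, Spark gives up on a task after spark.task.maxFailures attempts (4 by default), which is why the log above shows the stage aborting after the fourth failure. The sketch below is a hypothetical reconstruction of the read path, not code from the issue: the connection options, table and query are placeholders, and the driver-side retry wrapper simply reflects the observation that re-running a failed read sometimes succeeds.

import org.apache.spark.sql.{DataFrame, SparkSession}

object SnowflakeReadRetry {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("snowflake-read-retry").getOrCreate()

    // Placeholder connection options; none of these values come from the issue.
    val sfOptions = Map(
      "sfURL"       -> "account.snowflakecomputing.com",
      "sfUser"      -> "user",
      "sfPassword"  -> "password",
      "sfDatabase"  -> "db",
      "sfSchema"    -> "schema",
      "sfWarehouse" -> "wh"
    )

    // The job in the issue runs the same query repeatedly with different parameters.
    def readQuery(param: String): DataFrame =
      spark.read
        .format("net.snowflake.spark.snowflake")
        .options(sfOptions)
        .option("query", s"SELECT * FROM events WHERE kind = '$param'") // hypothetical query
        .load()

    // Driver-side retry around the whole action: when a task ultimately fails with a
    // connection reset and Spark aborts the stage, re-running the read sometimes works.
    def withRetries[T](maxAttempts: Int)(body: => T): T = {
      var attempt = 0
      var result: Option[T] = None
      while (result.isEmpty) {
        attempt += 1
        try {
          result = Some(body)
        } catch {
          case e: Exception if attempt < maxAttempts =>
            Console.err.println(s"attempt $attempt failed (${e.getMessage}), retrying")
        }
      }
      result.get
    }

    val rows = withRetries(3) { readQuery("click").count() }
    println(s"rows: $rows")
    spark.stop()
  }
}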

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

1 reaction
binglihub commented, Jan 10, 2020

@fdosani Please contact Snowflake Customer Support. It is a server-side issue. They will help you fix it. Thank you.

0 reactions
fdosani commented, Jan 10, 2020

Using

  • Spark 2.4.3
  • spark-snowflake_2.11-2.5.4-spark_2.4
  • and snowflake-jdbc-3.9.1.jar

For context, I’ve tried this on both a standalone cluster and EMR as well, with similar issues.
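
For anyone pinning the same combination, a minimal build.sbt sketch, assuming the standard Maven coordinates net.snowflake:spark-snowflake_2.11 and net.snowflake:snowflake-jdbc, with the Spark dependency marked Provided on the assumption that the job is submitted to an existing cluster:

// build.sbt (sketch): pins the versions listed above
scalaVersion := "2.11.12"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-sql"            % "2.4.3" % Provided,
  "net.snowflake"    %  "spark-snowflake_2.11" % "2.5.4-spark_2.4",
  "net.snowflake"    %  "snowflake-jdbc"       % "3.9.1"
)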

Read more comments on GitHub >

Top Results From Across the Web

Connection reset error when reading data from client java
More commonly, it is caused by writing to a connection that the other end has already closed normally. In other words an application...
Read more >
How to Fix with java.net.SocketException: Connection reset ...
The java.net.SocketException: Connection reset error usually comes when one of the parties in TCP connection like client or server is trying to read/write...
Read more >
How does java net SocketException Connection reset ...
Most common issue for this problem occurring is when you close the socket, and then write more data on the output stream. By...
Read more >
How does java net SocketException Connection reset ...
The java.net SocketException states that it is thrown to indicate that there is an error in the underlying protocol such as a...
Read more >
How to Fix java.net.SocketException: Failed to read from ...
In general, this can come at both client and server end of a client-server Java application which is using TCP/IP to connect each...
Read more >
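
The results above all describe the same underlying mechanism: one end of the TCP connection aborts it, or has already closed it, while the other end is still reading or writing. Below is a minimal, self-contained sketch that provokes the same java.net.SocketException with plain java.net and no Spark or Snowflake involved, by having the server close with SO_LINGER set to 0 so that close() sends a TCP RST.

import java.net.{ServerSocket, Socket, SocketException}

// Reproduces java.net.SocketException: Connection reset in isolation:
// the server accepts a connection and closes it with SO_LINGER=0, which
// sends a TCP RST instead of a normal FIN; the client's next blocking
// read on that socket then fails with "Connection reset".
object ConnectionResetDemo {
  def main(args: Array[String]): Unit = {
    val server = new ServerSocket(0) // bind to any free port

    val acceptor = new Thread(new Runnable {
      def run(): Unit = {
        val s = server.accept()
        s.setSoLinger(true, 0) // close() now aborts the connection with RST
        s.close()
      }
    })
    acceptor.start()

    val client = new Socket("127.0.0.1", server.getLocalPort)
    Thread.sleep(200) // give the RST time to arrive before we read
    try {
      client.getInputStream.read() // typically throws SocketException: Connection reset
    } catch {
      case e: SocketException => println(s"caught: ${e.getMessage}")
    } finally {
      client.close()
      server.close()
    }
  }
}

In the stack trace above the reset surfaces while the JDBC driver is streaming results from S3, so the same pattern applies: the remote end, or something in between, dropped the connection mid-read.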
