Connection reset exceptions while reading data
See original GitHub issue.
Running Spark on Kubernetes, spark-snowflake 2.4.11-spark_2.4.
Seeing frequent “connection reset” stack traces when reading queried data. The query (identified by tag) shows as successful in the Snowflake query history, so this appears to be happening after the query/unload, when the client downloads the result.
This Spark job runs several identical queries with differing parameters. Most succeed, but some die like this. Sometimes a retry works; sometimes it fails a few times in a row. Any ideas? This seems to be the root cause of these failures, but it’s not clear to me why it’s happening - whether it’s a spark-snowflake, Snowflake, or AWS S3 issue.
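Since a retry sometimes clears the failure, one stopgap (not a fix for the underlying reset) is to force materialization of each query’s result inside a small driver-side retry loop, so a transient download failure is retried at the read instead of aborting the whole job. The sketch below assumes the standard spark-snowflake read path (the net.snowflake.spark.snowflake source name and the query option); the readWithRetry helper, the sfOptions map, the attempt count, and the backoff values are illustrative and not taken from this report.

import org.apache.spark.sql.{DataFrame, SparkSession}
import scala.util.{Failure, Success, Try}

// Hypothetical helper: sfOptions would hold sfURL, sfUser, sfPassword,
// sfDatabase, sfSchema, sfWarehouse, etc. None of these values come from
// the original report.
def readWithRetry(spark: SparkSession,
                  sfOptions: Map[String, String],
                  query: String,
                  maxAttempts: Int = 3): DataFrame = {
  def attempt(n: Int): DataFrame = {
    Try {
      val df = spark.read
        .format("net.snowflake.spark.snowflake")
        .options(sfOptions)
        .option("query", query)
        .load()
      // Force materialization so a transient "Connection reset" during the
      // result download surfaces here, inside the retry scope.
      df.cache().count()
      df
    } match {
      case Success(df) => df
      case Failure(_) if n < maxAttempts =>
        Thread.sleep(5000L * n) // simple linear backoff before retrying
        attempt(n + 1)
      case Failure(e) => throw e
    }
  }
  attempt(1)
}

This only papers over the symptom: each failed attempt still pays for a full re-read, and it does nothing about the reset itself.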
2019-01-01 08:16:44 WARN TaskSetManager:66 - Lost task 155.3 in stage 6340.0 (TID 454070, 100.121.128.10, executor 29): java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:210)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
at sun.security.ssl.InputRecord.readV3Record(InputRecord.java:593)
at sun.security.ssl.InputRecord.read(InputRecord.java:532)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:983)
at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:940)
at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
at net.snowflake.client.jdbc.internal.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
at net.snowflake.client.jdbc.internal.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
at net.snowflake.client.jdbc.internal.apache.http.impl.io.SessionInputBufferImpl.read(SessionInputBufferImpl.java:206)
at net.snowflake.client.jdbc.internal.apache.http.impl.io.ContentLengthInputStream.read(ContentLengthInputStream.java:176)
at net.snowflake.client.jdbc.internal.apache.http.conn.EofSensorInputStream.read(EofSensorInputStream.java:135)
at net.snowflake.client.jdbc.internal.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:82)
at net.snowflake.client.jdbc.internal.amazonaws.event.ProgressInputStream.read(ProgressInputStream.java:180)
at net.snowflake.client.jdbc.internal.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:82)
at net.snowflake.client.jdbc.internal.amazonaws.services.s3.internal.S3AbortableInputStream.read(S3AbortableInputStream.java:125)
at net.snowflake.client.jdbc.internal.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:82)
at net.snowflake.client.jdbc.internal.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:82)
at net.snowflake.client.jdbc.internal.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:82)
at net.snowflake.client.jdbc.internal.amazonaws.event.ProgressInputStream.read(ProgressInputStream.java:180)
at java.security.DigestInputStream.read(DigestInputStream.java:161)
at net.snowflake.client.jdbc.internal.amazonaws.services.s3.internal.DigestValidationInputStream.read(DigestValidationInputStream.java:59)
at net.snowflake.client.jdbc.internal.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:82)
at java.io.FilterInputStream.read(FilterInputStream.java:107)
at javax.crypto.CipherInputStream.getMoreData(CipherInputStream.java:121)
at javax.crypto.CipherInputStream.read(CipherInputStream.java:246)
at java.util.zip.InflaterInputStream.fill(InflaterInputStream.java:238)
at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:158)
at java.util.zip.GZIPInputStream.read(GZIPInputStream.java:117)
at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:122)
at net.snowflake.spark.snowflake.io.SFRecordReader.readChar(SFRecordReader.scala:167)
at net.snowflake.spark.snowflake.io.SFRecordReader.next(SFRecordReader.scala:124)
at net.snowflake.spark.snowflake.io.SFRecordReader.next(SFRecordReader.scala:32)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:410)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:410)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:410)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:463)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:191)
at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:62)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
at org.apache.spark.scheduler.Task.run(Task.scala:121)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:402)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:408)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2019-01-01 08:16:44 ERROR TaskSetManager:70 - Task 155 in stage 6340.0 failed 4 times; aborting job
2019-01-01 08:16:44 INFO TaskSchedulerImpl:54 - Removed TaskSet 6340.0, whose tasks have all completed, from pool
2019-01-01 08:16:44 INFO TaskSchedulerImpl:54 - Cancelling stage 6340
2019-01-01 08:16:44 INFO TaskSchedulerImpl:54 - Killing all running tasks in stage 6340: Stage cancelled
2019-01-01 08:16:44 INFO DAGScheduler:54 - ShuffleMapStage 6340 (map at sorted_event_arrays.scala:17) failed in 1982.776 s due to Job aborted due to stage failure: Task 155 in stage 6340.0 failed 4 times, most recent failure: Lost task 155.3 in stage 6340.0 (TID 454070, 100.121.128.10, executor 29): java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:210)
..
Issue Analytics
- Created 5 years ago
- Comments: 7 (4 by maintainers)
Top GitHub Comments
@fdosani Please contact Snowflake Customer Support. It is a server-side issue. They will help you fix it. Thank you.
Using:
- 2.4.3
- spark-snowflake_2.11-2.5.4-spark_2.4
- snowflake-jdbc-3.9.1.jar

For context, I’ve also tried this on both a standalone cluster and EMR, with similar issues.
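For anyone trying to reproduce this setup, those artifacts would typically be pulled in via their Maven coordinates. The fragment below is only a sketch based on the names listed above; it assumes the standard net.snowflake group IDs, a Scala 2.11 build, and that the bare “2.4.3” refers to the Spark version, none of which is confirmed in the comment.

// build.sbt fragment (assumed coordinates, matching the artifact names above)
scalaVersion := "2.11.12"

libraryDependencies ++= Seq(
  // assuming "2.4.3" above is the Spark version
  "org.apache.spark" %% "spark-sql"       % "2.4.3" % "provided",
  "net.snowflake"    %% "spark-snowflake" % "2.5.4-spark_2.4",
  "net.snowflake"    %  "snowflake-jdbc"  % "3.9.1"
)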