"Stream is closed"

I am having workers die with the following error messages.

This occurs when I am trying to .persist() a large xarray dataset into memory. (There is more than enough memory in the cluster for the dataset, by a factor of 5.)

I don’t know what these errors mean, other than that the workers have died. Advice would be appreciated on how to debug more effectively.

distributed.worker - ERROR - failed during get data
Traceback (most recent call last):
  File "/rigel/ocp/users/ra2697/conda/envs/pangeo/lib/python3.6/site-packages/distributed/comm/tcp.py", line 221, in write
    yield future
  File "/rigel/ocp/users/ra2697/conda/envs/pangeo/lib/python3.6/site-packages/tornado/gen.py", line 1055, in run
    value = future.result()
  File "/rigel/ocp/users/ra2697/conda/envs/pangeo/lib/python3.6/site-packages/tornado/concurrent.py", line 238, in result
    raise_exc_info(self._exc_info)
  File "<string>", line 4, in raise_exc_info
tornado.iostream.StreamClosedError: Stream is closed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/rigel/ocp/users/ra2697/conda/envs/pangeo/lib/python3.6/site-packages/distributed/worker.py", line 524, in get_data
    compressed = yield comm.write(msg)
  File "/rigel/ocp/users/ra2697/conda/envs/pangeo/lib/python3.6/site-packages/tornado/gen.py", line 1055, in run
    value = future.result()
  File "/rigel/ocp/users/ra2697/conda/envs/pangeo/lib/python3.6/site-packages/tornado/concurrent.py", line 238, in result
    raise_exc_info(self._exc_info)
  File "<string>", line 4, in raise_exc_info
  File "/rigel/ocp/users/ra2697/conda/envs/pangeo/lib/python3.6/site-packages/tornado/gen.py", line 1063, in run
    yielded = self.gen.throw(*exc_info)
  File "/rigel/ocp/users/ra2697/conda/envs/pangeo/lib/python3.6/site-packages/distributed/comm/tcp.py", line 225, in write
    convert_stream_closed_error(self, e)
  File "/rigel/ocp/users/ra2697/conda/envs/pangeo/lib/python3.6/site-packages/distributed/comm/tcp.py", line 124, in convert_stream_closed_error
    raise CommClosedError("in %s: %s: %s" % (obj, exc.__class__.__name__, exc))
distributed.comm.core.CommClosedError: in <closed TCP>: ConnectionResetError: [Errno 104] Connection reset by peer
distributed.core - WARNING - Lost connection to 'tcp://10.43.8.25:48251': in <closed TCP>: ConnectionResetError: [Errno 104] Connection reset by peer
distributed.worker - ERROR - Worker stream died during communication: tcp://10.43.4.25:53234
Traceback (most recent call last):
  File "/rigel/ocp/users/ra2697/conda/envs/pangeo/lib/python3.6/site-packages/distributed/comm/tcp.py", line 182, in read
    frame = yield stream.read_bytes(length)
  File "/rigel/ocp/users/ra2697/conda/envs/pangeo/lib/python3.6/site-packages/tornado/gen.py", line 1055, in run
    value = future.result()
  File "/rigel/ocp/users/ra2697/conda/envs/pangeo/lib/python3.6/site-packages/tornado/concurrent.py", line 238, in result
    raise_exc_info(self._exc_info)
  File "<string>", line 4, in raise_exc_info
tornado.iostream.StreamClosedError: Stream is closed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/rigel/ocp/users/ra2697/conda/envs/pangeo/lib/python3.6/site-packages/distributed/worker.py", line 1763, in gather_dep
    who=self.address)
  File "/rigel/ocp/users/ra2697/conda/envs/pangeo/lib/python3.6/site-packages/tornado/gen.py", line 1055, in run
    value = future.result()
  File "/rigel/ocp/users/ra2697/conda/envs/pangeo/lib/python3.6/site-packages/tornado/concurrent.py", line 238, in result
    raise_exc_info(self._exc_info)
  File "<string>", line 4, in raise_exc_info
  File "/rigel/ocp/users/ra2697/conda/envs/pangeo/lib/python3.6/site-packages/tornado/gen.py", line 1063, in run
    yielded = self.gen.throw(*exc_info)
  File "/rigel/ocp/users/ra2697/conda/envs/pangeo/lib/python3.6/site-packages/distributed/core.py", line 516, in send_recv_from_rpc
    result = yield send_recv(comm=comm, op=key, **kwargs)
  File "/rigel/ocp/users/ra2697/conda/envs/pangeo/lib/python3.6/site-packages/tornado/gen.py", line 1055, in run
    value = future.result()
  File "/rigel/ocp/users/ra2697/conda/envs/pangeo/lib/python3.6/site-packages/tornado/concurrent.py", line 238, in result
    raise_exc_info(self._exc_info)
  File "<string>", line 4, in raise_exc_info
  File "/rigel/ocp/users/ra2697/conda/envs/pangeo/lib/python3.6/site-packages/tornado/gen.py", line 1063, in run
    yielded = self.gen.throw(*exc_info)
  File "/rigel/ocp/users/ra2697/conda/envs/pangeo/lib/python3.6/site-packages/distributed/core.py", line 350, in send_recv
    response = yield comm.read()
  File "/rigel/ocp/users/ra2697/conda/envs/pangeo/lib/python3.6/site-packages/tornado/gen.py", line 1055, in run
    value = future.result()
  File "/rigel/ocp/users/ra2697/conda/envs/pangeo/lib/python3.6/site-packages/tornado/concurrent.py", line 238, in result
    raise_exc_info(self._exc_info)
  File "<string>", line 4, in raise_exc_info
  File "/rigel/ocp/users/ra2697/conda/envs/pangeo/lib/python3.6/site-packages/tornado/gen.py", line 1063, in run
    yielded = self.gen.throw(*exc_info)
  File "/rigel/ocp/users/ra2697/conda/envs/pangeo/lib/python3.6/site-packages/distributed/comm/tcp.py", line 188, in read
    convert_stream_closed_error(self, e)
  File "/rigel/ocp/users/ra2697/conda/envs/pangeo/lib/python3.6/site-packages/distributed/comm/tcp.py", line 126, in convert_stream_closed_error
    raise CommClosedError("in %s: %s" % (obj, exc))
distributed.comm.core.CommClosedError: in <closed TCP>: Stream is closed
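
One way to sanity-check the "more than enough memory by a factor of 5" estimate is to look at per-worker usage as well as the cluster total, since an individual worker may be close to its limit even when the cluster as a whole has headroom. The following is a minimal sketch, not a definitive diagnostic. It assumes a mapping shaped like the "workers" entry returned by Client.scheduler_info() (worker address to an info dict with "memory_limit" and "metrics" -> "memory", both in bytes) and that every worker has a non-zero memory limit. The helper name memory_report and the sample numbers are hypothetical.

```python
def memory_report(workers, dataset_nbytes):
    """Summarise cluster memory against a dataset size.

    `workers` is assumed to look like Client.scheduler_info()["workers"]:
    a mapping of worker address -> info dict with "memory_limit" (bytes)
    and "metrics" -> "memory" (bytes currently in use).
    """
    total_limit = sum(w["memory_limit"] for w in workers.values())
    total_used = sum(w["metrics"]["memory"] for w in workers.values())
    # Fraction of each worker's own limit that is already in use.
    usage = {
        addr: w["metrics"]["memory"] / w["memory_limit"]
        for addr, w in workers.items()
    }
    fullest = max(usage, key=usage.get)
    return {
        "headroom_factor": total_limit / dataset_nbytes,
        "total_used": total_used,
        "fullest_worker": fullest,
        "fullest_fraction": usage[fullest],
    }


# Hypothetical snapshot. On a live cluster the inputs would come from
# client.scheduler_info()["workers"] and the dataset's nbytes attribute.
sample_workers = {
    "tcp://10.0.0.1:4000": {
        "memory_limit": 8_000_000_000,
        "metrics": {"memory": 7_500_000_000},
    },
    "tcp://10.0.0.2:4000": {
        "memory_limit": 8_000_000_000,
        "metrics": {"memory": 1_000_000_000},
    },
}
report = memory_report(sample_workers, dataset_nbytes=4_000_000_000)
print(report)
```

In this made-up snapshot the cluster total looks comfortable, yet one worker is nearly full. An imbalance like that is worth ruling out before looking at the network layer.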

Issue Analytics

  • State: open
  • Created 6 years ago
  • Reactions: 19
  • Comments: 37 (22 by maintainers)

Top GitHub Comments

3 reactions
mrocklin commented, Dec 31, 2017

> There is more than enough memory in the cluster for the dataset by a factor of 5.

> I think it is something related to spill-to-disk. If I make the dataset smaller by a factor of two, everything works fine.

These two statements together confuse me. Does it spill to disk? Does the worker itself fail?
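
To answer the "does the worker itself fail?" question, one option is to record the set of worker addresses before and after the .persist() call and compare them. This is a rough sketch under stated assumptions: the snapshots are taken from the keys of Client.scheduler_info()["workers"], and the helper name lost_workers and the addresses below are hypothetical.

```python
def lost_workers(before, after):
    """Return worker addresses present in `before` but missing from `after`."""
    return sorted(set(before) - set(after))


# Hypothetical snapshots. On a live cluster each one would be
# list(client.scheduler_info()["workers"]) taken at a different time.
workers_before = [
    "tcp://10.0.0.1:4000",
    "tcp://10.0.0.2:4000",
    "tcp://10.0.0.3:4000",
]
workers_after = ["tcp://10.0.0.1:4000", "tcp://10.0.0.3:4000"]

missing = lost_workers(workers_before, workers_after)
print(missing)
```

An empty result would suggest that the connections dropped while the worker processes survived. A non-empty result points at the workers themselves going away, and the logs of those workers are the next place to look.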

2 reactions
TomNicholas commented, Nov 3, 2020

Just commenting to say that I had this same issue for a long time. I was trying to do something similar to Ryan (and to Julius over at https://github.com/pangeo-data/pangeo/issues/757): using xarray.open_mfdataset to open and load data from hundreds of netCDF files. After updating to dask v2.30.0 and dask-labextension v3.0.0, it seems to be working now! I'm very pleased.
