Raylet Memory usage growing

System information

  • OS Platform and Distribution: Ubuntu 16.04
  • Ray installed from: binary
  • Ray version: 0.6.5
  • Python version: 3.6.8
  • Exact command to reproduce:
import numpy as np
import ray
import time
ray.init(redis_max_memory=100000000)


@ray.remote
class Runner():
    def __init__(self, dataList):
        self.run(dataList)

    def run(self,dataList):
        while True:
            dataList.put.remote(np.ones(10))

@ray.remote
class Optimizer():
    def __init__(self, dataList):
        self.optimize(dataList)

    def optimize(self,dataList):
        while True:
            dataList.pop.remote()

@ray.remote
class DataServer():
    def __init__(self):
        self.dataList= []

    def put(self,data):
        self.dataList.append(data)

    def pop(self):
        if len(self.dataList) !=0:
            return self.dataList.pop()
    def get_size(self):
        return len(self.dataList)


dataServer = DataServer.remote()
runner = Runner.remote(dataServer)
optimizer1 = Optimizer.remote(dataServer)
optimizer2 = Optimizer.remote(dataServer)

while True:
    time.sleep(1)
    print(ray.get(dataServer.get_size.remote()))

Describe the problem

Memory usage keeps increasing, even though the list is repeatedly emptied, until the program crashes. Does anybody know why this is happening?

Source code / logs

Traceback (most recent call last):
  File "/home/test.py", line 48, in <module>
  File "/home/anaconda3/envs/py3/lib/python3.6/site-packages/ray/worker.py", line 2310, in get
    value = worker.get_object([object_ids])[0]
  File "/home/anaconda3/envs/py3/lib/python3.6/site-packages/ray/worker.py", line 531, in get_object
    self.current_task_id,
  File "python/ray/_raylet.pyx", line 254, in ray._raylet.RayletClient.fetch_or_reconstruct
  File "python/ray/_raylet.pyx", line 59, in ray._raylet.check_status
Exception: [RayletClient] Connection closed unexpectedly.

Process finished with exit code 1

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 6 (5 by maintainers)

Top GitHub Comments

1 reaction
ericl commented, Apr 21, 2019

Thanks @KemalAcar07, I can confirm the result. I also see the raylet process increasing in memory, though the other processes are stable in memory usage.

At first glance, this may be related to https://github.com/ray-project/ray/issues/4359, but it will require more investigation. cc @robertnishihara
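
To check which process is growing, as observed in the comment above, one option on Linux is to sample a process's resident set size from /proc over time. The following is a minimal sketch, not part of the original issue: the function names are hypothetical, it uses only the Python standard library, and it assumes the raylet's PID has been looked up separately (for example with `pgrep raylet`).

```python
import os
import time
from typing import List, Optional


def parse_vm_rss_kib(status_text: str) -> Optional[int]:
    """Return the VmRSS value in KiB from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:     12345 kB".
            return int(line.split()[1])
    return None  # Some processes (e.g. kernel threads) have no VmRSS line.


def read_rss_kib(pid: int) -> Optional[int]:
    """Read the resident set size of a live process (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_vm_rss_kib(f.read())


def sample_rss(pid: int, samples: int, interval_s: float) -> List[Optional[int]]:
    """Collect several RSS readings so growth over time becomes visible."""
    readings = []
    for _ in range(samples):
        readings.append(read_rss_kib(pid))
        time.sleep(interval_s)
    return readings


if __name__ == "__main__":
    # Assumption: replace os.getpid() with the raylet's PID to watch that process.
    print(sample_rss(os.getpid(), samples=3, interval_s=1.0))
```

Readings that rise steadily for the raylet while staying flat for the worker processes would match the behavior reported here.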

0 reactions
simon-mo commented, Mar 19, 2020

This should be fixed by the direct task call and direct actor call re-architecture. Please reopen if this still fails on the latest master/release.
