[ray] Objects are being evicted improperly
System information
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): local machines
- Ray installed from (source or binary): pip install -U https://s3-us-west-2.amazonaws.com/ray-wheels/latest/ray-0.8.0.dev4-cp37-cp37m-manylinux1_x86_64.whl
- Ray version: 0.8.0.dev4
- Python version: 3.7
- Exact command to reproduce:
import ray
import torch

ray.init(object_store_memory=int(100e6))

@ray.remote
def identity(vectors):
    return [ray.put(ray.get(vec)) for vec in vectors]

obj_id = ray.put(torch.randn(int(1e5)))
vectors = [obj_id for _ in range(200)]
while True:
    vectors = ray.get(identity.remote(vectors))
Describe the problem
This code throws the following error.
2019-09-03 13:46:22,598 WARNING worker.py:1797 -- The task with ID ffffffffffffffffffff01000000 is a driver task and so the object created by ray.put could not be reconstructed.
(pid=38222) 2019-09-03 13:46:23,308 INFO worker.py:432 -- The object with ID ObjectID(7d58f415c89effffffff0100000000c001000000) already exists in the object store.
2019-09-03 13:46:28,320 ERROR worker.py:1737 -- Possible unhandled error from worker: ray_worker (pid=38222, host=atlas)
ray.exceptions.UnreconstructableError: Object ffffffffffffffffffff01000000008002000000 is lost (either LRU evicted or deleted by user) and cannot be reconstructed. Try increasing the object store memory available with ray.init(object_store_memory=<bytes>) or setting object store limits with ray.remote(object_store_memory=<bytes>). See also: https://ray.readthedocs.io/en/latest/memory-management.html
However, if you replace the definition of obj_id with either
obj_id = ray.put(list(range(int(1e5))))
or
obj_id = torch.randn(int(1e5))
then we get the correct error, which @ericl's recent PR added:
(pid=46751) 2019-09-03 13:56:21,919 INFO worker.py:2381 -- Put failed since the value was either too large or the store was full of pinned objects. If you are putting and holding references to a lot of object ids, consider ray.put(value, weakref=True) to allow object data to be evicted early.
However, neither error should be raised: we have only 80 MB of objects, and the object store has 100 MB capacity.
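The 80 MB figure can be checked with a quick back-of-the-envelope calculation. This is only a rough sketch: it assumes torch.randn returns float32 values (4 bytes per element, the PyTorch default), counts a single set of 200 objects, and ignores serialization and per-object metadata overhead. The constant names are made up for illustration.

```python
# Rough estimate of the payload size in the reproduction script.
# Assumptions: float32 tensors (4 bytes per element), one live set of
# 200 objects, no serialization or metadata overhead.
BYTES_PER_FLOAT32 = 4
ELEMENTS_PER_TENSOR = int(1e5)   # torch.randn(int(1e5))
NUM_OBJECTS = 200                # len(vectors)
OBJECT_STORE_BYTES = int(100e6)  # ray.init(object_store_memory=int(100e6))

tensor_bytes = ELEMENTS_PER_TENSOR * BYTES_PER_FLOAT32
total_bytes = tensor_bytes * NUM_OBJECTS

print(f"one tensor:    {tensor_bytes / 1e6:.1f} MB")
print(f"200 tensors:   {total_bytes / 1e6:.1f} MB")
print(f"store limit:   {OBJECT_STORE_BYTES / 1e6:.1f} MB")
print(f"fits in store: {total_bytes < OBJECT_STORE_BYTES}")
```

Under these assumptions each tensor is about 0.4 MB, so 200 of them come to roughly 80 MB, which is below the configured 100 MB limit.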
Top GitHub Comments
@ericl it's not just drivers that I can pin the objects in, right? I could pin an object in an Actor?
OK great, in this case I would recommend you switch to using actors. Actors are long-lived, so if you pin an object in an actor, it will stay there for as long as the actor is holding a reference to it.