pyarrow._plasma.ObjectNotAvailable when passing multiple copies of the same object to a task
What is the problem?
Ray no longer supports passing multiple copies of the same ObjectID to a task as of 0.8.1. This works in 0.8.0. In Modin we often reuse partitions and data to reduce our memory footprint. This issue is significant and we cannot update our Ray dependency until it is resolved.
Ray version and other system information (Python version, TensorFlow version, OS): 0.8.1+
Reproduction (REQUIRED)
Please provide a script that can be run to reproduce the issue. The script should have no external library dependencies (i.e., use fake or mock data / environments):
import ray

ray.init()

a = ray.put(1)

@ray.remote
def f(a, b):
    return a + b

ray.get(f.remote(a, a))
Traceback produced:
---------------------------------------------------------------------------
RayTaskError(TypeError) Traceback (most recent call last)
<ipython-input-8-fa22ff5137ff> in <module>
----> 1 ray.get(f.remote(z, z))
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/ray/worker.py in get(object_ids, timeout)
1502 worker.core_worker.dump_object_store_memory_usage()
1503 if isinstance(value, RayTaskError):
-> 1504 raise value.as_instanceof_cause()
1505 else:
1506 raise value
RayTaskError(TypeError): ray::__main__.f() (pid=42163, ip=192.168.42.4)
File "python/ray/_raylet.pyx", line 452, in ray._raylet.execute_task
File "<ipython-input-6-8ee2d2129fdd>", line 3, in f
TypeError: unsupported operand type(s) for +: 'int' and 'type'
The second object is of type pyarrow._plasma.ObjectNotAvailable
cc @gshimansky cc @simon-mo
- I have verified my script runs in a clean environment and reproduces the issue.
- I have verified the issue also occurs with the latest wheels.
Issue Analytics
- State:
- Created 4 years ago
- Comments:5 (5 by maintainers)
Top GitHub Comments
@simon-mo The function signature will have to change in your example as well, and the new task will have to do a ray.get on the list passed in.
Thanks @edoakes and @simon-mo!