Default actor resources
What is the problem?
The Ray documentation clearly states that, when not specified explicitly, actors use 1 CPU resource by default (https://docs.ray.io/en/master/actors.html#resources-with-actors). However, the default actually seems to be 0.
Ray version and other system information (Python version, TensorFlow version, OS):
I saw this behavior on Ray 0.8.4, 1.0.0, 1.1.0, and 2.0.0dev.
I’m using Python 3.7.1 and 3.8.5 on Ubuntu 16.04.
Reproduction (REQUIRED)
import ray

ray.init(num_cpus=1)

@ray.remote
class foo:
    def f(self):
        return 1

a = foo.remote()
b = foo.remote()  # There should not be enough resources for this
print(ray.get(a.f.remote()))
print(ray.get(b.f.remote()))  # This should fail (block indefinitely)
Based on the documentation, I would expect a warning that an actor could not be scheduled and is pending, and that the second ray.get would block forever. However, this is not the case: the code terminates, showing both actors were constructed even though total CPU resources were limited to 1.
In contrast, with an explicit num_cpus parameter, the code behaves as expected.
import ray

ray.init(num_cpus=1)

@ray.remote(num_cpus=1)
class foo:
    def f(self):
        return 1

a = foo.remote()
b = foo.remote()  # There should not be enough resources for this
print(ray.get(a.f.remote()))
print(ray.get(b.f.remote()))  # This should fail (block indefinitely)
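One way to demonstrate the blocking without hanging the script is to pass a timeout to ray.get; this is a sketch assuming a Ray version where ray.get accepts a timeout argument and raises GetTimeoutError when it expires (the 5-second value is an arbitrary choice):

import ray
from ray.exceptions import GetTimeoutError

ray.init(num_cpus=1)

@ray.remote(num_cpus=1)
class foo:
    def f(self):
        return 1

a = foo.remote()
b = foo.remote()  # Cannot be scheduled while `a` holds the only CPU

print(ray.get(a.f.remote()))  # Prints 1
try:
    # Give up after 5 seconds instead of blocking forever
    print(ray.get(b.f.remote(), timeout=5))
except GetTimeoutError:
    print("b is still pending; there is no free CPU to schedule it")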
- I have verified my script runs in a clean environment and reproduces the issue.
- I have verified the issue also occurs with the latest wheels.
Yeah, this is a little subtle. ray.remote and ray.remote(num_cpus=1) are not the same. ray.remote means the actor uses a CPU whenever it runs anything (the constructor, a method, etc.), but it frees the CPU when it's not using it. ray.remote(num_cpus=1) means it always keeps the CPU. The underlying actor creation task always requires a CPU, but hopefully as an end user you don't need to think about that :p
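A sketch of one way to observe this (the DefaultActor name and the one-second sleep are illustrative additions, and exact resource accounting may vary across Ray versions): with a default @ray.remote actor, ray.available_resources() should report the CPU as free again once the actor is idle.

import time
import ray

ray.init(num_cpus=1)

@ray.remote
class DefaultActor:  # hypothetical name, for illustration only
    def f(self):
        return 1

a = DefaultActor.remote()
ray.get(a.f.remote())  # Ensure the actor has started and finished its method
time.sleep(1)          # Give the scheduler a moment to refresh its resource view

# The idle default actor holds no CPU, so the single CPU should show as available
print(ray.available_resources().get("CPU", 0))  # Expected: 1.0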
We can close the issue then, right?