
Adding `_max_workers` attribute to `MPIPoolExecutor` subclass


In concurrent.futures, the implementations define an attribute _max_workers for the ThreadPoolExecutor and the ProcessPoolExecutor, which shows how many workers can be running tasks at a time. It would be helpful if MPIPoolExecutor also defined _max_workers so it could be queried there as well. This can help when deciding how many tasks to dispatch to the Executor or how to batch them.
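
For illustration (not part of the original issue), a minimal sketch of querying this attribute on the standard executors; note that _max_workers is a private detail that CPython happens to set, not a documented API:

from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

with ThreadPoolExecutor(max_workers=4) as executor:
    print(executor._max_workers)  # 4 (private attribute set by CPython)

with ProcessPoolExecutor(max_workers=2) as executor:
    print(executor._max_workers)  # 2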

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 8 (5 by maintainers)

Top GitHub Comments

1 reaction
jakirkham commented, Sep 29, 2021

Thanks for the context, Lisandro 🙂 Sorry for the slow reply

Not quite. Perhaps it helps to see my use case?

In Dask we somewhat recently added support for using concurrent.futures-based Executors. By default this will use the ThreadPoolExecutor, but users could switch to processes, which would use the ProcessPoolExecutor. This becomes more interesting when users bring their own Executor to support other environments (https://github.com/dask/dask/issues/6220). As a result we can help users get started with Dask in environments they know well, as long as there is an Executor they can leverage.

MPI support is a commonly requested thing. We do have Dask-MPI, which does support launching the Distributed Dask cluster with mpi4py. This may be a bit heavyweight for a user who is just trying to get started and play around a little bit. Being able to start an MPIPoolExecutor reasonably quickly and plug that into Dask could be helpful here.

However, when scheduling work it can be useful to know how many workers can be used. This helps us make decisions about how to batch work to an individual worker or how many tasks we can reasonably run, and unfortunately we can’t really get by without this information. For most Executors, we have seen that _max_workers works well or can easily be added, so we check for that attribute; without it we are left to guess.
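
A rough sketch of the kind of check described above (hypothetical helper name; the actual Dask logic differs):

from concurrent.futures import ThreadPoolExecutor

def get_num_workers(executor, default=1):
    # Use the executor's _max_workers when it exposes one; otherwise guess.
    return getattr(executor, "_max_workers", None) or default

print(get_num_workers(ThreadPoolExecutor(max_workers=4)))  # 4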

So this leaves us with the question: how should we try to support MPIPoolExecutor here? From Dask’s perspective _max_workers would be ideal, as it would impose the least effort on users; if executor._pool.size were aliased that way, that would certainly do the trick, though it sounds like you have reservations about this approach. Within Dask, we could add more specific checks for different Executors, but this likely won’t scale well and we may not be in the best position to maintain those checks as implementations evolve. If neither of those works, we could document this way of getting the number of MPI workers or simply point users to this thread. Then users would run something like the code below (after the startup code you have above):

with dask.config.set(scheduler=executor, num_workers=executor._pool.size):
    x.compute()

Maybe this is an ok place to leave things? This is admittedly a relatively new use case, so it may simply need more users trying things out and providing feedback about how they would like things to work. That would hopefully clarify any next steps that we are unsure of at the moment.

1 reaction
dalcinl commented, Sep 18, 2021

I’m a bit hesitant to implement and support private APIs. Some internal details of mpi4py.futures are a bit different from concurrent.futures. For example, concurrent.futures creates workers on demand, adjusting the number of workers as tasks arrive, up to max_workers. mpi4py.futures asks the MPI runtime to spawn max_workers processes in “soft” mode, which means that the runtime may spawn fewer processes if the resources are not available. All this is MPI standard semantics (actual support and behavior depends on the backend MPI implementation). Therefore, a _max_workers attribute in mpi4py.futures could be misleading; that’s the reason I do not expose it. What you really want is the actual number of MPI worker processes, not the upper bound requested by users.

All that being said, I would like to offer you an alternative.

There is an easy way (albeit still relying on private API) to know the actual number of MPI workers:

executor = MPIPoolExecutor(...)
executor.bootup()  # block until workers are up and running
num_workers = executor._pool.size  # this is the actual number of MPI workers
executor.submit(...)

We are using this approach for our own testing.

If you want to inject a _max_workers attribute for compatibility with concurrent.futures, then it is as easy as picking one of the following options (see the sketch after this list):

  • Use a factory function to create the executor and set the _max_workers attribute in the executor instance.
  • Subclass MPIPoolExecutor and add a @property getter to get _max_workers.
  • Monkey-patch the MPIPoolExecutor class by injecting a property getter to get _max_workers.
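
A rough sketch of these options (hypothetical names; relies on the private executor._pool API shown above, and assumes _pool is unset/None until the workers have been spawned). These are alternatives and should not be combined; in particular, the monkey-patch in option 3 would prevent the instance assignment in option 1:

from mpi4py.futures import MPIPoolExecutor

# Option 1: factory function that records the actual pool size on the instance.
def make_mpi_executor(*args, **kwargs):
    executor = MPIPoolExecutor(*args, **kwargs)
    executor.bootup()  # block until the workers are up
    executor._max_workers = executor._pool.size
    return executor

# Option 2: subclass with a property getter.
class MPIPoolExecutorWithMax(MPIPoolExecutor):
    @property
    def _max_workers(self):
        # None until the workers have been spawned
        return self._pool.size if self._pool is not None else None

# Option 3: monkey-patch the class with the same property.
MPIPoolExecutor._max_workers = property(
    lambda self: self._pool.size if self._pool is not None else None
)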

mpi4py could expose a convenience num_workers attribute (implemented via a property getter). But then we would need to decide on the semantics (return value) between the time the executor is created and the time the workers are up and running (return None? block until workers are up and return the number of workers?). FYI, creating an executor does not block until the workers have been spawned (that’s the reason for the bootup() call in the code above). Also, we would need to decide on the semantics after the executor has shut down (return None? return the number of workers the executor used while alive? raise an exception?).
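
To make the question concrete, a hypothetical helper (not mpi4py API) showing where the ambiguity lies:

def num_workers(executor):
    # Hypothetical semantics: None until the workers have been spawned,
    # the actual pool size afterwards. What this should return (or raise)
    # once the executor has shut down is exactly the open question above.
    pool = getattr(executor, "_pool", None)
    return pool.size if pool is not None else None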
