# Implement `pipeline.to(device)`
Currently, pipeline modules are moved to the preferred compute device during `__call__`. This is reasonable, as they stay there as long as the user keeps passing the same `torch_device` across calls.
However, in multi-GPU model-serving scenarios, it could be useful to move each pipeline to a dedicated device during or immediately after instantiation. This would make it possible to create, say, 8 different pipelines and move each one to a different GPU. Doing it this way could potentially save CPU memory while preparing the service.
Currently, the workaround to achieve the same is to perform a call with dummy data immediately after instantiation.
**Describe the solution you'd like**

Ideally, the following should work:

```python
pipe = StableDiffusionPipeline.from_pretrained(model_id).to("cuda:1")
```
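In the multi-GPU serving scenario described above, this would allow something like the following sketch. It assumes the proposed `to()` method exists; `model_id` is a placeholder for whatever checkpoint the service loads:

```python
import torch
from diffusers import StableDiffusionPipeline

model_id = "..."  # placeholder: whichever checkpoint the service uses

# One pipeline per GPU, each moved to its dedicated device
# right after instantiation, before the service starts handling requests.
pipelines = [
    StableDiffusionPipeline.from_pretrained(model_id).to(f"cuda:{i}")
    for i in range(torch.cuda.device_count())
]
```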
**Describe alternatives you've considered**

Current workaround:

```python
pipe = StableDiffusionPipeline.from_pretrained(model_id)
_ = pipe(["cat"], num_inference_steps=1, torch_device="cuda:1")
```

Another alternative would be to pass the device to the initializer. This could be done in addition to adding a `to` method, but I believe it's not necessary, as `to` is familiar enough to PyTorch users.
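A minimal sketch of what such a method might look like, assuming the pipeline keeps references to its submodules as plain attributes (the mixin name and the attribute scan are assumptions for illustration, not the actual diffusers implementation):

```python
from torch import nn

class PipelineToMixin:
    """Hypothetical mixin adding a PyTorch-style `to` method to a pipeline."""

    def to(self, torch_device):
        # Move every nn.Module the pipeline holds (e.g. unet, vae,
        # text_encoder) to the target device; schedulers, tokenizers,
        # and other non-module components are left untouched.
        for component in vars(self).values():
            if isinstance(component, nn.Module):
                component.to(torch_device)
        return self  # return self so calls can be chained
```

Returning `self` keeps the familiar chaining idiom, `from_pretrained(...).to(device)`, working as in the example above.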
**Additional context**

See discussion in this Slack thread.
**Top GitHub Comments**
> @pcuenca do you wanna take a stab at it? Otherwise happy to work on it, if you are busy 😃

> @patil-suraj happy to take it! I'll do it after making some progress on the backend, unless it's urgent. I think I'd be ready to work on this later today or tomorrow, would that be ok?