How to do inference using multiple GPUs for Styleformer
I am using this model to run inference on 1 million data points on A100 GPUs (4 GPUs). I am launching an inference.py script using Google's Vertex AI container. How can I make the inference code utilise all 4 GPUs so that inference is super-fast?
Here is the code I use in inference.py:
from styleformer import Styleformer
import torch
import warnings

warnings.filterwarnings("ignore")

# style = [0=Casual to Formal, 1=Formal to Casual, 2=Active to Passive, 3=Passive to Active etc..]
sf = Styleformer(style=1)

def set_seed(seed):
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)

set_seed(1212)

source_sentences = [
    "I would love to meet attractive men in town",
    "Please leave the room now",
    "It is a delicious icecream",
    "I am not paying this kind of money for that nonsense",
    "He is on cocaine and he cannot be trusted with this",
    "He is a very nice man and has a charming personality",
    "Let us go out for dinner",
    "We went to Barcelona for the weekend. We have a lot of things to tell you.",
]

for source_sentence in source_sentences:
    # inference_on = [0=Regular model On CPU, 1=Regular model On GPU, 2=Quantized model On CPU]
    target_sentence = sf.transfer(source_sentence, inference_on=1,
                                  quality_filter=0.95, max_candidates=5)
    print("[Formal] ", source_sentence)
    if target_sentence is not None:
        print("[Casual] ", target_sentence)
    else:
        print("No good quality transfers available !")
    print("-" * 100)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
@pratikchhapolika It sounds like you’ll need to fire up a separate process for each GPU and pass in inference_on=0, inference_on=1, inference_on=2, and inference_on=3, respectively, using multiprocessing.
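A minimal sketch of that one-process-per-GPU suggestion, using torch.multiprocessing. It takes the comment at its word that inference_on selects the CUDA device index, which differs from the 0=CPU/1=GPU mapping in the issue's own code comment, so verify against your Styleformer version; NUM_GPUS, worker, and the round-robin sharding are illustrative, not Styleformer API:

import torch.multiprocessing as mp
from styleformer import Styleformer

NUM_GPUS = 4  # assumption: four visible CUDA devices

def worker(gpu_id, shards, return_dict):
    # Each spawned process builds its own Styleformer instance.
    sf = Styleformer(style=1)
    results = []
    for sentence in shards[gpu_id]:
        # inference_on=gpu_id follows the suggestion above; check your
        # version's mapping (the released code documents 0=CPU, 1=GPU).
        results.append(sf.transfer(sentence, inference_on=gpu_id,
                                   quality_filter=0.95, max_candidates=5))
    return_dict[gpu_id] = results

if __name__ == "__main__":
    sentences = [
        "I would love to meet attractive men in town",
        "Please leave the room now",
    ] * 4  # stand-in for the real 1M-sentence dataset
    # Round-robin shard the data, one shard per GPU.
    shards = [sentences[i::NUM_GPUS] for i in range(NUM_GPUS)]
    manager = mp.Manager()
    return_dict = manager.dict()
    # mp.spawn calls worker(rank, *args) once per process, rank = 0..3.
    mp.spawn(worker, args=(shards, return_dict), nprocs=NUM_GPUS)
    for gpu_id in range(NUM_GPUS):
        print(f"GPU {gpu_id}: {len(return_dict[gpu_id])} transfers done")

Each process pays the model-load cost once and then works through its shard independently, which sidesteps the fact that sf.transfer() has no batch or multi-device mode.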
@PrithivirajDamodaran What I would like to know is how one can batchify Styleformer inference tasks to make efficient use of GPUs that have 48GB or 80GB each.
@PrithivirajDamodaran How’s the batch patch coming along?
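On the batching question above: since sf.transfer() handles one sentence at a time, one way to batchify is to drop down to the underlying Hugging Face seq2seq checkpoint and call generate() on padded batches. This is a sketch, not Styleformer's API: the checkpoint name and task prefix are assumptions based on the Styleformer model cards and source, and the sketch skips Styleformer's quality_filter re-ranking entirely.

import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Assumed style=1 (Formal to Casual) checkpoint; verify on the Hub.
CKPT = "prithivida/formal_to_informal_styletransfer"
tokenizer = AutoTokenizer.from_pretrained(CKPT)
model = AutoModelForSeq2SeqLM.from_pretrained(CKPT).to("cuda").eval()

def transfer_batch(sentences, batch_size=64):
    outputs = []
    for i in range(0, len(sentences), batch_size):
        # Task prefix assumed from Styleformer's source.
        batch = ["transfer Formal to Casual: " + s
                 for s in sentences[i:i + batch_size]]
        enc = tokenizer(batch, return_tensors="pt",
                        padding=True, truncation=True).to("cuda")
        with torch.no_grad():
            generated = model.generate(**enc, max_length=64, num_beams=5)
        outputs.extend(tokenizer.batch_decode(generated,
                                              skip_special_tokens=True))
    return outputs

print(transfer_batch(["Please leave the room now",
                      "Let us go out for dinner"]))

Larger batch_size values are what actually fill a 48GB or 80GB card; tune it until GPU memory is close to full for the chosen max_length.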