New microsoft/bloom-deepspeed-inference-fp16 weights not working with DeepSpeed MII
The new microsoft/bloom-deepspeed-inference-fp16 and microsoft/bloom-deepspeed-inference-int8 weights are not working with DeepSpeed MII.
Traceback (most recent call last):
  File "scripts/bloom-inference-server/server.py", line 83, in <module>
    model = DSInferenceGRPCServer(args)
  File "/net/llm-shared-nfs/nfs/mayank/BigScience-Megatron-DeepSpeed/scripts/bloom-inference-server/ds_inference/grpc_server.py", line 36, in __init__
    mii.deploy(
  File "/net/llm-shared-nfs/nfs/yelkurdi/conda/miniconda3/envs/llmpt/lib/python3.8/site-packages/mii/deployment.py", line 70, in deploy
    mii.utils.check_if_task_and_model_is_valid(task, model)
  File "/net/llm-shared-nfs/nfs/yelkurdi/conda/miniconda3/envs/llmpt/lib/python3.8/site-packages/mii/utils.py", line 108, in check_if_task_and_model_is_valid
    assert (
AssertionError: text-generation only supports [.....]
The list of supported models does not contain the new weights.
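The failure mode can be illustrated with a minimal sketch of the kind of allow-list check MII performs (this is a hypothetical re-implementation, not MII's actual code; the model list here is assumed, and `check_if_task_and_model_is_valid` in `mii/utils.py` is only mirrored by name):

```python
# Hypothetical sketch of MII's per-task model validation.
# The supported-model list below is an assumption for illustration only.
SUPPORTED_TEXT_GENERATION_MODELS = [
    "bigscience/bloom",  # assumed allow-listed entry
    "gpt2",              # assumed allow-listed entry
]

def check_if_task_and_model_is_valid(task, model_name):
    # Look up the allow-list for the task and assert membership,
    # mirroring the assertion seen in the traceback above.
    supported = {"text-generation": SUPPORTED_TEXT_GENERATION_MODELS}[task]
    assert model_name in supported, f"{task} only supports {supported}"

# The new DeepSpeed weight repos are not in the list, so the check raises:
try:
    check_if_task_and_model_is_valid(
        "text-generation", "microsoft/bloom-deepspeed-inference-fp16"
    )
except AssertionError as e:
    print("AssertionError:", e)
```

Because the check compares the repo name against a fixed list, any newly published weight repository fails until the list is updated upstream.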
Issue Analytics
- State:
- Created a year ago
- Comments: 12 (6 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
@mrwyattii I believe your commit yesterday has fixed this? Let me know. I am closely watching this repo 😃
https://github.com/huggingface/transformers-bloom-inference/blob/abe365066fec6e03ce0ea2cc8136f2da1254e2ea/bloom-inference-server/ds_inference/grpc_server.py#L33
@cderinbogaz I hacked my way around it for now: I pass the downloaded model path and the checkpoint dict for the model I need to use, and set model="bigscience/bloom".
I know this is not the most elegant way to do this 😦
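The workaround in the comment above can be sketched as follows. This is a non-authoritative sketch: the local path, shard file names, and the checkpoint-dict fields are placeholders, and the `mii.deploy` call is shown commented out since the exact keyword arguments depend on the MII version in use. The idea is simply to deploy under the allow-listed name "bigscience/bloom" while pointing the loader at the locally downloaded pre-sharded weights.

```python
# Placeholder path to the locally downloaded weight repo (assumption).
model_path = "/path/to/microsoft/bloom-deepspeed-inference-fp16"

# Checkpoint dict pointing at the pre-sharded files; the field names and
# shard list here are placeholders, not a verified schema.
checkpoint_dict = {
    "type": "BLOOM",
    "checkpoints": ["shard_0.pt"],  # placeholder shard list
    "version": 1.0,
}

# Deployment arguments: the allow-listed name passes MII's model check,
# while the real weights are loaded from model_path.
deploy_kwargs = {
    "task": "text-generation",
    "model": "bigscience/bloom",       # allow-listed name
    "deployment_name": "bloom-deployment",
    "model_path": model_path,          # actual weights loaded from here
}

# With DeepSpeed-MII installed, the deployment would then look roughly like
# (untested; argument names may differ across MII versions):
# import mii
# mii.deploy(**deploy_kwargs, checkpoint_dict=checkpoint_dict)
print(deploy_kwargs["model"])
```

The trade-off is that MII believes it is serving `bigscience/bloom`, so any model-specific defaults are inherited from that entry; this works here only because the new repos are re-shards of the same BLOOM weights.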