Still a problem with fp16 and without autocast
Describe the bug
Hello! This issue has been discussed often, and as far as I understand it has since been fixed. But I still get a
RuntimeError: expected scalar type Half but found Float
when I try to run the model with fp16 but without autocast.
My code:
```
pipe0 = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    use_auth_token="hf_LFWSneVmdLYPKbkIRpCrCKVxx",
    revision="fp16",
    torch_dtype=torch.float16
).to("cuda:0")
image = pipe0(a, num_inference_steps=int(d), width=int(e), height=int(f), guidance_scale=float(c))["sample"]
```
I am on GPU (a 3090) and have the latest build, 0.4.1. I tried different schedulers, but whenever I do not use autocast I get this error message (here with the default scheduler):
```
File "generator1GPU.py", line 134, in heron1
image = pipe0(a, num_inference_steps=int(d), width=int(e), height=int(f), guidance_scale=float(c))["sample"]
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py", line 219, in __call__
text_embeddings = self.text_encoder(text_input_ids.to(self.device))[0]
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/transformers/models/clip/modeling_clip.py", line 722, in forward
return self.text_model(
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/transformers/models/clip/modeling_clip.py", line 643, in forward
encoder_outputs = self.encoder(
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/transformers/models/clip/modeling_clip.py", line 574, in forward
layer_outputs = encoder_layer(
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/transformers/models/clip/modeling_clip.py", line 317, in forward
hidden_states, attn_weights = self.self_attn(
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/transformers/models/clip/modeling_clip.py", line 257, in forward
attn_output = torch.bmm(attn_probs, value_states)
RuntimeError: expected scalar type Half but found Float
```
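The failing line, `torch.bmm(attn_probs, value_states)`, is a plain dtype mismatch: one operand ends up in fp16 and the other in fp32. A minimal stand-alone reproduction in plain PyTorch (no diffusers needed; the tensor shapes are made up for illustration), together with the generic fix of casting both operands to a common dtype:

```python
import torch

# one operand in half precision, the other in full precision,
# mirroring attn_probs vs. value_states in the CLIP attention layer
attn_probs = torch.randn(1, 4, 4, dtype=torch.float16)
value_states = torch.randn(1, 4, 8, dtype=torch.float32)

try:
    torch.bmm(attn_probs, value_states)  # raises the dtype-mismatch RuntimeError
except RuntimeError as e:
    print(e)

# casting both operands to a common dtype resolves it without autocast
out = torch.bmm(attn_probs.float(), value_states)
print(out.dtype)  # torch.float32
```

This is also what autocast does implicitly: it casts the inputs of eligible ops to a matching dtype, which is why the pipeline only runs cleanly with autocast enabled.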
Or, when using another scheduler, the error trace differs slightly but points to the same error source; here with DDIMScheduler:
```
text_embeddings = self.text_encoder(text_input_ids.to(self.device))[0]
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/transformers/models/clip/modeling_clip.py", line 722, in forward
return self.text_model(
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/transformers/models/clip/modeling_clip.py", line 643, in forward
encoder_outputs = self.encoder(
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/transformers/models/clip/modeling_clip.py", line 574, in forward
layer_outputs = encoder_layer(
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/transformers/models/clip/modeling_clip.py", line 317, in forward
hidden_states, attn_weights = self.self_attn(
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/transformers/models/clip/modeling_clip.py", line 257, in forward
attn_output = torch.bmm(attn_probs, value_states)
RuntimeError: expected scalar type Half but found Float
```
Best regards Marc
Reproduction
No response
Logs
No response
System Info
- diffusers version: 0.4.1
- Platform: Linux-5.4.0-124-generic-x86_64-with-glibc2.10
- Python version: 3.8.5
- PyTorch version (GPU?): 1.11.0 (True)
- Huggingface_hub version: 0.10.0
- Transformers version: 4.19.2
EDIT: To my surprise, the following code works without throwing errors, BUT the processing speed is exactly the same as in fp32 mode. Only when adding autocast is it faster. Apparently, when the code is written this way, the model does not actually run in fp16, so there are no errors but also no speed-up.
```
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_type=torch.float16,  # note: the valid keyword is torch_dtype; this typo likely explains why fp16 never took effect
    revision="fp16",
    use_auth_token="hf_LFWSneVmdLYPKbkIRpCrCKVnqgRxx"
)
pipe = pipe.to("cuda:0")
prompt = "a photo of an astronaut riding a horse on mars"
image = pipe(prompt)["sample"]
image[0].save("astronaut.jpg")
```
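A likely reason the snippet above runs but shows no speed-up: `from_pretrained` expects the keyword `torch_dtype`, so a misspelled `torch_type` is simply ignored and the weights stay in fp32. A quick way to check whether a cast actually happened, sketched here with a plain `nn.Linear` standing in for a pipeline component (assumption: in diffusers you would inspect e.g. `pipe.unet` the same way):

```python
import torch

# stand-in module; imagine this is pipe.unet
unet = torch.nn.Linear(8, 8)
print(next(unet.parameters()).dtype)  # torch.float32: no cast has happened

unet = unet.half()  # what torch_dtype=torch.float16 is supposed to achieve
print(next(unet.parameters()).dtype)  # torch.float16
```

If the parameters still report `torch.float32` after loading, the model is doing full-precision math regardless of what the `revision="fp16"` checkpoint name suggests, which matches the observed fp32-level speed and VRAM usage.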
Top GitHub Comments
What the … It is working now!! I changed my code a thousand times; in the end I came back to the code I posted here, and now it works! 14.46 it/s, that is nice! 😃 But why does it suddenly work? The VRAM is now also only filled with 4.5 GB. Strange. I will completely restart my instance to see what happens then.
Best regards Marc
unet is a speed monster!