The untuned fp16 stable diffusion model crashed
Hi,
I’m trying to run the torch-imported fp16 stable diffusion model, and it seems to crash during the import stage. Here’s the command line I used:
python main.py --precision="fp16" --prompt="dog" --device="cuda" --import_mlir
The GDB backtrace is as follows (a minimal sketch of the import path it exercises is included after the trace):
Thread 1 "python" received signal SIGSEGV, Segmentation fault.
0x00007ffebbf10a8a in mlir::Type::isInteger(unsigned int) const () from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/libTorchMLIRAggregateCAPI.so
(gdb) bt
#0 0x00007ffebbf10a8a in mlir::Type::isInteger(unsigned int) const () from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/libTorchMLIRAggregateCAPI.so
#1 0x00007ffebd2fb0f8 in (anonymous namespace)::TypeAnalysis::visitOperation(mlir::Operation*, llvm::ArrayRef<mlir::dataflow::Lattice<(anonymous namespace)::ValueKnowledge> const*>, llvm::ArrayRef<mlir::dataflow::Lattice<(anonymous namespace)::ValueKnowledge>*>) ()
from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/libTorchMLIRAggregateCAPI.so
#2 0x00007ffebec5a191 in mlir::dataflow::AbstractSparseDataFlowAnalysis::visitOperation(mlir::Operation*) () from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/libTorchMLIRAggregateCAPI.so
#3 0x00007ffebec5b07c in mlir::dataflow::AbstractSparseDataFlowAnalysis::initializeRecursively(mlir::Operation*) () from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/libTorchMLIRAggregateCAPI.so
#4 0x00007ffebec5b12b in mlir::dataflow::AbstractSparseDataFlowAnalysis::initializeRecursively(mlir::Operation*) () from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/libTorchMLIRAggregateCAPI.so
#5 0x00007ffebec3cecd in mlir::DataFlowSolver::initializeAndRun(mlir::Operation*) () from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/libTorchMLIRAggregateCAPI.so
#6 0x00007ffebd2fc768 in (anonymous namespace)::RefineTypesPass::runOnOperation() () from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/libTorchMLIRAggregateCAPI.so
#7 0x00007ffebbda0979 in mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) () from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/libTorchMLIRAggregateCAPI.so
#8 0x00007ffebbda1341 in mlir::detail::OpToOpPassAdaptor::runPipeline(mlir::OpPassManager&, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int, mlir::PassInstrumentor*, mlir::PassInstrumentation::PipelineParentInfo const*) ()
from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/libTorchMLIRAggregateCAPI.so
#9 0x00007ffebbd9fd96 in mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool) () from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/libTorchMLIRAggregateCAPI.so
#10 0x00007ffebbda071e in mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) () from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/libTorchMLIRAggregateCAPI.so
#11 0x00007ffebbda1341 in mlir::detail::OpToOpPassAdaptor::runPipeline(mlir::OpPassManager&, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int, mlir::PassInstrumentor*, mlir::PassInstrumentation::PipelineParentInfo const*) ()
from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/libTorchMLIRAggregateCAPI.so
#12 0x00007ffebbda1913 in mlir::LogicalResult llvm::function_ref<mlir::LogicalResult (mlir::OpPassManager&, mlir::Operation*)>::callback_fn<mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int)::{lambda(mlir::OpPassManager&, mlir::Operation*)#1}>(long, mlir::OpPassManager&, mlir::Operation*) () from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/libTorchMLIRAggregateCAPI.so
#13 0x00007ffebd2e32ed in (anonymous namespace)::LowerToBackendContractPass::runOnOperation() () from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/libTorchMLIRAggregateCAPI.so
#14 0x00007ffebbda0979 in mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) () from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/libTorchMLIRAggregateCAPI.so
#15 0x00007ffebbda1341 in mlir::detail::OpToOpPassAdaptor::runPipeline(mlir::OpPassManager&, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int, mlir::PassInstrumentor*, mlir::PassInstrumentation::PipelineParentInfo const*) ()
from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/libTorchMLIRAggregateCAPI.so
#16 0x00007ffebbda236c in mlir::PassManager::run(mlir::Operation*) () from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/libTorchMLIRAggregateCAPI.so
#17 0x00007ffebbd42e99 in mlirPassManagerRun () from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/libTorchMLIRAggregateCAPI.so
#18 0x00007ffeda7b3271 in ?? () from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/_mlir.cpython-310-x86_64-linux-gnu.so
#19 0x00007ffeda71886e in ?? () from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/_mlir.cpython-310-x86_64-linux-gnu.so
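For reference, here is a minimal sketch of the torch-mlir import path the backtrace above goes through. It is not from the issue: the module, shapes, and output type are illustrative stand-ins for the fp16 UNet, chosen only to show where the RefineTypes pass from frame #6 runs.

import torch
import torch_mlir

class TinyFp16Model(torch.nn.Module):
    def forward(self, x):
        # Simple elementwise ops so the module scripts cleanly in fp16.
        return torch.tanh(x) * 2.0

# Hypothetical toy module cast to fp16, standing in for the real UNet.
model = TinyFp16Model().half().eval()
example = torch.randn(1, 4, 64, 64, dtype=torch.float16)

# torch_mlir.compile runs the torch-mlir pass pipeline (including
# RefineTypes and LowerToBackendContract from the backtrace) on the
# imported module; this is the stage where the fp16 import segfaults.
module = torch_mlir.compile(
    model, example, output_type=torch_mlir.OutputType.LINALG_ON_TENSORS
)
print(module)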
For the fp16 version: you need this branch of torch-mlir: https://github.com/pashu123/torch-mlir/tree/refine_check. You also need to build torch-mlir against the CUDA version of torch.
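As a quick sanity check (not part of the original thread), one way to confirm before re-running the import that the environment pairs the patched torch-mlir with a CUDA build of torch:

import torch
import torch_mlir

# Confirm torch is a CUDA build and a device is visible; torch-mlir
# built against a CPU-only torch is a common mismatch.
print("torch version:", torch.__version__)
print("torch CUDA build:", torch.version.cuda)
print("CUDA available:", torch.cuda.is_available())

# Show which torch-mlir install is actually being picked up.
print("torch-mlir package:", torch_mlir.__file__)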
Thanks @powderluv. Right, the tuned fp16 MLIR input (unet_fp16_tunedv2_torch.mlir) seems to work only with Vulkan.