The untuned fp16 stable diffusion model crashed
Hi,
I’m trying to run the torch-imported fp16 stable diffusion model, and it seems to crash during the import stage. Here’s the command line I used:
python main.py --precision="fp16" --prompt="dog" --device="cuda" --import_mlir
The GDB backtrace is as follows (a minimal sketch of the import path it exercises is included after the trace):
Thread 1 "python" received signal SIGSEGV, Segmentation fault.
0x00007ffebbf10a8a in mlir::Type::isInteger(unsigned int) const () from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/libTorchMLIRAggregateCAPI.so
(gdb) bt
#0 0x00007ffebbf10a8a in mlir::Type::isInteger(unsigned int) const () from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/libTorchMLIRAggregateCAPI.so
#1 0x00007ffebd2fb0f8 in (anonymous namespace)::TypeAnalysis::visitOperation(mlir::Operation*, llvm::ArrayRef<mlir::dataflow::Lattice<(anonymous namespace)::ValueKnowledge> const*>, llvm::ArrayRef<mlir::dataflow::Lattice<(anonymous namespace)::ValueKnowledge>*>) ()
from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/libTorchMLIRAggregateCAPI.so
#2 0x00007ffebec5a191 in mlir::dataflow::AbstractSparseDataFlowAnalysis::visitOperation(mlir::Operation*) () from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/libTorchMLIRAggregateCAPI.so
#3 0x00007ffebec5b07c in mlir::dataflow::AbstractSparseDataFlowAnalysis::initializeRecursively(mlir::Operation*) () from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/libTorchMLIRAggregateCAPI.so
#4 0x00007ffebec5b12b in mlir::dataflow::AbstractSparseDataFlowAnalysis::initializeRecursively(mlir::Operation*) () from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/libTorchMLIRAggregateCAPI.so
#5 0x00007ffebec3cecd in mlir::DataFlowSolver::initializeAndRun(mlir::Operation*) () from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/libTorchMLIRAggregateCAPI.so
#6 0x00007ffebd2fc768 in (anonymous namespace)::RefineTypesPass::runOnOperation() () from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/libTorchMLIRAggregateCAPI.so
#7 0x00007ffebbda0979 in mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) () from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/libTorchMLIRAggregateCAPI.so
#8 0x00007ffebbda1341 in mlir::detail::OpToOpPassAdaptor::runPipeline(mlir::OpPassManager&, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int, mlir::PassInstrumentor*, mlir::PassInstrumentation::PipelineParentInfo const*) ()
from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/libTorchMLIRAggregateCAPI.so
#9 0x00007ffebbd9fd96 in mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool) () from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/libTorchMLIRAggregateCAPI.so
#10 0x00007ffebbda071e in mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) () from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/libTorchMLIRAggregateCAPI.so
#11 0x00007ffebbda1341 in mlir::detail::OpToOpPassAdaptor::runPipeline(mlir::OpPassManager&, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int, mlir::PassInstrumentor*, mlir::PassInstrumentation::PipelineParentInfo const*) ()
from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/libTorchMLIRAggregateCAPI.so
#12 0x00007ffebbda1913 in mlir::LogicalResult llvm::function_ref<mlir::LogicalResult (mlir::OpPassManager&, mlir::Operation*)>::callback_fn<mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int)::{lambda(mlir::OpPassManager&, mlir::Operation*)#1}>(long, mlir::OpPassManager&, mlir::Operation*) () from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/libTorchMLIRAggregateCAPI.so
#13 0x00007ffebd2e32ed in (anonymous namespace)::LowerToBackendContractPass::runOnOperation() () from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/libTorchMLIRAggregateCAPI.so
#14 0x00007ffebbda0979 in mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) () from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/libTorchMLIRAggregateCAPI.so
#15 0x00007ffebbda1341 in mlir::detail::OpToOpPassAdaptor::runPipeline(mlir::OpPassManager&, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int, mlir::PassInstrumentor*, mlir::PassInstrumentation::PipelineParentInfo const*) ()
from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/libTorchMLIRAggregateCAPI.so
#16 0x00007ffebbda236c in mlir::PassManager::run(mlir::Operation*) () from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/libTorchMLIRAggregateCAPI.so
#17 0x00007ffebbd42e99 in mlirPassManagerRun () from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/libTorchMLIRAggregateCAPI.so
#18 0x00007ffeda7b3271 in ?? () from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/_mlir.cpython-310-x86_64-linux-gnu.so
#19 0x00007ffeda71886e in ?? () from /home/scratch.yuayao_inf/environments/iree-ubuntu/lib/python3.10/site-packages/torch_mlir/_mlir_libs/_mlir.cpython-310-x86_64-linux-gnu.so
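For reference, here is a minimal sketch of the torch-mlir import path the backtrace above goes through. It is not from the issue: the module, shapes, and output type are illustrative stand-ins for the fp16 UNet, chosen only to show where the RefineTypes pass from frame #6 runs.

import torch
import torch_mlir

class TinyFp16Model(torch.nn.Module):
    def forward(self, x):
        # Simple elementwise ops so the module scripts cleanly in fp16.
        return torch.tanh(x) * 2.0

# Hypothetical toy module cast to fp16, standing in for the real UNet.
model = TinyFp16Model().half().eval()
example = torch.randn(1, 4, 64, 64, dtype=torch.float16)

# torch_mlir.compile runs the torch-mlir pass pipeline (including
# RefineTypes and LowerToBackendContract from the backtrace) on the
# imported module; this is the stage where the fp16 import segfaults.
module = torch_mlir.compile(
    model, example, output_type=torch_mlir.OutputType.LINALG_ON_TENSORS
)
print(module)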
For the fp16 version: you need this branch of torch-mlir: https://github.com/pashu123/torch-mlir/tree/refine_check. You also need to build torch-mlir against the CUDA version of torch.
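As a quick sanity check (not part of the original thread), one way to confirm before re-running the import that the environment pairs the patched torch-mlir with a CUDA build of torch:

import torch
import torch_mlir

# Confirm torch is a CUDA build and a device is visible; torch-mlir
# built against a CPU-only torch is a common mismatch.
print("torch version:", torch.__version__)
print("torch CUDA build:", torch.version.cuda)
print("CUDA available:", torch.cuda.is_available())

# Show which torch-mlir install is actually being picked up.
print("torch-mlir package:", torch_mlir.__file__)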
Thanks @powderluv. Right, the tuned fp16 MLIR input (unet_fp16_tunedv2_torch.mlir) seems to work only with Vulkan.