Error building the "sparse_attn" extension with ninja
See original GitHub issue. Environment: Python 3.6.7, PyTorch 1.5.0, torchvision 0.6.0, CUDA 10.2, GCC 5.4.0, Ubuntu 16.04.5 LTS.
Using /tmp/torch_extensions as PyTorch extensions root...
Emitting ninja build file /tmp/torch_extensions/sparse_attn/build.ninja...
Building extension module sparse_attn...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/2] c++ -MMD -MF utils.o.d -DTORCH_EXTENSION_NAME=sparse_attn -DTORCH_API_INCLUDE_EXTENSION_H -isystem /usr/local/lib/python3.6/dist-packages/torch/include -isystem /usr/local/lib/python3.6/dist-packages/torch/include/torch/csrc/api/include -isystem /usr/local/lib/python3.6/dist-packages/torch/include/TH -isystem /usr/local/lib/python3.6/dist-packages/torch/include/THC -isystem /usr/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -O2 -fopenmp -c /usr/local/lib/python3.6/dist-packages/deepspeed/ops/csrc/sparse_attention/utils.cpp -o utils.o
FAILED: utils.o
c++ -MMD -MF utils.o.d -DTORCH_EXTENSION_NAME=sparse_attn -DTORCH_API_INCLUDE_EXTENSION_H -isystem /usr/local/lib/python3.6/dist-packages/torch/include -isystem /usr/local/lib/python3.6/dist-packages/torch/include/torch/csrc/api/include -isystem /usr/local/lib/python3.6/dist-packages/torch/include/TH -isystem /usr/local/lib/python3.6/dist-packages/torch/include/THC -isystem /usr/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -O2 -fopenmp -c /usr/local/lib/python3.6/dist-packages/deepspeed/ops/csrc/sparse_attention/utils.cpp -o utils.o
/usr/local/lib/python3.6/dist-packages/deepspeed/ops/csrc/sparse_attention/utils.cpp: In function 'void segment_blocks(at::Tensor, at::Tensor, at::Tensor, int, ret_t&)':
/usr/local/lib/python3.6/dist-packages/deepspeed/ops/csrc/sparse_attention/utils.cpp:87:71: error: converting to 'std::vector<std::tuple<int, at::Tensor> >::value_type {aka std::tuple<int, at::Tensor>}' from initializer list would use explicit constructor 'constexpr std::tuple<_T1, _T2>::tuple(_U1&&, _U2&&) [with _U1 = int&; _U2 = at::Tensor; <template-parameter-2-3> = void; _T1 = int; _T2 = at::Tensor]'
if (!to_cat.empty()) ret.push_back({max_width, torch::cat(to_cat)});
^
/usr/local/lib/python3.6/dist-packages/deepspeed/ops/csrc/sparse_attention/utils.cpp: In function 'ret_t sdd_segment(at::Tensor, int)':
/usr/local/lib/python3.6/dist-packages/deepspeed/ops/csrc/sparse_attention/utils.cpp:110:90: warning: narrowing conversion of 'H' from 'size_t {aka long unsigned int}' to 'long int' inside { } [-Wnarrowing]
torch::Tensor scratch = torch::empty({H, layout.sum().item<int>(), 4}, layout.dtype());
^
/usr/local/lib/python3.6/dist-packages/deepspeed/ops/csrc/sparse_attention/utils.cpp:110:90: warning: narrowing conversion of 'H' from 'size_t {aka long unsigned int}' to 'long int' inside { } [-Wnarrowing]
ninja: build stopped: subcommand failed.
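The hard error comes from GCC 5.4's libstdc++, where the two-argument std::tuple constructor is explicit under -std=c++14, so the copy-list-initialization {max_width, torch::cat(to_cat)} inside push_back cannot call it. GCC 6 and later ship the C++17 "conditionally explicit" tuple constructors as a defect-report fix, which is why upgrading the compiler (see the comments below) makes this compile. A minimal, self-contained sketch of the failing pattern and two portable fixes — at::Tensor is stubbed with a placeholder struct here, so this is an illustration, not DeepSpeed's actual code:

    // repro.cpp — compile with: g++ -std=c++14 repro.cpp
    #include <tuple>
    #include <vector>

    struct Tensor {};  // stand-in for at::Tensor

    using ret_t = std::vector<std::tuple<int, Tensor>>;

    int main() {
        ret_t ret;
        int max_width = 4;
        Tensor t;

        // Rejected by GCC 5.4 under -std=c++14: copy-list-initialization
        // cannot invoke the explicit std::tuple constructor in its libstdc++.
        // ret.push_back({max_width, t});

        // Either form compiles on GCC 5.x and later:
        ret.push_back(std::make_tuple(max_width, t));  // make_tuple deduces tuple<int, Tensor>
        ret.emplace_back(max_width, t);                // direct-init may use the explicit ctor
    }

The two -Wnarrowing lines, by contrast, are only warnings — H is a size_t placed into torch::empty's braced list of int64_t, which a static_cast<int64_t>(H) would silence — and are not what stops the build.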
The failed build then surfaces during training as:
ERROR [01/26 16:37:15 fastreid.engine.train_loop]: Exception during training:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/torch/utils/cpp_extension.py", line 1400, in _run_ninja_build
check=True)
File "/usr/lib/python3.6/subprocess.py", line 418, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "./fastreid/engine/train_loop.py", line 121, in train
self.run_step()
File "./fastreid/engine/train_loop.py", line 200, in run_step
outputs = self.model(data)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/parallel/data_parallel.py", line 153, in forward
return self.module(*inputs[0], **kwargs[0])
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "./fastreid/modeling/meta_arch/baseline.py", line 58, in forward
features = self.backbone(images) # (bs, 2048, 16, 8)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "./fastreid/modeling/backbones/sparse_transformer.py", line 415, in forward
x = self.forward_features(x)
File "./fastreid/modeling/backbones/sparse_transformer.py", line 408, in forward_features
x = blk(x)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "./fastreid/modeling/backbones/sparse_transformer.py", line 257, in forward
x = x + self.drop_path(self.attn(self.norm1(x)))
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "./fastreid/modeling/backbones/sparse_transformer.py", line 235, in forward
x = self.sparse_self_attn(q, k, v).transpose(1, 2).reshape(B, N, C)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/deepspeed/ops/sparse_attention/sparse_self_attention.py", line 152, in forward
attn_output_weights = sparse_dot_sdd_nt(query, key)
File "/usr/local/lib/python3.6/dist-packages/deepspeed/ops/sparse_attention/matmul.py", line 712, in __call__
db_lut, db_num_locks, db_width, db_packs = self.make_lut(a.dtype, a.device)
File "/usr/local/lib/python3.6/dist-packages/deepspeed/ops/sparse_attention/matmul.py", line 634, in make_lut
c_lut, c_num_locks, c_width, c_packs = _sparse_matmul.make_sdd_lut(layout, block, dtype, device)
File "/usr/local/lib/python3.6/dist-packages/deepspeed/ops/sparse_attention/matmul.py", line 99, in make_sdd_lut
_sparse_matmul._load_utils()
File "/usr/local/lib/python3.6/dist-packages/deepspeed/ops/sparse_attention/matmul.py", line 94, in _load_utils
_sparse_matmul.cpp_utils = SparseAttnBuilder().load()
File "/usr/local/lib/python3.6/dist-packages/deepspeed/ops/op_builder/builder.py", line 180, in load
return self.jit_load(verbose)
File "/usr/local/lib/python3.6/dist-packages/deepspeed/ops/op_builder/builder.py", line 216, in jit_load
verbose=verbose)
File "/usr/local/lib/python3.6/dist-packages/torch/utils/cpp_extension.py", line 898, in load
is_python_module)
File "/usr/local/lib/python3.6/dist-packages/torch/utils/cpp_extension.py", line 1086, in _jit_compile
with_cuda=with_cuda)
File "/usr/local/lib/python3.6/dist-packages/torch/utils/cpp_extension.py", line 1186, in _write_ninja_file_and_build_library
error_prefix="Error building extension '{}'".format(name))
File "/usr/local/lib/python3.6/dist-packages/torch/utils/cpp_extension.py", line 1413, in _run_ninja_build
raise RuntimeError(message)
RuntimeError: Error building extension 'sparse_attn'
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Thanks! I updated gcc/g++ from 5.4 to 7.5, and then the build succeeded:
sudo apt-get install gcc-7 g++-7
cd /usr/bin
sudo rm gcc
sudo ln -s gcc-7 gcc
sudo rm g++
sudo ln -s g++-7 g++
Thanks, it worked after I updated gcc and the other components.
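For anyone who cannot upgrade the system compiler, the error should also disappear by patching the one offending line in the DeepSpeed source so it no longer relies on copy-list-initialization — a sketch of the change at the path and line reported in the log above (untested against other DeepSpeed versions):

    // deepspeed/ops/csrc/sparse_attention/utils.cpp, line 87
    // before (rejected by GCC 5.4's libstdc++):
    if (!to_cat.empty()) ret.push_back({max_width, torch::cat(to_cat)});
    // after (direct construction via emplace_back, which is allowed to
    // call the explicit std::tuple constructor):
    if (!to_cat.empty()) ret.emplace_back(max_width, torch::cat(to_cat));

After editing, the cached build directory (/tmp/torch_extensions/sparse_attn, per the log) has to be removed so the JIT loader re-runs ninja against the patched file.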