[Simultaneous Machine Translation-MMA]: Building 'alignment_train_cuda_binding' extension
Bug
Building the ‘alignment_train_cuda_binding’ extension fails with a CUB compilation error (Simultaneous Machine Translation example).
To Reproduce & Code sample
There is an example of how to use the MMA model (Simultaneous Machine Translation) at: https://github.com/pytorch/fairseq/blob/main/examples/simultaneous_translation/docs/ende-mma.md
- First, I followed the main page to install fairseq:
git clone https://github.com/pytorch/fairseq
cd fairseq
pip install --editable ./
I successfully installed fairseq.
- Then, I tried to run the example code:
fairseq-train \
data-bin/wmt17_en_de \
--simul-type waitk \
--waitk-lagging 3 \
--mass-preservation \
--criterion label_smoothed_cross_entropy \
--max-update 50000 \
--arch transformer_monotonic_iwslt_de_en \
--save-dir checkpoints/monotonic_wmt_en_de \
--optimizer adam \
--adam-betas '(0.9, 0.98)' \
--lr-scheduler 'inverse_sqrt' \
--warmup-init-lr 1e-7 \
--warmup-updates 4000 \
--lr 5e-4 \
--stop-min-lr 1e-9 \
--clip-norm 0.0 \
--weight-decay 0.0001 \
--dropout 0.3 \
--label-smoothing 0.1 \
--max-tokens 3584
But it fails with the following error (it stopped working after a recent commit):
ModuleNotFoundError: No module named 'alignment_train_cuda_binding'
(I noticed that this is a new module, added about 30 days ago.)
- To build the relevant extension so that the line ‘from alignment_train_cuda_binding import alignment_train_cuda’ succeeds, I set the CUDA_HOME path in ~/.bashrc and ran the following in a terminal:
python setup.py build_ext --inplace
But building the ‘alignment_train_cuda_binding’ extension failed with this error:
fatal error: cub/cub.cuh: No such file or directory
#include <cub/cub.cuh>
compilation terminated.
I searched for this error, and someone said the CUB package is probably missing.
- So I cloned the latest version of CUB and put it at the path ‘/usr/local/cuda-10.2/targets/x86_64-linux/include/cub’:
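A quick way to confirm that the header is really absent is to look for cub/cub.cuh in the include directory that nvcc searches. The sketch below is a minimal check, not an official diagnostic: the helper name has_cub_header is hypothetical, and the fallback path reuses the CUDA 10.2 location that appears in this issue. It also assumes that include under the toolkit root resolves to the same targets/x86_64-linux/include directory, which may not hold on every install.

```shell
# Hypothetical helper: succeed if a CUB umbrella header exists under an include dir.
has_cub_header() {
    [ -f "$1/cub/cub.cuh" ]
}

# Assumption: CUDA_HOME points at the toolkit root; otherwise fall back to the
# CUDA 10.2 path mentioned in this issue. Adjust for your machine.
include_dir="${CUDA_HOME:-/usr/local/cuda-10.2}/include"

if has_cub_header "$include_dir"; then
    echo "found: $include_dir/cub/cub.cuh"
else
    echo "missing: $include_dir/cub/cub.cuh"
fi
```

If the header is reported missing, the compiler has no cub/ directory on that search path, which matches the fatal error above.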
git clone https://github.com/NVIDIA/cub
/usr/local/cuda-10.2/targets/x86_64-linux/include/cub
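Two things are worth checking at this step: which directory was copied, and which CUB release it came from. A clone of the repository keeps the headers in an inner cub/ directory, so cub.cuh sits at cub/cub/cub.cuh relative to where git clone was run. The errors reported below are consistent with a release mismatch between the latest CUB and the CUDA 10.2 toolkit, though this thread does not confirm that. The sketch below copies only the inner header directory; stage_cub_headers is a hypothetical helper, and the tag in the commented example is an assumption, not a verified match for CUDA 10.2.

```shell
# Hypothetical helper: copy only the inner cub/ header directory from a
# checkout of https://github.com/NVIDIA/cub into an include directory.
stage_cub_headers() {
    clone_dir="$1"    # root of the cloned repository
    include_dir="$2"  # directory the compiler searches for headers
    if [ ! -f "$clone_dir/cub/cub.cuh" ]; then
        echo "no cub/cub.cuh under $clone_dir" >&2
        return 1
    fi
    mkdir -p "$include_dir"
    cp -R "$clone_dir/cub" "$include_dir/"
}

# Example, commented out because it needs network access.
# The tag below is an assumption; pick the release that matches your toolkit.
# git clone https://github.com/NVIDIA/cub
# git -C cub checkout 1.10.0
# stage_cub_headers ./cub "$HOME/cub-include"
```

After staging, the header should be reachable as cub/cub.cuh relative to the chosen include directory. In this issue that directory was the CUDA include path itself.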
- Then I ran the build again in a terminal:
python setup.py build_ext --inplace
But a new problem appeared:
building 'alignment_train_cuda_binding' extension
/usr/local/cuda-10.2/bin/nvcc -I xxx
.............
/usr/local/cuda-10.2/bin/../targets/x86_64-linux/include/cub/block/../iterator/cache_modified_input_iterator.cuh(116):error: a class or namespace qualified name is required
/usr/local/cuda-10.2/bin/../targets/x86_64-linux/include/cub/block/../iterator/cache_modified_input_iterator.cuh(116):error: qualified name is not allowed
/usr/local/cuda-10.2/bin/../targets/x86_64-linux/include/cub/block/../iterator/cache_modified_input_iterator.cuh(116):error: expected a ";"
/usr/local/cuda-10.2/bin/../targets/x86_64-linux/include/cub/agent/agent_merge_sort.cuh(80):error: a class or namespace qualified name is required
/usr/local/cuda-10.2/bin/../targets/x86_64-linux/include/cub/agent/agent_merge_sort.cuh(80):error: qualified name is not allowed
..............
error: command '/usr/local/cuda-10.2/bin/nvcc' failed with exit status 1
Expected behavior
Please help me solve this issue. Thanks a lot!
My guesses:
- Does CUDA 10.2 not support this module?
- Should I download an older version of the CUB library, and if so, which version?
- Or is there another way? Maybe I can install an older version of fairseq (0.10.0) that does not need the ‘alignment_train_cuda_binding’ module.
Environment
- fairseq Version : main branch; 1.0.0a0+2380a6e (confusing version number)
- PyTorch Version : 1.10+cu
- OS : Ubuntu 18.04
- How you installed fairseq : pip install --editable ./
- Python version : 3.6.8 virtualenv
- CUDA/cuDNN version : CUDA 10.2 / cuDNN not installed at the moment
- GPU models and configuration : Quadro RTX 5000
- Any other relevant information :
Additional context
<Sorry, for privacy reasons I cannot upload the code or a screenshot of the error.>
Issue Analytics
- State:
- Created 2 years ago
- Comments:15
Top GitHub Comments
I found the reason! I was using units_to_segment from the enja.agent; I should have used yours:
Instead of git reset, it is better to use git checkout:
git checkout dd3bd3c0497ae9a7ae7364404a6b0a4c501780b3
and
git checkout main
to go back to the main branch.
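The checkout-and-return pattern above can be wrapped in a small shell function. This is a sketch that assumes the work starts on a named branch such as main and that the working tree is clean. run_at_commit is a hypothetical helper, not part of git or fairseq, and the commit hash in the usage comment is the one quoted above.

```shell
# Hypothetical helper: run a command at a given commit, then switch back
# to the branch that was checked out when the function was called.
run_at_commit() {
    commit="$1"
    shift
    # Remember the current branch name so we can return to it afterwards.
    start="$(git rev-parse --abbrev-ref HEAD)" || return 1
    git checkout --quiet "$commit" || return 1
    "$@"
    status=$?
    git checkout --quiet "$start"
    return "$status"
}

# Usage (run inside the fairseq clone):
# run_at_commit dd3bd3c0497ae9a7ae7364404a6b0a4c501780b3 git log -1 --oneline
```

Unlike git reset, this leaves the branch history untouched: the commit is only checked out temporarily, and the function returns the exit status of the command it ran.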