fp16 compatibility: StopIteration on multiple GPUs: Text Classification of MultiNLI Sentences using BERT

Hello! I hope you’re doing great. I ran into this issue while running the Jupyter notebook "Text Classification of MultiNLI Sentences using BERT".

Environment: On-Premises

Computer: MacBook Pro 16"
CPU: Intel i9 9980HK
RAM: 64GB
GPU: 2 x TITAN RTX, 24GB of memory each
GPU Enclosure: 2 x Razer Core X Chrome, Thunderbolt 3 (one connected on the left, one on the right)
Conda Version: conda 4.8.3
Python Version: Python 3.7.7

Packages Installed:

Name                    Version                   Build  Channel
_anaconda_depends         2020.02                  py37_0
_pytorch_select           1.1.0                       cpu
_r-mutex                  1.0.0               anacondar_1
_tflow_select             2.1.0                       gpu
absl-py                   0.9.0                    py37_0
alabaster                 0.7.12                   py37_0
anaconda                  custom                   py37_1
anaconda-client           1.7.2                    py37_0
anaconda-project          0.8.4                      py_0
argh                      0.26.2                   py37_0
asn1crypto                1.3.0                    py37_0
astor                     0.8.0                    py37_0
astroid                   2.3.3                    py37_0
astropy                   4.0.1.post1      py37he774522_1
atomicwrites              1.4.0                      py_0
attrs                     19.3.0                     py_0
autopep8                  1.4.4                      py_0
babel                     2.8.0                      py_0
backcall                  0.1.0                    py37_0
backports                 1.0                        py_2
backports.shutil_get_terminal_size 1.0.0                    py37_2
bcrypt                    3.1.7            py37he774522_0
beautifulsoup4            4.9.0                    py37_0
bitarray                  1.2.1            py37he774522_0
bkcharts                  0.2                      py37_0
blas                      1.0                         mkl
bleach                    3.1.4                      py_0
blinker                   1.4                      py37_0
blis                      0.4.1                    pypi_0    pypi
blosc                     1.16.3               h7bd577a_0
bokeh                     2.0.2                    py37_0
boto                      2.49.0                   py37_0
boto3                     1.13.13                  pypi_0    pypi
botocore                  1.16.13                  pypi_0    pypi
bottleneck                1.3.2            py37h2a96729_0
brotli                    1.0.7                    pypi_0    pypi
bzip2                     1.0.8                he774522_0
ca-certificates           2020.4.5.1           hecc5488_0    conda-forge
cached-property           1.5.1                    pypi_0    pypi
cachetools                3.1.1                      py_0
catalogue                 1.0.0                    pypi_0    pypi
certifi                   2020.4.5.1       py37hc8dfbb8_0    conda-forge
cffi                      1.14.0           py37h7a1dbc1_0
chardet                   3.0.4                 py37_1003
click                     7.1.2                      py_0
cloudpickle               1.4.1                      py_0
clyent                    1.2.2                    py37_1
colorama                  0.4.3                      py_0
comtypes                  1.1.7                    py37_0
console_shortcut          0.1.1                         4
contextlib2               0.6.0.post1                py_0
cryptography              2.9.2            py37h7a1dbc1_0
cssselect                 1.1.0                    pypi_0    pypi
cudatoolkit               10.1.243             h74a9793_0
cudnn                     7.6.5                cuda10.1_0
curl                      7.69.1               h2a8f88b_0
cycler                    0.10.0                   py37_0
cymem                     2.0.3                    pypi_0    pypi
cython                    0.29.17          py37ha925a31_0
cytoolz                   0.10.1           py37he774522_0
dash                      1.12.0                   pypi_0    pypi
dash-core-components      1.10.0                   pypi_0    pypi
dash-cytoscape            0.1.1                    pypi_0    pypi
dash-html-components      1.0.3                    pypi_0    pypi
dash-renderer             1.4.1                    pypi_0    pypi
dash-table                4.7.0                    pypi_0    pypi
dask                      2.16.0                     py_0
dask-core                 2.16.0                     py_0
decorator                 4.4.2                      py_0
defusedxml                0.6.0                      py_0
diff-match-patch          20181111                   py_0
dill                      0.3.1.1                  pypi_0    pypi
distributed               2.16.0                   py37_0
docutils                  0.15.2                   pypi_0    pypi
entrypoints               0.3                      py37_0
et_xmlfile                1.0.1                    py37_0
fastcache                 1.1.0            py37he774522_0
filelock                  3.0.12                     py_0
flake8                    3.7.9                    py37_0
flask                     1.1.2                    pypi_0    pypi
flask-compress            1.5.0                    pypi_0    pypi
freetype                  2.9.1                ha9979f8_1
fsspec                    0.7.1                      py_0
future                    0.18.2                   py37_0
gast                      0.2.2                    py37_0
get_terminal_size         1.0.0                h38e98db_0
gevent                    20.5.0                   pypi_0    pypi
glob2                     0.7                        py_0
google-auth               1.14.1                     py_0
google-auth-oauthlib      0.4.1                      py_2
google-pasta              0.2.0                      py_0
greenlet                  0.4.15           py37hfa6e2cd_0
grpcio                    1.27.2           py37h351948d_0
h5py                      2.10.0           py37h5e291fa_0
hdf5                      1.10.4               h7ebc959_0
heapdict                  1.0.1                      py_0
html5lib                  1.0.1                    py37_0
hypothesis                5.11.0                     py_0
icc_rt                    2019.0.0             h0cc432a_1
icu                       58.2                 ha925a31_3
idna                      2.9                        py_1
imagecodecs               2020.2.18                pypi_0    pypi
imageio                   2.8.0                      py_0
imagesize                 1.2.0                      py_0
importlib_metadata        1.5.0                    py37_0
intel-openmp              2020.1                      216
interpret                 0.1.22                   pypi_0    pypi
interpret-community       0.11.1                   pypi_0    pypi
interpret-core            0.1.21                   pypi_0    pypi
interpret-text            0.1.1                    pypi_0    pypi
intervaltree              3.0.2                      py_0
ipykernel                 5.1.4            py37h39e3cac_0
ipython                   7.13.0           py37h5ca1d4c_0
ipython_genutils          0.2.0                    py37_0
ipywidgets                7.5.1                      py_0
isort                     4.3.21                   py37_0
itsdangerous              1.1.0                    py37_0
jdcal                     1.4.1                      py_0
jedi                      0.15.2                   py37_0
jinja2                    2.11.2                     py_0
jmespath                  0.10.0                   pypi_0    pypi
joblib                    0.14.1                     py_0
jpeg                      9b                   hb83a4c4_2
json5                     0.9.4                      py_0
jsonschema                3.2.0                    py37_0
jupyter                   1.0.0                    py37_7
jupyter_client            6.1.3                      py_0
jupyter_console           6.1.0                      py_0
jupyter_contrib_core      0.3.3                      py_2    conda-forge
jupyter_contrib_nbextensions 0.5.1                    py37_0    conda-forge
jupyter_core              4.6.3                    py37_0
jupyter_highlight_selected_word 0.2.0                 py37_1000    conda-forge
jupyter_latex_envs        1.4.4                 py37_1000    conda-forge
jupyter_nbextensions_configurator 0.4.1                    py37_0    conda-forge
jupyterlab                1.2.6              pyhf63ae98_0
jupyterlab_server         1.1.1                      py_0
keras                     2.3.1                         0
keras-applications        1.0.8                      py_0
keras-base                2.3.1                    py37_0
keras-preprocessing       1.1.0                      py_1
keyring                   21.1.1                   py37_2
kiwisolver                1.2.0            py37h74a9793_0
krb5                      1.17.1               hc04afaa_0
lazy-object-proxy         1.4.3            py37he774522_0
libarchive                3.3.3                h0643e63_5
libcurl                   7.69.1               h2a8f88b_0
libiconv                  1.15                 h1df5818_7
liblief                   0.10.1               ha925a31_0
libpng                    1.6.37               h2a8f88b_0
libprotobuf               3.11.4               h7bd577a_0
libsodium                 1.0.16               h9d3ae62_0
libspatialindex           1.9.3                h33f27b4_0
libssh2                   1.9.0                h7a1dbc1_1
libtiff                   4.1.0                h56a325e_0
libxml2                   2.9.9                h464c3ec_0
libxslt                   1.1.33               h579f668_0
lime                      0.2.0.0                  pypi_0    pypi
llvmlite                  0.32.1           py37ha925a31_0
locket                    0.2.0                    py37_1
lxml                      4.5.1                    pypi_0    pypi
lz4-c                     1.8.1.2              h2fa13f4_0
lzo                       2.10                 he774522_2
m2w64-bwidget             1.9.10                        2
m2w64-bzip2               1.0.6                         6
m2w64-expat               2.1.1                         2
m2w64-fftw                3.3.4                         6
m2w64-flac                1.3.1                         3
m2w64-gcc-libgfortran     5.3.0                         6
m2w64-gcc-libs            5.3.0                         7
m2w64-gcc-libs-core       5.3.0                         7
m2w64-gettext             0.19.7                        2
m2w64-gmp                 6.1.0                         2
m2w64-gsl                 2.1                           2
m2w64-libiconv            1.14                          6
m2w64-libjpeg-turbo       1.4.2                         3
m2w64-libogg              1.3.2                         3
m2w64-libpng              1.6.21                        2
m2w64-libsndfile          1.0.26                        2
m2w64-libsodium           1.0.10                        2
m2w64-libtiff             4.0.6                         2
m2w64-libvorbis           1.3.5                         2
m2w64-libwinpthread-git   5.0.0.4634.697f757               2
m2w64-libxml2             2.9.3                         4
m2w64-mpfr                3.1.4                         4
m2w64-openblas            0.2.19                        1
m2w64-pcre                8.38                          2
m2w64-speex               1.2rc2                        3
m2w64-speexdsp            1.2rc3                        3
m2w64-tcl                 8.6.5                         3
m2w64-tk                  8.6.5                         3
m2w64-tktable             2.10                          5
m2w64-wineditline         2.101                         5
m2w64-xz                  5.2.2                         2
m2w64-zeromq              4.1.4                         2
m2w64-zlib                1.2.8                        10
markdown                  3.1.1                    py37_0
markupsafe                1.1.1            py37he774522_0
matplotlib                3.1.3                    py37_0
matplotlib-base           3.1.3            py37h64f37c6_0
mccabe                    0.6.1                    py37_1
menuinst                  1.4.16           py37he774522_0
mistune                   0.8.4            py37he774522_0
mkl                       2020.1                      216
mkl-service               2.3.0            py37hb782905_0
mkl_fft                   1.0.15           py37h14836fe_0
mkl_random                1.1.0            py37h675688f_0
mock                      4.0.2                      py_0
more-itertools            8.2.0                      py_0
mpmath                    1.1.0                    py37_0
msgpack-python            1.0.0            py37h74a9793_1
msys2-conda-epoch         20160418                      1
multipledispatch          0.6.0                    py37_0
murmurhash                1.0.2                    pypi_0    pypi
nbconvert                 5.6.1                    py37_0
nbformat                  5.0.6                      py_0
networkx                  2.4                        py_0
ninja                     1.9.0            py37h74a9793_0
nltk                      3.4.5                    py37_0
nose                      1.3.7                    py37_2
notebook                  6.0.3                    py37_0
numba                     0.49.1           py37h47e9c7a_0
numexpr                   2.7.1            py37h25d0782_0
numpy                     1.18.1           py37h93ca92e_0
numpy-base                1.18.1           py37hc3f5095_1
numpydoc                  0.9.2                      py_0
oauthlib                  3.1.0                      py_0
olefile                   0.46                     py37_0
openpyxl                  3.0.3                      py_0
openssl                   1.1.1g               he774522_0    conda-forge
opt_einsum                3.1.0                      py_0
packaging                 20.4                     pypi_0    pypi
pandas                    1.0.3            py37h47e9c7a_0
pandoc                    2.2.3.2                       0
pandocfilters             1.4.2                    py37_1
paramiko                  2.7.1                      py_0
parsel                    1.6.0                    pypi_0    pypi
parso                     0.5.2                      py_0
partd                     1.1.0                      py_0
path                      13.1.0                   py37_0
path.py                   12.4.0                        0
pathlib2                  2.3.5                    py37_0
pathtools                 0.1.2                      py_1
patsy                     0.5.1                    py37_0
pep8                      1.7.1                    py37_0
pexpect                   4.8.0                    py37_0
pickleshare               0.7.5                    py37_0
pillow                    5.4.1                    pypi_0    pypi
pip                       20.0.2                   py37_3
pkginfo                   1.5.0.1                  py37_0
plac                      1.1.3                    pypi_0    pypi
plotly                    4.7.1                    pypi_0    pypi
pluggy                    0.13.1                   py37_0
ply                       3.11                     py37_0
powershell_shortcut       0.0.1                         3
preshed                   3.0.2                    pypi_0    pypi
prometheus_client         0.7.1                      py_0
prompt-toolkit            3.0.4                      py_0
prompt_toolkit            3.0.4                         0
protobuf                  3.11.4           py37h33f27b4_0
psutil                    5.7.0            py37he774522_0
py                        1.8.1                      py_0
py-lief                   0.10.1           py37ha925a31_0
pyasn1                    0.4.8                      py_0
pyasn1-modules            0.2.7                      py_0
pycodestyle               2.5.0                    py37_0
pycosat                   0.6.3            py37he774522_0
pycparser                 2.20                       py_0
pycrypto                  2.6.1            py37hfa6e2cd_9
pycurl                    7.43.0.5         py37h7a1dbc1_0
pydantic                  1.5.1                    pypi_0    pypi
pydocstyle                4.0.1                      py_0
pyflakes                  2.1.1                    py37_0
pygments                  2.6.1                      py_0
pyjwt                     1.7.1                    py37_0
pylint                    2.4.4                    py37_0
pynacl                    1.3.0            py37h62dcd97_0
pyodbc                    4.0.30           py37ha925a31_0
pyopenssl                 19.1.0                   py37_0
pyparsing                 2.4.7                      py_0
pyqt                      5.9.2            py37h6538335_2
pyreadline                2.1                      py37_1
pyrsistent                0.16.0           py37he774522_0
pysocks                   1.7.1                    py37_0
pytables                  3.6.1            py37h1da0976_0
pytest                    5.4.2                    py37_0
pytest-arraydiff          0.3              py37h39e3cac_0
pytest-astropy            0.8.0                      py_0
pytest-astropy-header     0.1.2                      py_0
pytest-doctestplus        0.5.0                      py_0
pytest-openfiles          0.5.0                      py_0
pytest-remotedata         0.3.2                    py37_0
python                    3.7.7                h81c818b_4
python-dateutil           2.8.1                      py_0
python-jsonrpc-server     0.3.4                      py_0
python-language-server    0.31.10                  py37_0
python-libarchive-c       2.9                        py_0
python_abi                3.7                     1_cp37m    conda-forge
pytorch                   1.5.0           py3.7_cuda101_cudnn7_0    pytorch
pytorch-pretrained-bert   0.6.2                    pypi_0    pypi
pytz                      2020.1                     py_0
pywavelets                1.1.1            py37he774522_0
pywin32                   227              py37he774522_1
pywin32-ctypes            0.2.0                 py37_1000
pywinpty                  0.5.7                    py37_0
pyyaml                    5.3.1            py37he774522_0
pyzmq                     18.1.1           py37ha925a31_0
qdarkstyle                2.8.1                      py_0
qt                        5.9.7            vc14h73c81de_0
qtawesome                 0.7.0                      py_0
qtconsole                 4.7.4                      py_0
qtpy                      1.9.0                      py_0
r-askpass                 1.0                       r36_0
r-assertthat              0.2.1             r36h6115d3f_0
r-backports               1.1.4             r36h6115d3f_0
r-base                    3.6.1                hf18239d_1
r-base64enc               0.1_3             r36h6115d3f_4
r-bh                      1.69.0_1          r36h6115d3f_0
r-boot                    1.3_20            r36h6115d3f_0
r-broom                   0.5.2             r36h6115d3f_0
r-callr                   3.2.0             r36h6115d3f_0
r-caret                   6.0_83            r36h6115d3f_0
r-cellranger              1.1.0             r36h6115d3f_0
r-class                   7.3_15            r36h6115d3f_0
r-cli                     1.1.0             r36h6115d3f_0
r-clipr                   0.6.0             r36h6115d3f_0
r-cluster                 2.0.8             r36h6115d3f_0
r-codetools               0.2_16            r36h6115d3f_0
r-colorspace              1.4_1             r36h6115d3f_0
r-crayon                  1.3.4             r36h6115d3f_0
r-curl                    3.3               r36h6115d3f_0
r-data.table              1.12.2            r36h6115d3f_0
r-dbi                     1.0.0             r36h6115d3f_0
r-dbplyr                  1.4.0             r36h6115d3f_0
r-dichromat               2.0_0             r36h6115d3f_4
r-digest                  0.6.18            r36h6115d3f_0
r-dplyr                   0.8.0.1           r36h6115d3f_0
r-ellipsis                0.1.0             r36h6115d3f_0
r-essentials              3.6.0                     r36_0
r-evaluate                0.13              r36h6115d3f_0
r-fansi                   0.4.0             r36h6115d3f_0
r-forcats                 0.4.0             r36h6115d3f_0
r-foreach                 1.4.4             r36h6115d3f_0
r-foreign                 0.8_71            r36h6115d3f_0
r-formatr                 1.6               r36h6115d3f_0
r-fs                      1.2.7             r36h6115d3f_0
r-generics                0.0.2             r36h6115d3f_0
r-ggplot2                 3.1.1             r36h6115d3f_0
r-glmnet                  2.0_16            r36h6115d3f_0
r-glue                    1.3.1             r36h6115d3f_0
r-gower                   0.2.0             r36h6115d3f_0
r-gtable                  0.3.0             r36h6115d3f_0
r-haven                   2.1.0             r36h6115d3f_0
r-hexbin                  1.27.2            r36h6115d3f_0
r-highr                   0.8               r36h6115d3f_0
r-hms                     0.4.2             r36h6115d3f_0
r-htmltools               0.3.6             r36h6115d3f_0
r-htmlwidgets             1.3               r36h6115d3f_0
r-httpuv                  1.5.1             r36h6115d3f_0
r-httr                    1.4.0             r36h6115d3f_0
r-ipred                   0.9_8             r36h6115d3f_0
r-irdisplay               0.7.0             r36h6115d3f_0
r-irkernel                0.8.15                    r36_0
r-iterators               1.0.10            r36h6115d3f_0
r-jsonlite                1.6               r36h6115d3f_0
r-kernsmooth              2.23_15           r36h6115d3f_4
r-knitr                   1.22              r36h6115d3f_0
r-labeling                0.3               r36h6115d3f_4
r-later                   0.8.0             r36h6115d3f_0
r-lattice                 0.20_38           r36h6115d3f_0
r-lava                    1.6.5             r36h6115d3f_0
r-lazyeval                0.2.2             r36h6115d3f_0
r-lubridate               1.7.4             r36h6115d3f_0
r-magrittr                1.5               r36h6115d3f_4
r-maps                    3.3.0             r36h6115d3f_0
r-markdown                0.9               r36h6115d3f_0
r-mass                    7.3_51.3          r36h6115d3f_0
r-matrix                  1.2_17            r36h6115d3f_0
r-mgcv                    1.8_28            r36h6115d3f_0
r-mime                    0.6               r36h6115d3f_0
r-modelmetrics            1.2.2             r36h6115d3f_0
r-modelr                  0.1.4             r36h6115d3f_0
r-munsell                 0.5.0             r36h6115d3f_0
r-nlme                    3.1_139           r36h6115d3f_0
r-nnet                    7.3_12            r36h6115d3f_0
r-numderiv                2016.8_1          r36h6115d3f_0
r-openssl                 1.3               r36h6115d3f_0
r-pbdzmq                  0.3_3             r36h6115d3f_0
r-pillar                  1.3.1             r36h6115d3f_0
r-pkgconfig               2.0.2             r36h6115d3f_0
r-plogr                   0.2.0             r36h6115d3f_0
r-plyr                    1.8.4             r36h6115d3f_0
r-prettyunits             1.0.2             r36h6115d3f_0
r-processx                3.3.0             r36h6115d3f_0
r-prodlim                 2018.04.18        r36h6115d3f_0
r-progress                1.2.0             r36h6115d3f_0
r-promises                1.0.1             r36h6115d3f_0
r-ps                      1.3.0             r36h6115d3f_0
r-purrr                   0.3.2             r36h6115d3f_0
r-quantmod                0.4_14            r36h6115d3f_0
r-r6                      2.4.0             r36h6115d3f_0
r-randomforest            4.6_14            r36h6115d3f_0
r-rbokeh                  0.6.3                     r36_0
r-rcolorbrewer            1.1_2             r36h6115d3f_0
r-rcpp                    1.0.1             r36h6115d3f_0
r-rcpproll                0.3.0             r36h6115d3f_0
r-readr                   1.3.1             r36h6115d3f_0
r-readxl                  1.3.1             r36h6115d3f_0
r-recipes                 0.1.5             r36h6115d3f_0
r-recommended             3.6.0                     r36_0
r-rematch                 1.0.1             r36h6115d3f_0
r-repr                    0.19.2            r36h6115d3f_0
r-reprex                  0.2.1             r36h6115d3f_0
r-reshape2                1.4.3             r36h6115d3f_0
r-rlang                   0.3.4             r36h6115d3f_0
r-rmarkdown               1.12              r36h6115d3f_0
r-rpart                   4.1_15            r36h6115d3f_0
r-rstudioapi              0.10              r36h6115d3f_0
r-rvest                   0.3.3             r36h6115d3f_0
r-scales                  1.0.0             r36h6115d3f_0
r-selectr                 0.4_1             r36h6115d3f_0
r-shiny                   1.3.2             r36h6115d3f_0
r-sourcetools             0.1.7             r36h6115d3f_0
r-spatial                 7.3_11            r36h6115d3f_4
r-squarem                 2017.10_1         r36h6115d3f_0
r-stringi                 1.4.3             r36h6115d3f_0
r-stringr                 1.4.0             r36h6115d3f_0
r-survival                2.44_1.1          r36h6115d3f_0
r-sys                     3.2               r36h6115d3f_0
r-tibble                  2.1.1             r36h6115d3f_0
r-tidyr                   0.8.3             r36h6115d3f_0
r-tidyselect              0.2.5             r36h6115d3f_0
r-tidyverse               1.2.1             r36h6115d3f_0
r-timedate                3043.102          r36h6115d3f_0
r-tinytex                 0.12              r36h6115d3f_0
r-ttr                     0.23_4            r36h6115d3f_0
r-utf8                    1.1.4             r36h6115d3f_0
r-uuid                    0.1_2             r36h6115d3f_4
r-viridislite             0.3.0             r36h6115d3f_0
r-whisker                 0.3_2             r36h6115d3f_4
r-withr                   2.1.2             r36h6115d3f_0
r-xfun                    0.6               r36h6115d3f_0
r-xml2                    1.2.0             r36h6115d3f_0
r-xtable                  1.8_4             r36h6115d3f_0
r-xts                     0.11_2            r36h6115d3f_0
r-yaml                    2.2.0             r36h6115d3f_0
r-zoo                     1.8_5             r36h6115d3f_0
regex                     2020.5.14                pypi_0    pypi
requests                  2.23.0                   py37_0
requests-oauthlib         1.3.0                      py_0
retrying                  1.3.3                    pypi_0    pypi
rope                      0.17.0                     py_0
rsa                       4.0                        py_0
rtree                     0.9.4            py37h21ff451_1
ruamel_yaml               0.15.87          py37he774522_0
s3transfer                0.3.3                    pypi_0    pypi
sacremoses                0.0.43                   pypi_0    pypi
salib                     1.3.11                   pypi_0    pypi
scikit-image              0.17.2                   pypi_0    pypi
scikit-learn              0.22.1           py37h6288b17_0
scipy                     1.4.1            py37h9439919_0
scrapbook                 0.2.0                    pypi_0    pypi
seaborn                   0.10.1                     py_0
send2trash                1.5.0                    py37_0
sentencepiece             0.1.90                   pypi_0    pypi
setuptools                46.4.0                   py37_0
shap                      0.29.3                   pypi_0    pypi
simplegeneric             0.8.1                    py37_2
singledispatch            3.4.0.3                  py37_0
sip                       4.19.8           py37h6538335_0
six                       1.14.0                   py37_0
snappy                    1.1.7                h777316e_3
snowballstemmer           2.0.0                      py_0
sortedcollections         1.1.2                    py37_0
sortedcontainers          2.1.0                    py37_0
soupsieve                 2.0                        py_0
spacy                     2.2.4                    pypi_0    pypi
sphinx                    3.0.3                      py_0
sphinxcontrib             1.0                      py37_1
sphinxcontrib-applehelp   1.0.2                      py_0
sphinxcontrib-devhelp     1.0.2                      py_0
sphinxcontrib-htmlhelp    1.0.3                      py_0
sphinxcontrib-jsmath      1.0.1                      py_0
sphinxcontrib-qthelp      1.0.3                      py_0
sphinxcontrib-serializinghtml 1.1.4                      py_0
sphinxcontrib-websupport  1.2.1                      py_0
spyder                    4.1.3                    py37_0
spyder-kernels            1.9.1                    py37_0
sqlalchemy                1.3.16           py37he774522_0
sqlite                    3.31.1               h2a8f88b_1
srsly                     1.0.2                    pypi_0    pypi
statsmodels               0.11.0           py37he774522_0
sympy                     1.5.1                    py37_0
tbb                       2020.0               h74a9793_0
tblib                     1.6.0                      py_0
tensorboard               2.1.0                     py3_0
tensorflow                2.1.0           gpu_py37h7db9008_0
tensorflow-base           2.1.0           gpu_py37h55f5790_0
tensorflow-estimator      2.1.0              pyhd54b08b_0
tensorflow-gpu            2.1.0                h0d30ee6_0
termcolor                 1.1.0                    py37_1
terminado                 0.8.3                    py37_0
testpath                  0.4.4                      py_0
thinc                     7.4.0                    pypi_0    pypi
tifffile                  2020.5.11                pypi_0    pypi
tk                        8.6.8                hfa6e2cd_0
tokenizers                0.0.11                   pypi_0    pypi
toolz                     0.10.0                     py_0
torchvision               0.6.0                py37_cu101    pytorch
tornado                   6.0.4            py37he774522_1
tqdm                      4.46.0                     py_0
traitlets                 4.3.3                    py37_0
transformers              2.4.1                    pypi_0    pypi
treeinterpreter           0.2.2                    pypi_0    pypi
typing_extensions         3.7.4.1                  py37_0
ujson                     1.35             py37hfa6e2cd_0
unicodecsv                0.14.1                   py37_0
urllib3                   1.25.8                   py37_0
vc                        14.1                 h0510ff6_4
vs2015_runtime            14.16.27012          hf0eaf9b_1
w3lib                     1.22.0                   pypi_0    pypi
wasabi                    0.6.0                    pypi_0    pypi
watchdog                  0.10.2                   py37_0
wcwidth                   0.1.9                      py_0
webencodings              0.5.1                    py37_1
werkzeug                  1.0.1                    pypi_0    pypi
wheel                     0.34.2                   py37_0
widgetsnbextension        3.5.1                    py37_0
win_inet_pton             1.1.0                    py37_0
win_unicode_console       0.5                      py37_0
wincertstore              0.2                      py37_0
winpty                    0.4.3                         4
wrapt                     1.12.1           py37he774522_1
xgboost                   1.1.0                    pypi_0    pypi
xlrd                      1.2.0                    py37_0
xlsxwriter                1.2.8                      py_0
xlwings                   0.19.0                   py37_0
xlwt                      1.3.0                    py37_0
xz                        5.2.5                h62dcd97_0
yaml                      0.1.7                hc54c509_2
yapf                      0.28.0                     py_0
zeromq                    4.3.1                h33f27b4_3
zict                      2.0.0                      py_0
zipp                      3.1.0                      py_0
zlib                      1.2.11               h62dcd97_4
zstd                      1.3.7                h508b16e_0

When running this line:

with Timer() as t:
    classifier.fit(token_ids=tokens_train,
                   input_mask=mask_train,
                   labels=labels_train,
                   num_epochs=NUM_EPOCHS,
                   batch_size=BATCH_SIZE,
                   verbose=True)
print("[Training time: {:.3f} hrs]".format(t.interval / 3600))

I got the following stack trace:

t_total value of -1 results in schedule not being applied
Iteration:   0%|          | 0/79 [00:00<?, ?it/s]

---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-34-18e84990dbbe> in <module>
      5                     num_epochs=NUM_EPOCHS,
      6                     batch_size=BATCH_SIZE,
----> 7                     verbose=True)
      8 
      9 print("[Training time: {:.3f} hrs]".format(t.interval / 3600))

C:\ProgramData\Anaconda3\envs\EmilioDL\lib\site-packages\interpret_text\experimental\common\utils_bert.py in fit(self, token_ids, input_mask, labels, token_type_ids, num_gpus, num_epochs, batch_size, lr, warmup_proportion, verbose)
    550                     token_type_ids=token_type_ids_batch,
    551                     attention_mask=mask_batch,
--> 552                     labels=None,
    553                 )
    554                 loss = loss_func(y_h, y_batch).mean()

C:\ProgramData\Anaconda3\envs\EmilioDL\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    548             result = self._slow_forward(*input, **kwargs)
    549         else:
--> 550             result = self.forward(*input, **kwargs)
    551         for hook in self._forward_hooks.values():
    552             hook_result = hook(self, input, result)

C:\ProgramData\Anaconda3\envs\EmilioDL\lib\site-packages\torch\nn\parallel\data_parallel.py in forward(self, *inputs, **kwargs)
    153             return self.module(*inputs[0], **kwargs[0])
    154         replicas = self.replicate(self.module, self.device_ids[:len(inputs)])
--> 155         outputs = self.parallel_apply(replicas, inputs, kwargs)
    156         return self.gather(outputs, self.output_device)
    157 

C:\ProgramData\Anaconda3\envs\EmilioDL\lib\site-packages\torch\nn\parallel\data_parallel.py in parallel_apply(self, replicas, inputs, kwargs)
    163 
    164     def parallel_apply(self, replicas, inputs, kwargs):
--> 165         return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
    166 
    167     def gather(self, outputs, output_device):

C:\ProgramData\Anaconda3\envs\EmilioDL\lib\site-packages\torch\nn\parallel\parallel_apply.py in parallel_apply(modules, inputs, kwargs_tup, devices)
     83         output = results[i]
     84         if isinstance(output, ExceptionWrapper):
---> 85             output.reraise()
     86         outputs.append(output)
     87     return outputs

C:\ProgramData\Anaconda3\envs\EmilioDL\lib\site-packages\torch\_utils.py in reraise(self)
    393             # (https://bugs.python.org/issue2651), so we work around it.
    394             msg = KeyErrorMessage(msg)
--> 395         raise self.exc_type(msg)

StopIteration: Caught StopIteration in replica 0 on device 0.
Original Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\envs\EmilioDL\lib\site-packages\torch\nn\parallel\parallel_apply.py", line 60, in _worker
    output = module(*input, **kwargs)
  File "C:\ProgramData\Anaconda3\envs\EmilioDL\lib\site-packages\torch\nn\modules\module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\ProgramData\Anaconda3\envs\EmilioDL\lib\site-packages\pytorch_pretrained_bert\modeling.py", line 989, in forward
    _, pooled_output = self.bert(input_ids, token_type_ids, attention_mask, output_all_encoded_layers=False)
  File "C:\ProgramData\Anaconda3\envs\EmilioDL\lib\site-packages\torch\nn\modules\module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\ProgramData\Anaconda3\envs\EmilioDL\lib\site-packages\pytorch_pretrained_bert\modeling.py", line 727, in forward
    extended_attention_mask = extended_attention_mask.to(dtype=next(self.parameters()).dtype) # fp16 compatibility
StopIteration

With one GPU the code runs flawlessly, but with two GPUs it fails with the error above.
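The innermost frame of the traceback calls next(self.parameters()), and next() raises StopIteration whenever the iterator it receives yields nothing. The sketch below reproduces only that mechanism in plain Python. ReplicaLike and first_param_or_none are hypothetical names invented for illustration; they are not part of torch or pytorch_pretrained_bert, and this is not a patch for either library.

```python
class ReplicaLike:
    """Hypothetical stand-in for a module whose parameters() yields nothing."""

    def __init__(self, params):
        self._params = list(params)

    def parameters(self):
        # Returns a fresh iterator on every call, like the API in the traceback.
        return iter(self._params)


def first_param_or_none(module):
    # Passing a default to next() returns None instead of raising StopIteration.
    return next(module.parameters(), None)


empty = ReplicaLike([])

try:
    next(empty.parameters())  # same call shape as the failing line
    raised = False
except StopIteration:
    raised = True

print(raised)
print(first_param_or_none(empty))
print(first_param_or_none(ReplicaLike(["w"])))
```

Whether the replicas in this setup really expose an empty parameters() iterator is an assumption based on the traceback; the sketch only shows why such a case would surface as StopIteration.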

Please let me know if you need additional information.

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 5

Top GitHub Comments

1 reaction
janhavi13 commented, Jun 19, 2020

Hi @EVMartinez ,

Apologies for the delayed response. I haven’t had a chance to reproduce the error, but I did a quick search in the Hugging Face repo and it looks like it is an issue with the PyTorch version (1.5): https://github.com/huggingface/transformers/issues/4189

There are 2 options:

  1. Downgrade PyTorch, as suggested in the Hugging Face GitHub issue, or
  2. Install PyTorch from source with pip
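Before choosing between these options, it may help to confirm that the environment is actually on a 1.5.x release. The package list shows torchvision 0.6.0, which was released alongside PyTorch 1.5.0, so that seems likely here. The helper below is a minimal sketch; is_pytorch_1_5 is a hypothetical name, and it only parses a version string, so it does not require torch to be importable.

```python
import re


def is_pytorch_1_5(version):
    """Return True if a version string such as '1.5.0+cu101' is a 1.5.x release."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        # Unparseable strings are treated as "not 1.5.x".
        return False
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) == (1, 5)


# Typical use inside the notebook (assumes torch is installed):
#   import torch
#   print(is_pytorch_1_5(torch.__version__))
print(is_pytorch_1_5("1.5.0+cu101"))
print(is_pytorch_1_5("1.4.0"))
```

If the check reports a 1.5.x release, the linked Hugging Face issue is the place to look for which versions are reported to work.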

Note: Feel free to open new issues related to the nlp_recipes GitHub repo in that repo!

I will also close this issue soon, as it is not related to the explainers themselves.

Janhavi

0 reactions
EVMartinez commented, Jun 3, 2020

Thank you for your suggestion, @janhavi13. I tried the sample notebook. When executing step 12, it started to use both GPUs, then suddenly stopped, and I got the following stack trace:

Downloading: 100% 481/481 [00:00<00:00, 808B/s]
Downloading: 100% 899k/899k [00:02<00:00, 348kB/s]
Downloading: 100% 456k/456k [00:01<00:00, 298kB/s]
Downloading: 100% 501M/501M [00:27<00:00, 18.5MB/s]

---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-13-ace30a6501e8> in <module>
     28     with Timer() as t:
     29         classifier.fit(
---> 30             train_dataloader, num_epochs=NUM_EPOCHS, num_gpus=NUM_GPUS, verbose=False,
     31         )
     32     train_time = t.interval / 3600

c:\users\neo\src\utils-nlp\utils_nlp\models\transformers\sequence_classification.py in fit(self, train_dataloader, num_epochs, max_steps, gradient_accumulation_steps, num_gpus, gpu_ids, local_rank, weight_decay, learning_rate, adam_epsilon, warmup_steps, fp16, fp16_opt_level, checkpoint_state_dict, verbose, seed)
    331             local_rank=local_rank,
    332             verbose=verbose,
--> 333             seed=seed,
    334         )
    335 

c:\users\neo\src\utils-nlp\utils_nlp\models\transformers\common.py in fine_tune(self, train_dataloader, get_inputs, device, num_gpus, max_steps, global_step, max_grad_norm, gradient_accumulation_steps, optimizer, scheduler, fp16, amp, local_rank, verbose, seed, report_every, save_every, clip_grad_norm, validation_function)
    193             for step, batch in enumerate(epoch_iterator):
    194                 inputs = get_inputs(batch, device, self.model_name)
--> 195                 outputs = self.model(**inputs)
    196 
    197                 if isinstance(outputs, tuple):

C:\ProgramData\Anaconda3\envs\EmilioDL\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    548             result = self._slow_forward(*input, **kwargs)
    549         else:
--> 550             result = self.forward(*input, **kwargs)
    551         for hook in self._forward_hooks.values():
    552             hook_result = hook(self, input, result)

C:\ProgramData\Anaconda3\envs\EmilioDL\lib\site-packages\torch\nn\parallel\data_parallel.py in forward(self, *inputs, **kwargs)
    153             return self.module(*inputs[0], **kwargs[0])
    154         replicas = self.replicate(self.module, self.device_ids[:len(inputs)])
--> 155         outputs = self.parallel_apply(replicas, inputs, kwargs)
    156         return self.gather(outputs, self.output_device)
    157 

C:\ProgramData\Anaconda3\envs\EmilioDL\lib\site-packages\torch\nn\parallel\data_parallel.py in parallel_apply(self, replicas, inputs, kwargs)
    163 
    164     def parallel_apply(self, replicas, inputs, kwargs):
--> 165         return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
    166 
    167     def gather(self, outputs, output_device):

C:\ProgramData\Anaconda3\envs\EmilioDL\lib\site-packages\torch\nn\parallel\parallel_apply.py in parallel_apply(modules, inputs, kwargs_tup, devices)
     83         output = results[i]
     84         if isinstance(output, ExceptionWrapper):
---> 85             output.reraise()
     86         outputs.append(output)
     87     return outputs

C:\ProgramData\Anaconda3\envs\EmilioDL\lib\site-packages\torch\_utils.py in reraise(self)
    393             # (https://bugs.python.org/issue2651), so we work around it.
    394             msg = KeyErrorMessage(msg)
--> 395         raise self.exc_type(msg)

StopIteration: Caught StopIteration in replica 0 on device 0.
Original Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\envs\EmilioDL\lib\site-packages\torch\nn\parallel\parallel_apply.py", line 60, in _worker
    output = module(*input, **kwargs)
  File "C:\ProgramData\Anaconda3\envs\EmilioDL\lib\site-packages\torch\nn\modules\module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\Neo\AppData\Roaming\Python\Python37\site-packages\transformers\modeling_roberta.py", line 344, in forward
    inputs_embeds=inputs_embeds,
  File "C:\ProgramData\Anaconda3\envs\EmilioDL\lib\site-packages\torch\nn\modules\module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\Neo\AppData\Roaming\Python\Python37\site-packages\transformers\modeling_bert.py", line 707, in forward
    attention_mask, input_shape, self.device
  File "C:\Users\Neo\AppData\Roaming\Python\Python37\site-packages\transformers\modeling_utils.py", line 113, in device
    return next(self.parameters()).device
StopIteration