mnist nas demo failed (pytorch)

Describe the issue:

I ran the MNIST NAS example, but it failed. NNI cannot use the GPU, although the example seems to run fine on the CPU. (PyTorch itself can use the GPU.) [Screenshot 2022-03-15 22 42 54]

Here is the output:

/home/qsy/anaconda3/envs/nni/bin/python /home/qsy/桌面/workspace/nas/try1.py
[2022-03-15 22:39:42] INFO (root/MainThread) CUDA available: True
[2022-03-15 22:39:42] INFO (nni.experiment/MainThread) Creating experiment, Experiment ID: pym3cfno
[2022-03-15 22:39:42] INFO (nni.experiment/MainThread) Connecting IPC pipe...
[2022-03-15 22:39:43] INFO (nni.experiment/MainThread) Starting web server...
[2022-03-15 22:39:44] INFO (nni.experiment/MainThread) Setting up...
[2022-03-15 22:39:44] INFO (nni.runtime.msg_dispatcher_base/Thread-3) Dispatcher started
[2022-03-15 22:39:44] INFO (nni.retiarii.experiment.pytorch/MainThread) Web UI URLs: http://127.0.0.1:8167 http://192.168.31.136:8167 http://172.17.0.1:8167
[2022-03-15 22:39:44] INFO (nni.retiarii.experiment.pytorch/MainThread) Start strategy...
[2022-03-15 22:39:44] INFO (root/MainThread) Successfully update searchSpace.
[2022-03-15 22:39:44] INFO (nni.retiarii.strategy.bruteforce/MainThread) Random search running in fixed size mode. Dedup: on.
[2022-03-15 22:42:34] INFO (nni.retiarii.experiment.pytorch/Thread-4) Stopping experiment, please wait...
[2022-03-15 22:42:34] INFO (nni.retiarii.experiment.pytorch/MainThread) Strategy exit
[2022-03-15 22:42:34] INFO (nni.retiarii.experiment.pytorch/MainThread) Waiting for experiment to become DONE (you can ctrl+c if there is no running trial jobs)...
[2022-03-15 22:42:35] INFO (nni.runtime.msg_dispatcher_base/Thread-3) Dispatcher exiting...
[2022-03-15 22:42:35] INFO (nni.retiarii.experiment.pytorch/Thread-4) Experiment stopped
Final model:
[2022-03-15 22:42:37] INFO (nni.runtime.msg_dispatcher_base/Thread-3) Dispatcher terminiated

Environment:

  • NNI version: 2.6.1
  • Training service (local|remote|pai|aml|etc):
  • Client OS:
  • Server OS (for remote mode only): Ubuntu 20.04
  • Python version: 3.9.7
  • PyTorch/TensorFlow version: PyTorch 1.11.0
  • Is conda/virtualenv/venv used?: conda
  • Is running in Docker?: no
_libgcc_mutex             0.1                        main    defaults
_openmp_mutex             4.5                       1_gnu    defaults
astor                     0.8.1                    pypi_0    pypi
asttokens                 2.0.5              pyhd8ed1ab_0    conda-forge
autopep8                  1.6.0              pyhd3eb1b0_0    defaults
backcall                  0.2.0              pyh9f0ad1d_0    conda-forge
backports                 1.0                        py_2    conda-forge
backports.functools_lru_cache 1.6.4              pyhd8ed1ab_0    conda-forge
blas                      1.0                         mkl    defaults
brotlipy                  0.7.0           py39h27cfd23_1003    defaults
bzip2                     1.0.8                h7b6447c_0    defaults
ca-certificates           2021.10.8            ha878542_0    conda-forge
certifi                   2021.10.8        py39h06a4308_2    defaults
cffi                      1.15.0           py39hd667e15_1    defaults
charset-normalizer        2.0.4              pyhd3eb1b0_0    defaults
cloudpickle               2.0.0                    pypi_0    pypi
colorama                  0.4.4                    pypi_0    pypi
contextlib2               21.6.0                   pypi_0    pypi
cryptography              36.0.0           py39h9ce1e76_0    defaults
cudatoolkit               11.3.1               h2bc3f7f_2    defaults
debugpy                   1.5.1            py39h295c915_0    defaults
decorator                 5.1.1              pyhd8ed1ab_0    conda-forge
dill                      0.3.4                    pypi_0    pypi
entrypoints               0.4                pyhd8ed1ab_0    conda-forge
executing                 0.8.3              pyhd8ed1ab_0    conda-forge
ffmpeg                    4.3                  hf484d3e_0    pytorch
filelock                  3.6.0                    pypi_0    pypi
freetype                  2.11.0               h70c0345_0    defaults
future                    0.18.2                   pypi_0    pypi
giflib                    5.2.1                h7b6447c_0    defaults
gmp                       6.2.1                h2531618_2    defaults
gnutls                    3.6.15               he1e5248_0    defaults
hyperopt                  0.1.2                    pypi_0    pypi
idna                      3.3                pyhd3eb1b0_0    defaults
intel-openmp              2021.4.0          h06a4308_3561    defaults
ipykernel                 6.9.2            py39hef51801_0    conda-forge
ipython                   8.1.1            py39hf3d152e_0    conda-forge
jedi                      0.18.1           py39hf3d152e_0    conda-forge
joblib                    1.1.0                    pypi_0    pypi
jpeg                      9d                   h7f8727e_0    defaults
json-tricks               3.15.5                   pypi_0    pypi
jupyter_client            7.1.2              pyhd8ed1ab_0    conda-forge
jupyter_core              4.9.2            py39hf3d152e_0    conda-forge
lame                      3.100                h7b6447c_0    defaults
lcms2                     2.12                 h3be6417_0    defaults
ld_impl_linux-64          2.35.1               h7274673_9    defaults
libffi                    3.3                  he6710b0_2    defaults
libgcc-ng                 9.3.0               h5101ec6_17    defaults
libgomp                   9.3.0               h5101ec6_17    defaults
libiconv                  1.15                 h63c8f33_5    defaults
libidn2                   2.3.2                h7f8727e_0    defaults
libpng                    1.6.37               hbc83047_0    defaults
libsodium                 1.0.18               h36c2ea0_1    conda-forge
libstdcxx-ng              9.3.0               hd4cf53a_17    defaults
libtasn1                  4.16.0               h27cfd23_0    defaults
libtiff                   4.2.0                h85742a9_0    defaults
libunistring              0.9.10               h27cfd23_0    defaults
libuv                     1.40.0               h7b6447c_0    defaults
libwebp                   1.2.2                h55f646e_0    defaults
libwebp-base              1.2.2                h7f8727e_0    defaults
lz4-c                     1.9.3                h295c915_1    defaults
matplotlib-inline         0.1.3              pyhd8ed1ab_0    conda-forge
mkl                       2021.4.0           h06a4308_640    defaults
mkl-service               2.4.0            py39h7f8727e_0    defaults
mkl_fft                   1.3.1            py39hd3c417c_0    defaults
mkl_random                1.2.2            py39h51133e4_0    defaults
ncurses                   6.3                  h7f8727e_2    defaults
nest-asyncio              1.5.4              pyhd8ed1ab_0    conda-forge
nettle                    3.7.3                hbbd107a_1    defaults
networkx                  2.7.1                    pypi_0    pypi
nni                       2.6.1                    pypi_0    pypi
numpy                     1.21.2           py39h20f2e39_0    defaults
numpy-base                1.21.2           py39h79a1101_0    defaults
openh264                  2.1.1                h4ff587b_0    defaults
openssl                   1.1.1m               h7f8727e_0    defaults
pandas                    1.4.1                    pypi_0    pypi
parso                     0.8.3              pyhd8ed1ab_0    conda-forge
pexpect                   4.8.0              pyh9f0ad1d_2    conda-forge
pickleshare               0.7.5                   py_1003    conda-forge
pillow                    9.0.1            py39h22f2fdc_0    defaults
pip                       21.2.4           py39h06a4308_0    defaults
prettytable               3.2.0                    pypi_0    pypi
prompt-toolkit            3.0.27             pyha770c72_0    conda-forge
psutil                    5.9.0                    pypi_0    pypi
ptyprocess                0.7.0              pyhd3deb0d_0    conda-forge
pure_eval                 0.2.2              pyhd8ed1ab_0    conda-forge
pycodestyle               2.8.0              pyhd3eb1b0_0    defaults
pycparser                 2.21               pyhd3eb1b0_0    defaults
pygments                  2.11.2             pyhd8ed1ab_0    conda-forge
pymongo                   4.0.2                    pypi_0    pypi
pyopenssl                 22.0.0             pyhd3eb1b0_0    defaults
pysocks                   1.7.1            py39h06a4308_0    defaults
python                    3.9.7                h12debd9_1    defaults
python-dateutil           2.8.2              pyhd8ed1ab_0    conda-forge
python_abi                3.9                      2_cp39    conda-forge
pythonwebhdfs             0.2.3                    pypi_0    pypi
pytorch                   1.11.0          py3.9_cuda11.3_cudnn8.2.0_0    pytorch
pytorch-mutex             1.0                        cuda    pytorch
pytz                      2021.3                   pypi_0    pypi
pyyaml                    6.0                      pypi_0    pypi
pyzmq                     19.0.2           py39hb69f2a1_2    conda-forge
readline                  8.1.2                h7f8727e_1    defaults
requests                  2.27.1             pyhd3eb1b0_0    defaults
responses                 0.19.0                   pypi_0    pypi
schema                    0.7.5                    pypi_0    pypi
scikit-learn              1.0.2                    pypi_0    pypi
scipy                     1.8.0                    pypi_0    pypi
setuptools                58.0.4           py39h06a4308_0    defaults
simplejson                3.17.6                   pypi_0    pypi
six                       1.16.0             pyhd3eb1b0_1    defaults
sqlite                    3.38.0               hc218d9a_0    defaults
stack_data                0.2.0              pyhd8ed1ab_0    conda-forge
threadpoolctl             3.1.0                    pypi_0    pypi
tk                        8.6.11               h1ccaba5_0    defaults
toml                      0.10.2             pyhd3eb1b0_0    defaults
torchaudio                0.11.0               py39_cu113    pytorch
torchvision               0.12.0               py39_cu113    pytorch
tornado                   6.1              py39h3811e60_1    conda-forge
tqdm                      4.63.0                   pypi_0    pypi
traitlets                 5.1.1              pyhd8ed1ab_0    conda-forge
typeguard                 2.13.3                   pypi_0    pypi
typing_extensions         3.10.0.2           pyh06a4308_0    defaults
tzdata                    2021e                hda174b7_0    defaults
urllib3                   1.26.8             pyhd3eb1b0_0    defaults
wcwidth                   0.2.5              pyh9f0ad1d_2    conda-forge
websockets                10.2                     pypi_0    pypi
wheel                     0.37.1             pyhd3eb1b0_0    defaults
xz                        5.2.5                h7b6447c_0    defaults
zeromq                    4.3.4                h9c3ff4c_0    conda-forge
zlib                      1.2.11               h7f8727e_4    defaults
zstd                      1.4.9                haebb681_0    defaults

Configuration:

  • Experiment config (remember to remove secrets!):
  • Search space:

Log message:

  • nnimanager.log:
 sequenceId: 17,
 hyperParameters: {
   value: '{"parameter_id": 18, "parameters": {"class": {"__nni_type__": "bytes:gAWV+g4AAAAAAACMF2Nsb3VkcGlja2xlLmNsb3VkcGlja2xllIwUX21ha2Vfc2tlbGV0b25fY2xhc3OUk5QojANhYmOUjAdBQkNNZXRhlJOUjANOZXSUaAIoaAVoBowVbm5pLmNvbW1vbi5zZXJpYWxpemVylIwSU2VyaWFsaXphYmxlT2JqZWN0lJOUaAIojAhidWlsdGluc5SMBHR5cGWUk5RoBowXdG9yY2gubm4ubW9kdWxlcy5tb2R1bGWUjAZNb2R1bGWUk5SFlH2UjCBmNDVmZWM1MGIxZWM0MThjOTE4ZjNiMGYwZDI4NzE4YZROdJRSlIwcY2xvdWRwaWNrbGUuY2xvdWRwaWNrbGVfZmFzdJSMD19jbGFzc19zZXRzdGF0ZZSTlGgUfZQojApfX21vZHVsZV9flIwIX19tYWluX1+UjAhfX2luaXRfX5RoAIwNX2J1aWx0aW5fdHlwZZSTlIwKTGFtYmRhVHlwZZSFlFKUKGgdjAhDb2RlVHlwZZSFlFKUKEsBSwBLAEsCSwhLA0OOdACDAKABoQABAHQCoANkAWQCZANkAaEEfABfBHQCoAV0AqADZAJkBGQDZAGhBHQGZAJkBIMCZwKhAXwAXwd0AqAIdAKgCWcAZAWiAaEBoQF8AF8KdAKgCGQGoQF8AF8LdAKgCWcAZAeiAaEBfQF0AqAMZAh8AaECfABfDXQCoAx8AWQJoQJ8AF8OZABTAJQoTksBSyBLA0tARz/QAAAAAAAARz/gAAAAAAAARz/oAAAAAAAAh5RHP+AAAAAAAABLQEuATQABh5RNACRLCnSUKIwFc3VwZXKUaBuMAm5ulIwGQ29udjJklIwFY29udjGUjAtMYXllckNob2ljZZSMFkRlcHRod2lzZVNlcGFyYWJsZUNvbnaUjAVjb252MpSMB0Ryb3BvdXSUjAtWYWx1ZUNob2ljZZSMCGRyb3BvdXQxlIwIZHJvcG91dDKUjAZMaW5lYXKUjANmYzGUjANmYzKUdJSMBHNlbGaUjAdmZWF0dXJllIaUjCYvaG9tZS9xc3kv5qGM6Z2iL3dvcmtzcGFjZS9uYXMvdHJ5Mi5weZSMCF9faW5pdF9flEsdQxYAAQoBEgIEAQ4BCP4IBxYBDAEOAg4BlIwJX19jbGFzc19flIWUKXSUUpR9lCiMC19fcGFja2FnZV9flE6MCF9fbmFtZV9flGgajAhfX2ZpbGVfX5RoOnVOTmgAjBBfbWFrZV9lbXB0eV9jZWxslJOUKVKUhZR0lFKUaBWMEl9mdW5jdGlvbl9zZXRzdGF0ZZSTlGhKfZR9lChoQ2g7jAxfX3F1YWxuYW1lX1+UjAxOZXQuX19pbml0X1+UjA9fX2Fubm90YXRpb25zX1+UfZSMDl9fa3dkZWZhdWx0c19flE6MDF9fZGVmYXVsdHNfX5ROaBloGowHX19kb2NfX5ROjAtfX2Nsb3N1cmVfX5RoAIwKX21ha2VfY2VsbJSTlGgUhZRSlIWUjBdfY2xvdWRwaWNrbGVfc3VibW9kdWxlc5RdlGgAjAlzdWJpbXBvcnSUk5SMGm5uaS5yZXRpYXJpaS5ubi5weXRvcmNoLm5ulIWUUpRhjAtfX2dsb2JhbHNfX5R9lChoKWhfjBdubmkucmV0aWFyaWkubm4ucHl0b3JjaJSFlFKUaC1oAihoDGgtaA+FlH2UjCA1OTQ2Mzk2ODBmMzA0YTBiOGVkNjE0YjIyMThmMTM4NpROdJRSlGgXaGx9lChoGWgaaBtoIChoIyhLA0sASwBLA0sGSwNDNHQAgwCgAaEAAQB0AmoDfAF8AWQBfAFkAo0EfABfBHQCagN8AXwCZANkBI0DfABfBWQAUwCUKE5LA4wLa2VybmVsX3NpemWUjAZncm91cHOUhpRLAWhvhZR0
lChoKGgbaCloKowJZGVwdGh3aXNllIwJcG9pbnR3aXNllHSUaDeMBWluX2NolIwGb3V0X2NolIeUaDpoO0sSQwYAAQoBFAGUaD2FlCl0lFKUaEFOTmhGKVKUhZR0lFKUaExogX2UfZQoaENoO2hPjB9EZXB0aHdpc2VTZXBhcmFibGVDb252Ll9faW5pdF9flGhRfZRoU05oVE5oGWgaaFVOaFZoWGhshZRSlIWUaFxdlGhiYWhjfZRoKWhnc3WGlIZSMIwHZm9yd2FyZJRoIChoIyhLAksASwBLAksFS0NDEHwAoAB8AKABfAGhAaEBUwCUToWUaHVodIaUaDeMAXiUhpRoOowHZm9yd2FyZJRLF0MCAAGUKSl0lFKUaEFOTk50lFKUaExol32UfZQoaENokmhPjB5EZXB0aHdpc2VTZXBhcmFibGVDb252LmZvcndhcmSUaFF9lGhTTmhUTmgZaBpoVU5oVk5oXF2UaGN9lHWGlIZSMGhVTnV9lIaUhlIwdXWGlIZSMGiMaCAoaCMoSwJLAEsASwJLCUtDQ1R0AKABfACgAnwBoQGhAX0BdACgA3wAoAR8AaEBZAGhAn0BdAWgBnwAoAd8AaEBZAKhAn0BfACgCHwAoAl0AKABfACgCnwBoQGhAaEBoQF9AXwBUwCUTksCSwGHlCiMAUaUjARyZWx1lGgrjAptYXhfcG9vbDJklGgujAV0b3JjaJSMB2ZsYXR0ZW6UaDFoNWgyaDR0lGg3aJCGlGg6aJJLL0MKAAEQARIBEgEcAZQpKXSUUpRoQU5OTnSUUpRoTGivfZR9lChoQ2iSaE+MC05ldC5mb3J3YXJklGhRfZRoU05oVE5oGWgaaFVOaFZOaFxdlGhjfZQoaKRoX4wTdG9yY2gubm4uZnVuY3Rpb25hbJSFlFKUaKdoX2inhZRSlHV1hpSGUjBoVU51fZSGlIZSMIaUfZSMIGY5OWE4YjdhODg2NTRhMDA4ZDljZjZkYjNiNGMzNzAylE50lFKUaBdown2UKGgZaBpoG2ggKGgjKEsBSwBLAEsDSwdLH0MwdACIAWoBfAF8AogDZAFkAo0FXAJ9AX0CdAKDAGoBiAF8AXwCiAJkA40EAQBkAFMAlChOiIwNaXNfY2xhc3NfaW5pdJSFlCiMBnN5bWJvbJSMBGFyZ3OUjAZrd2FyZ3OUjApjYWxsX3N1cGVylHSUdJSMFF9mb3JtdWxhdGVfYXJndW1lbnRzlGgbaCiHlGg3aMhoyYeUjFEvaG9tZS9xc3kvYW5hY29uZGEzL2VudnMvbm5pL2xpYi9weXRob24zLjkvc2l0ZS1wYWNrYWdlcy9ubmkvY29tbW9uL3NlcmlhbGl6ZXIucHmUaDtNaAFDBAACGAOUKGg9jARiYXNllGjKjAdrd19vbmx5lHSUKXSUUpR9lChoQowKbm5pLmNvbW1vbpRoQ4wVbm5pLmNvbW1vbi5zZXJpYWxpemVylGhEjFEvaG9tZS9xc3kvYW5hY29uZGEzL2VudnMvbm5pL2xpYi9weXRob24zLjkvc2l0ZS1wYWNrYWdlcy9ubmkvY29tbW9uL3NlcmlhbGl6ZXIucHmUdU5OKGhGKVKUaEYpUpRoRilSlGhGKVKUdJR0lFKUaExo4X2UfZQoaENoO2hPjCRfdHJhY2VfY2xzLjxsb2NhbHM+LndyYXBwZXIuX19pbml0X1+UaFF9lGhTTmhUTmgZaNloVU5oVihoWGjChZRSlGhYaBSFlFKUaFiIhZRSlGhYiIWUUpR0lGhcXZRoY32UaM1oB2jNk5RzdYaUhlIwaFVOjBNfX2Fic3RyYWN0bWV0aG9kc19flCiRlGhRfZQojAxkdW1wX3BhdGNoZXOUaAqMBGJvb2yUk5SMCF92ZXJzaW9ulGgKjANpbnSUk5SMCHRyYWluaW5nlGj4jBZfaXNfZnVsbF9iYWNrd2FyZF9ob29rlIwJX29wZXJhdG9ylIwHZ2V0aXRl
bZSTlIwGdHlwaW5nlIwFVW5pb26Uk5Ro+GgMToWUUpSGlIaUUpRokmoAAQAAagEBAACMCENhbGxhYmxllJOUaAqMCEVsbGlwc2lzlJOUagEBAACMA0FueZSTlIaUhpRSlIwIX19jYWxsX1+UahEBAAB1jAtfX3dyYXBwZWRfX5RoFIwHX3RyYWNlZJSIjAlfYWJjX2ltcGyUXZR1fZSGlIZSMIWUfZSMIDQ2OTg0ZTg4MjBjYzQxYmM5NDU2ZWQ5MWQ4NDJiYzkzlE50lFKUaBdqHQEAAH2UKGgZjAhfX21haW5fX5RoG2ggKGgjKEsBSwBLAEsDSwhLH0M8dACDAI8iAQB0AYMAagJ8AWkAfAKkAY4BAQBXAGQABAAEAIMDAQBuEDEAcy4wAAEAAQABAFkAAQBkAFMAlE6FlIwOTW9kZWxOYW1lc3BhY2WUaChoG4eUaDdoyGjJh5SMUy9ob21lL3FzeS9hbmFjb25kYTMvZW52cy9ubmkvbGliL3B5dGhvbjMuOS9zaXRlLXBhY2thZ2VzL25uaS9yZXRpYXJpaS9zZXJpYWxpemVyLnB5lGg7S3JDBAABCAGUaD2FlCl0lFKUfZQoaEKMDG5uaS5yZXRpYXJpaZRoQ4wXbm5pLnJldGlhcmlpLnNlcmlhbGl6ZXKUaESMUy9ob21lL3FzeS9hbmFjb25kYTMvZW52cy9ubmkvbGliL3B5dGhvbjMuOS9zaXRlLXBhY2thZ2VzL25uaS9yZXRpYXJpaS9zZXJpYWxpemVyLnB5lHVOTmhGKVKUhZR0lFKUaExqMQEAAH2UfZQoaENoO2hPjC1tb2RlbF93cmFwcGVyLjxsb2NhbHM+LnJlc2V0X3dyYXBwZXIuX19pbml0X1+UaFF9lGhTTmhUTmgZaiwBAABoVU5oVmhYah0BAACFlFKUhZRoXF2UaGN9lGoiAQAAjBJubmkucmV0aWFyaWkudXRpbHOUaiIBAACTlHN1hpSGUjBo8yiRlGhRfZQojAxkdW1wX3BhdGNoZXOUaPiMCF92ZXJzaW9ulGj7jAh0cmFpbmluZ5Ro+IwWX2lzX2Z1bGxfYmFja3dhcmRfaG9va5RqCAEAAIwHZm9yd2FyZJRqEQEAAIwIX19jYWxsX1+UahEBAAB1jBJfbm5pX21vZGVsX3dyYXBwZXKUiGoVAQAAXZR1fZSGlIZSMC4="}, "init_parameters": {}, "mutation": {"model_1": "0", "model_2": 0.75, "model_3": 256}, "evaluator": {"type": {"__nni_type__": "path:nni.retiarii.evaluator.functional.FunctionalEvaluator"}, "function": {"__nni_type__": 
"bytes:gAWVjwkAAAAAAACMF2Nsb3VkcGlja2xlLmNsb3VkcGlja2xllIwNX2J1aWx0aW5fdHlwZZSTlIwKTGFtYmRhVHlwZZSFlFKUKGgCjAhDb2RlVHlwZZSFlFKUKEsBSwBLAEsJSwdLQ0PEfACDAH0BdABqAWoCfAGgA6EAZAFkAo0CfQJ0BKAFdASgBqEAdASgB2QDZAShAmcCoQF9A3QIdAlkBWQGfANkB40DZAhkBmQJjQN9BHQIdAlkBWQGZAp8A2QLjQRkCGQMjQJ9BXQAagqgC6EAcnh0AKAMZA2hAW4IdACgDGQOoQF9BnQNZA+DAUQAXSp9B3QOfAF8BnwEfAJ8B4MFAQB0D3wBfAZ8BYMDfQh0EKARfAihAQEAcYp0EKASfAihAQEAZABTAJQoTkc/UGJN0vGp/IwCbHKUhZRHP8C6xxDLKV+FlEc/07fpD/lyR4WUjApkYXRhL21uaXN0lIiMCGRvd25sb2FklIwJdHJhbnNmb3JtlIaUS0CMCmJhdGNoX3NpemWUjAdzaHVmZmxllIaUiWgPjAV0cmFpbpRoEIeUaBKFlIwEY3VkYZSMA2NwdZRLA3SUKIwFdG9yY2iUjAVvcHRpbZSMBEFkYW2UjApwYXJhbWV0ZXJzlIwKdHJhbnNmb3Jtc5SMB0NvbXBvc2WUjAhUb1RlbnNvcpSMCU5vcm1hbGl6ZZSMCkRhdGFMb2FkZXKUjAVNTklTVJRoGIwMaXNfYXZhaWxhYmxllIwGZGV2aWNllIwFcmFuZ2WUjAt0cmFpbl9lcG9jaJSMCnRlc3RfZXBvY2iUjANubmmUjBpyZXBvcnRfaW50ZXJtZWRpYXRlX3Jlc3VsdJSME3JlcG9ydF9maW5hbF9yZXN1bHSUdJQojAltb2RlbF9jbHOUjAVtb2RlbJSMCW9wdGltaXplcpSMBnRyYW5zZpSMDHRyYWluX2xvYWRlcpSMC3Rlc3RfbG9hZGVylGgmjAVlcG9jaJSMCGFjY3VyYWN5lHSUjCYvaG9tZS9xc3kv5qGM6Z2iL3dvcmtzcGFjZS9uYXMvdHJ5Mi5weZSMDmV2YWx1YXRlX21vZGVslEtcQxYAAgYCFAEaARgBGAIeAgwCEAIMAgwDlCkpdJRSlH2UKIwLX19wYWNrYWdlX1+UTowIX19uYW1lX1+UjAhfX21haW5fX5SMCF9fZmlsZV9flGg3dU5OTnSUUpSMHGNsb3VkcGlja2xlLmNsb3VkcGlja2xlX2Zhc3SUjBJfZnVuY3Rpb25fc2V0c3RhdGWUk5RoQn2UfZQoaD5oOIwMX19xdWFsbmFtZV9flGg4jA9fX2Fubm90YXRpb25zX1+UfZSMDl9fa3dkZWZhdWx0c19flE6MDF9fZGVmYXVsdHNfX5ROjApfX21vZHVsZV9flGg/jAdfX2RvY19flE6MC19fY2xvc3VyZV9flE6MF19jbG91ZHBpY2tsZV9zdWJtb2R1bGVzlF2UKGgAjAlzdWJpbXBvcnSUk5SMCnRvcmNoLmN1ZGGUhZRSlGhTjAt0b3JjaC5vcHRpbZSFlFKUaFOMIXRvcmNodmlzaW9uLnRyYW5zZm9ybXMudHJhbnNmb3Jtc5SFlFKUZYwLX19nbG9iYWxzX1+UfZQoaBtoU2gbhZRSlGgfaFOMFnRvcmNodmlzaW9uLnRyYW5zZm9ybXOUhZRSlGgjjBt0b3JjaC51dGlscy5kYXRhLmRhdGFsb2FkZXKUaCOTlGgkjBp0b3JjaHZpc2lvbi5kYXRhc2V0cy5tbmlzdJRoJJOUaChoBShoCChLBUsASwBLC0sKS0NDrHQAagGgAqEAfQV8AKADoQABAHQEfAKDAUQAXYxcAn0GXAJ9B30IfAegBXwBoQF8CKAFfAGhAQIAfQd9CHwDoAahAAEAfAB8B4MBfQl8BXwJfAiDAn0KfAqgB6EAAQB8A6AIoQABAHwGZAEWAGQCawJyGnQJZAOgCnwEfAZ0C
3wHgwEUAHQLfAJqDIMBZAR8BhQAdAt8AoMBGwB8CqANoQChBYMBAQBxGmQAUwCUKE5LCksAjC5UcmFpbiBFcG9jaDoge30gW3t9L3t9ICh7Oi4wZn0lKV0JTG9zczogezouNmZ9lEdAWQAAAAAAAHSUKGgbjAJubpSMEENyb3NzRW50cm9weUxvc3OUaBWMCWVudW1lcmF0ZZSMAnRvlIwJemVyb19ncmFklIwIYmFja3dhcmSUjARzdGVwlIwFcHJpbnSUjAZmb3JtYXSUjANsZW6UjAdkYXRhc2V0lIwEaXRlbZR0lChoL2gmaDJoMGg0jAdsb3NzX2ZulIwJYmF0Y2hfaWR4lIwEZGF0YZSMBnRhcmdldJSMBm91dHB1dJSMBGxvc3OUdJRoN2goSzdDHAABCgEIARQBFgEIAQgBCgEIAQgBDAEGARQBFP6UKSl0lFKUaDxOTk50lFKUaEVog32UfZQoaD5oKGhIaChoSX2UaEtOaExOaE1oP2hOTmhPTmhQXZRoU4wIdG9yY2gubm6UhZRSlGFoXX2UaBtoYHN1hpSGUjBoKWgFKGgIKEsDSwBLAEsKSwhLQ0PKfACgAKEAAQBkAX0DZAF9BHQBoAKhAI9mAQB8AkQAXVBcAn0FfQZ8BaADfAGhAXwGoAN8AaEBAgB9BX0GfAB8BYMBfQd8B2oEZAJkA2QEjQJ9CHwEfAigBXwGoAZ8CKEBoQGgB6EAoAihADcAfQRxHlcAZAAEAAQAgwMBAG4QMQBzhDAAAQABAAEAWQABAHwDdAl8AmoKgwEdAH0DZAV8BBQAdAl8AmoKgwEbAH0JdAtkBqAMfAR0CXwCagqDAXwJoQODAQEAfAlTAJQoTksASwGIjANkaW2UjAdrZWVwZGltlIaUR0BZAAAAAAAAjCUKVGVzdCBzZXQ6IEFjY3VyYWN5OiB7fS97fSAoezouMGZ9JSkKlHSUKIwEZXZhbJRoG4wHbm9fZ3JhZJRobowGYXJnbWF4lIwCZXGUjAd2aWV3X2FzlIwDc3VtlGh2aHRodWhyaHN0lChoL2gmaDOMCXRlc3RfbG9zc5SMB2NvcnJlY3SUaHpoe2h8jARwcmVklGg1dJRoN2gpS0dDHgABCAEEAQQBCgEMARYBCAEOATwCDgISAgYBDP8GA5QpKXSUUpRoPE5OTnSUUpRoRWiifZR9lChoPmgpaEhoKWhJfZRoS05oTE5oTWg/aE5OaE9OaFBdlGhdfZRoG2hgc3WGlIZSMGgqaFNoKoWUUpR1dYaUhlIwLg=="}, "arguments": {}}, "mutation_summary": {"model_1": "0", "model_2": 0.75, "model_3": 256}}, "parameter_source": "algorithm", "placement_constraint": {"type": "None", "gpus": []}}',
   index: 0
 },
 placementConstraint: { type: 'None', gpus: [] }
}
[2022-03-15 23:04:59] INFO (NNIManager) Trial job o1iuB status changed from WAITING to RUNNING
[2022-03-15 23:05:04] INFO (NNIManager) Trial job nWO9h status changed from RUNNING to FAILED
[2022-03-15 23:05:09] INFO (NNIManager) Trial job o1iuB status changed from RUNNING to FAILED
  • dispatcher.log:
[2022-03-15 23:01:50] INFO (nni.experiment/MainThread) Creating experiment, Experiment ID: mjcsohik
[2022-03-15 23:01:50] INFO (nni.experiment/MainThread) Connecting IPC pipe...
[2022-03-15 23:01:51] INFO (nni.experiment/MainThread) Starting web server...
[2022-03-15 23:01:52] INFO (nni.experiment/MainThread) Setting up...
[2022-03-15 23:01:52] INFO (nni.runtime.msg_dispatcher_base/Thread-3) Dispatcher started
[2022-03-15 23:01:52] INFO (nni.retiarii.experiment.pytorch/MainThread) Web UI URLs: http://127.0.0.1:8095 http://192.168.31.136:8095 http://172.17.0.1:8095
[2022-03-15 23:01:52] INFO (nni.retiarii.experiment.pytorch/MainThread) Start strategy...
[2022-03-15 23:01:52] INFO (root/MainThread) Successfully update searchSpace.
[2022-03-15 23:01:52] INFO (nni.retiarii.strategy.bruteforce/MainThread) Random search running in fixed size mode. Dedup: on.
[2022-03-15 23:04:50] INFO (nni.retiarii.experiment.pytorch/MainThread) Strategy exit
[2022-03-15 23:04:50] INFO (nni.retiarii.experiment.pytorch/MainThread) Waiting for experiment to become DONE (you can ctrl+c if there is no running trial jobs)...
  • nnictl stdout and stderr:
$ nnictl log stdout
ERROR: /home/qsy/nni-experiments/mjcsohik/log/nnictl_stdout.log does not exist!
$ nnictl log stderr
ERROR: /home/qsy/nni-experiments/mjcsohik/log/nnictl_stderr.log does not exist!
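
Since the nnictl logs do not exist for this experiment, the per-trial logs are another place to look for the reason the trials went to FAILED. The following is a minimal sketch, assuming the local training service writes each trial's error output to `<experiment dir>/trials/<trial id>/stderr`. That directory layout is an assumption and is not confirmed anywhere in this issue, and `tail_trial_stderr` is a hypothetical helper, not part of NNI.

```python
from pathlib import Path


def tail_trial_stderr(experiment_dir, max_lines=20):
    """Return {trial_id: last lines of stderr} for every trial that has a stderr file.

    Assumes the layout <experiment_dir>/trials/<trial_id>/stderr (an assumption).
    """
    trials_dir = Path(experiment_dir) / "trials"
    tails = {}
    if not trials_dir.is_dir():
        # Nothing to report if the experiment has no trials directory.
        return tails
    for trial_dir in sorted(trials_dir.iterdir()):
        stderr_file = trial_dir / "stderr"
        if stderr_file.is_file():
            lines = stderr_file.read_text(errors="replace").splitlines()
            tails[trial_dir.name] = lines[-max_lines:]
    return tails


if __name__ == "__main__":
    # Experiment ID taken from the dispatcher log above.
    experiment_dir = Path.home() / "nni-experiments" / "mjcsohik"
    for trial_id, lines in tail_trial_stderr(experiment_dir).items():
        print(f"--- {trial_id} ---")
        print("\n".join(lines))
```

A Python traceback at the end of a trial's stderr is usually the quickest pointer to why that trial failed.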

How to reproduce it?:

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 7 (6 by maintainers)

Top GitHub Comments

1 reaction
uniartisan commented, Mar 17, 2022

Click FAILED in the trials table and view the logs

I have found the problem: the model is never moved to the device in this code. A pull request will fix it: https://github.com/microsoft/nni/pull/4652
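
The problem described in this comment comes down to calling `.to(device)` on the batches but not on the model, which leaves the parameters on the CPU while the inputs are on the GPU. Here is a minimal sketch in plain PyTorch, assuming a hypothetical `TinyNet` in place of the searched model and a hypothetical `run_one_step` helper. It illustrates the pattern only and is not the code from the pull request.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for the model produced by the search."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)


def run_one_step(model: nn.Module, batch: torch.Tensor) -> torch.Tensor:
    """Run one forward pass with model and batch on the same device."""
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model.to(device)          # moves the module's parameters in place
    batch = batch.to(device)  # Tensor.to returns a new tensor, so reassign
    return model(batch)


model = TinyNet()
output = run_one_step(model, torch.randn(8, 4))
print(output.shape, output.device)
```

In a full training function, the usual ordering is to move the model to the device before constructing the optimizer over `model.parameters()`.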

0 reactions
ultmaster commented, Sep 9, 2022

Closing as the problem has been fixed.
