AssertionError using multivariate TPE sampler
Expected behavior
Script was running normally. Completed trial 18, but then got an AssertionError. This never happened before while running the same script. I previously ran the same script for 36 trials until it stopped due to a CUDA out of memory error, but that’s not relevant here.
Environment
- Optuna version: 2.5.0
- Python version: 3.8.5
- OS: Red Hat Enterprise Linux
- (Optional) Other libraries and their versions:
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 1_gnu conda-forge
alembic 1.5.4 <pip>
attrs 20.3.0 <pip>
blas 1.0 mkl
ca-certificates 2020.12.5 ha878542_0 conda-forge
certifi 2020.12.5 py38h578d9bd_1 conda-forge
cliff 3.6.0 <pip>
cmaes 0.8.0 <pip>
cmd2 1.5.0 <pip>
colorama 0.4.4 <pip>
colorlog 4.7.2 <pip>
cudatoolkit 10.2.89 hfd86e86_1
freetype 2.10.4 h5ab3b9f_0
intel-openmp 2020.2 254
joblib 1.0.0 pyhd8ed1ab_0 conda-forge
jpeg 9b h024ee3a_2
lcms2 2.11 h396b838_0
ld_impl_linux-64 2.33.1 h53a641e_7
libblas 3.9.0 1_h86c2bf4_netlib conda-forge
libcblas 3.9.0 3_h92ddd45_netlib conda-forge
libedit 3.1.20191231 h14c3975_1
libffi 3.3 he6710b0_2
libgcc-ng 9.3.0 h5dbcf3e_17 conda-forge
libgfortran-ng 9.3.0 he4bcb1c_17 conda-forge
libgfortran5 9.3.0 he4bcb1c_17 conda-forge
libgomp 9.3.0 h5dbcf3e_17 conda-forge
liblapack 3.9.0 3_h92ddd45_netlib conda-forge
libpng 1.6.37 hbc83047_0
libstdcxx-ng 9.3.0 h2ae2ef3_17 conda-forge
libtiff 4.1.0 h2733197_1
libuv 1.40.0 h7b6447c_0
lz4-c 1.9.2 heb0550a_3
Mako 1.1.4 <pip>
MarkupSafe 1.1.1 <pip>
mkl 2020.2 256
mkl-service 2.3.0 py38he904b0f_0
mkl_fft 1.2.0 py38h23d657b_0
mkl_random 1.1.1 py38h0573a6f_0
ncurses 6.2 he6710b0_1
ninja 1.10.2 py38hff7bd54_0
numpy 1.19.2 py38h54aff64_0
numpy-base 1.19.2 py38hfa32c7d_0
olefile 0.46 py_0
openssl 1.1.1i h7f98852_0 conda-forge
optuna 2.5.0 <pip>
packaging 20.9 <pip>
pandas 1.2.0 py38ha9443f7_0
pbr 5.5.1 <pip>
pillow 8.1.0 py38he98fc37_0
pip 20.3.3 py38h06a4308_0
prettytable 0.7.2 <pip>
pyparsing 2.4.7 <pip>
pyperclip 1.8.1 <pip>
python 3.8.5 h7579374_1
python-dateutil 2.8.1 py_0
python-editor 1.0.4 <pip>
python_abi 3.8 1_cp38 conda-forge
pytorch 1.7.1 py3.8_cuda10.2.89_cudnn7.6.5_0 pytorch
pytz 2020.5 pyhd3eb1b0_0
PyYAML 5.4.1 <pip>
readline 8.0 h7b6447c_0
scikit-learn 0.24.0 py38h658cfdd_0 conda-forge
scipy 1.6.0 py38hb2138dd_0 conda-forge
setuptools 51.0.0 py38h06a4308_2
six 1.15.0 py38h06a4308_0
SQLAlchemy 1.3.23 <pip>
sqlite 3.33.0 h62c20be_0
stevedore 3.3.0 <pip>
threadpoolctl 2.1.0 pyh5ca1d4c_0 conda-forge
tk 8.6.10 hbc83047_0
torchaudio 0.7.2 py38 pytorch
torchvision 0.8.2 py38_cu102 pytorch
tqdm 4.56.0 <pip>
typing_extensions 3.7.4.3 py_0
wcwidth 0.2.5 <pip>
wheel 0.36.2 pyhd3eb1b0_0
xz 5.2.5 h7b6447c_0
zlib 1.2.11 h7b6447c_3
zstd 1.4.5 h9ceee32_0
Error messages, stack traces, or logs
Here is everything starting from trial 15:
[I 2021-02-05 07:29:35,802] Trial 15 pruned.
Average Validation Loss: 2594.3800
Epoch 1/5
Average Training loss: 11902.6517
Average Validation Loss: 10585.8529
Epoch 2/5
Average Training loss: 8536.1782
[I 2021-02-05 07:44:59,435] Trial 16 pruned.
Average Validation Loss: 6230.1825
Epoch 1/5
Average Training loss: 605.1005
Average Validation Loss: 532.8016
Epoch 2/5
Average Training loss: 526.2966
Average Validation Loss: 524.1667
Epoch 3/5
Average Training loss: 524.6881
Average Validation Loss: 526.1336
Epoch 4/5
Average Training loss: 524.3555
[I 2021-02-05 08:16:04,120] Trial 17 pruned.
Average Validation Loss: 523.4189
Epoch 1/5
Average Training loss: 1805.4577
Average Validation Loss: 1617.5258
Epoch 2/5
Average Training loss: 1586.4583
Average Validation Loss: 1567.8128
Epoch 3/5
Average Training loss: 1567.9229
Average Validation Loss: 1565.4937
Epoch 4/5
Average Training loss: 1566.8196
Average Validation Loss: 1564.4235
Epoch 5/5
Average Training loss: 1565.9849
[I 2021-02-05 08:54:46,640] Trial 18 finished with value: 1565.4634710145326 and parameters: {'num_channels': 64, 'optimizer': 'RMSprop', 'lr': 2.618149026594371e-05, 'momentum': 0.8016042722596308, 'rmsprop_alpha': 0.902569807347318, 'batch_size': 8, 'loss_func_alpha': 3.0}. Best is trial 9 with value: 519.5251754050875.
Average Validation Loss: 1565.4635
Traceback (most recent call last):
File "recon_hp_tuning.py", line 258, in <module>
study.optimize(objective,
File "/afs/unity.ncsu.edu/users/m/maabdelk/.conda/envs/local/lib/python3.8/site-packages/optuna/study.py", line 376, in optimize
File "/afs/unity.ncsu.edu/users/m/maabdelk/.conda/envs/local/lib/python3.8/site-packages/optuna/_optimize.py", line 63, in _optimize
File "/afs/unity.ncsu.edu/users/m/maabdelk/.conda/envs/local/lib/python3.8/site-packages/optuna/_optimize.py", line 164, in _optimize_sequential
File "/afs/unity.ncsu.edu/users/m/maabdelk/.conda/envs/local/lib/python3.8/site-packages/optuna/_optimize.py", line 191, in _run_trial
File "/afs/unity.ncsu.edu/users/m/maabdelk/.conda/envs/local/lib/python3.8/site-packages/optuna/study.py", line 421, in ask
File "/afs/unity.ncsu.edu/users/m/maabdelk/.conda/envs/local/lib/python3.8/site-packages/optuna/trial/_trial.py", line 57, in __init__
File "/afs/unity.ncsu.edu/users/m/maabdelk/.conda/envs/local/lib/python3.8/site-packages/optuna/trial/_trial.py", line 66, in _init_relative_params
File "/afs/unity.ncsu.edu/users/m/maabdelk/.conda/envs/local/lib/python3.8/site-packages/optuna/samplers/_tpe/sampler.py", line 238, in sample_relative
File "/afs/unity.ncsu.edu/users/m/maabdelk/.conda/envs/local/lib/python3.8/site-packages/optuna/samplers/_tpe/sampler.py", line 830, in _get_multivariate_observation_pairs
AssertionError
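The AssertionError is raised at line 830 of optuna/samplers/_tpe/sampler.py, inside _get_multivariate_observation_pairs (the last frame of the traceback above).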
Steps to reproduce
I don’t think I can reproduce this since it was random.
Issue Analytics
- State:
- Created 3 years ago
- Comments:8 (4 by maintainers)

I think this issue can be closed as the corresponding PR is merged.
In brief, you have three options.
- Disable multivariate (the easiest workaround)