Error on GCP cluster using sqlite (sqlite3.OperationalError: too many SQL variables)
I am running Optuna to optimize large-scale detailed models of cortical circuits on Google Cloud supercomputers. I integrated Optuna into our modeling tool NetPyNE: https://github.com/Neurosim-lab/netpyne/blob/optuna/netpyne/batch/optuna_parallel.py
I was running 50 parallel processes on the controller node of a Google Cloud Platform (GCP) Slurm-based cluster. Each process is an Optuna instance (run via `screen`) whose objective function submits a Slurm job that runs a cortical simulation on a 96-core compute node. Optuna was set up so that the 50 processes exchange information via an SQLite database file:
study = optuna.create_study(study_name=self.batchLabel,
                            storage='sqlite:///%s/%s_storage.db' % (self.saveFolder, self.batchLabel),
                            load_if_exists=True,
                            direction=args['direction'])
study.optimize(lambda trial: objective(trial, args),
               n_trials=args['maxiters'], timeout=args['maxtime'])
At around trial 1000, all processes failed with the error sqlite3.OperationalError: too many SQL variables (see the detailed traceback below).
I know the documentation suggests using PostgreSQL or MySQL for large distributed optimizations, particularly when the database lives on an NFS drive (as it does for the compute nodes in this cluster). Unfortunately, I have no idea how to replace SQLite with PostgreSQL or MySQL. Do you have any examples of how to set that up, or any other suggestions on how to fix this error?
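For reference, the switch mostly comes down to the storage URL passed to optuna.create_study(). Below is a minimal sketch under assumptions of my own: the host address, credentials, and database name are placeholders, the database must already exist and be reachable from all nodes, and the corresponding driver (psycopg2 for PostgreSQL, PyMySQL for MySQL) must be installed.

```python
# Sketch: the only Optuna-side change needed to move off SQLite is the
# storage URL. Host, credentials, and database name are placeholders.

def rdb_storage_url(backend, user, password, host, dbname):
    """Build an RDB storage URL in the form optuna.create_study() accepts."""
    templates = {
        "postgresql": "postgresql://{u}:{p}@{h}/{d}",  # driver: psycopg2
        "mysql": "mysql+pymysql://{u}:{p}@{h}/{d}",    # driver: PyMySQL
    }
    return templates[backend].format(u=user, p=password, h=host, d=dbname)

storage = rdb_storage_url("postgresql", "optuna_user", "secret",
                          "10.128.0.2", "optuna_db")
# study = optuna.create_study(study_name=self.batchLabel, storage=storage,
#                             load_if_exists=True, direction=args['direction'])
print(storage)  # postgresql://optuna_user:secret@10.128.0.2/optuna_db
```

The rest of the NetPyNE integration (create_study with load_if_exists=True, then study.optimize) should work unchanged, since only the storage backend differs.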
Thanks!
Full error:

[I 2020-06-30 00:53:40,502] Finished trial#996 with value: 409.0361137631617 with parameters: {'EEGain': 1.2190444451070857, 'EIGain': 1.810497756045793, "('IELayerGain', '1-3')": 1.087901880175065, "('IELayerGain', '4')": 1.7309720923711636, "('IELayerGain', '5')": 0.5960648412842655, "('IELayerGain', '6')": 0.7978043150339824, "('IILayerGain', '1-3')": 1.1688903007202631, "('IILayerGain', '4')": 0.672447638702098, "('IILayerGain', '5')": 0.8073360248421282, "('IILayerGain', '6')": 1.8855740649714237, 'thalamoCorticalGain': 1.08285373691784, 'intraThalamicGain': 1.7758770967297166, 'corticoThalamicGain': 1.6566204337609691}. Best is trial#375 with value: 294.2097451307643.
[W 2020-06-30 00:53:40,703] Setting status of trial#1046 as TrialState.FAIL because of the following error: OperationalError('(sqlite3.OperationalError) too many SQL variables',)

Traceback (most recent call last):
  File "/usr/local/lib64/python3.6/site-packages/sqlalchemy/engine/base.py", line 1278, in _execute_context
    cursor, statement, parameters, context
  File "/usr/local/lib64/python3.6/site-packages/sqlalchemy/engine/default.py", line 593, in do_execute
    cursor.execute(statement, parameters)
sqlite3.OperationalError: too many SQL variables

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/ext_salvadordura_gmail_com/.local/lib/python3.6/site-packages/optuna/study.py", line 734, in _run_trial
    result = func(trial)
  File "/home/ext_salvadordura_gmail_com/netpyne/netpyne/batch/optuna_parallel.py", line 387, in <lambda>
    study.optimize(lambda trial: objective(trial, args), n_trials=args['maxiters'], timeout=args['maxtime'])
  File "/home/ext_salvadordura_gmail_com/netpyne/netpyne/batch/optuna_parallel.py", line 130, in objective
    candidate.append(trial.suggest_uniform(str(paramLabel), minVal, maxVal))
  File "/home/ext_salvadordura_gmail_com/.local/lib/python3.6/site-packages/optuna/trial/_trial.py", line 221, in suggest_uniform
    return self._suggest(name, distribution)
  File "/home/ext_salvadordura_gmail_com/.local/lib/python3.6/site-packages/optuna/trial/_trial.py", line 650, in _suggest
    param_value = self.study.sampler.sample_independent(study, trial, name, distribution)
  File "/home/ext_salvadordura_gmail_com/.local/lib/python3.6/site-packages/optuna/samplers/tpe/sampler.py", line 174, in sample_independent
    values, scores = _get_observation_pairs(study, param_name, trial)
  File "/home/ext_salvadordura_gmail_com/.local/lib/python3.6/site-packages/optuna/samplers/tpe/sampler.py", line 618, in _get_observation_pairs
    for trial in study.get_trials(deepcopy=False):
  File "/home/ext_salvadordura_gmail_com/.local/lib/python3.6/site-packages/optuna/study.py", line 145, in get_trials
    return self._storage.get_all_trials(self._study_id, deepcopy=deepcopy)
  File "/home/ext_salvadordura_gmail_com/.local/lib/python3.6/site-packages/optuna/storages/cached_storage.py", line 363, in get_all_trials
    study_id, excluded_trial_ids=study.owned_or_finished_trial_ids
  File "/home/ext_salvadordura_gmail_com/.local/lib/python3.6/site-packages/optuna/storages/rdb/storage.py", line 951, in _get_trials
    models.TrialModel.study_id == study_id,
  File "/usr/local/lib64/python3.6/site-packages/sqlalchemy/orm/query.py", line 3341, in all
    return list(self)
  File "/usr/local/lib64/python3.6/site-packages/sqlalchemy/orm/query.py", line 3503, in __iter__
    return self._execute_and_instances(context)
  File "/usr/local/lib64/python3.6/site-packages/sqlalchemy/orm/query.py", line 3528, in _execute_and_instances
    result = conn.execute(querycontext.statement, self._params)
  File "/usr/local/lib64/python3.6/site-packages/sqlalchemy/engine/base.py", line 1014, in execute
    return meth(self, multiparams, params)
  File "/usr/local/lib64/python3.6/site-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/usr/local/lib64/python3.6/site-packages/sqlalchemy/engine/base.py", line 1133, in _execute_clauseelement
    distilled_params,
  File "/usr/local/lib64/python3.6/site-packages/sqlalchemy/engine/base.py", line 1318, in _execute_context
    e, statement, parameters, cursor, context
  File "/usr/local/lib64/python3.6/site-packages/sqlalchemy/engine/base.py", line 1512, in _handle_dbapi_exception
    sqlalchemy_exception, with_traceback=exc_info[2], from_=e
  File "/usr/local/lib64/python3.6/site-packages/sqlalchemy/util/compat.py", line 178, in raise_
    raise exception
  File "/usr/local/lib64/python3.6/site-packages/sqlalchemy/engine/base.py", line 1278, in _execute_context
    cursor, statement, parameters, context
  File "/usr/local/lib64/python3.6/site-packages/sqlalchemy/engine/default.py", line 593, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) too many SQL variables
[SQL: SELECT trials.trial_id AS trials_trial_id, trials.number AS trials_number, trials.study_id AS trials_study_id, trials.state AS trials_state, trials.value AS trials_value, trials.datetime_start AS trials_datetime_start, trials.datetime_complete AS trials_datetime_complete FROM trials WHERE trials.trial_id NOT IN (?, ?, ?, ..., ?) AND trials.study_id = ?]
[parameters: (1, 2, 3, ..., 755)]
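The query above uses one `?` placeholder per finished trial, so the failure point depends on the placeholder limit of the sqlite3 library that Python links against. As a quick diagnostic (my own sketch, not part of Optuna), you can probe that limit directly by doubling the number of bound variables in a harmless `IN` query until the driver refuses:

```python
import sqlite3

def can_bind(conn, n):
    """Return True if this sqlite3 build accepts a query with n '?' variables."""
    try:
        conn.execute("SELECT 1 WHERE 1 IN (%s)" % ",".join(["?"] * n), [0] * n)
        return True
    except sqlite3.OperationalError:  # "too many SQL variables"
        return False

def probe_variable_limit(cap=2**18):
    """Roughly locate SQLITE_MAX_VARIABLE_NUMBER by doubling the query size."""
    conn = sqlite3.connect(":memory:")
    n = 1
    while n < cap and can_bind(conn, n * 2):
        n *= 2
    conn.close()
    return n  # largest power of two that was accepted

print(probe_variable_limit())
```

On a build compiled with the old default limit of 999 this stops at 512; builds with a higher compile-time limit (32766 is the modern default) report correspondingly larger values, which is why the error only appears after enough trials accumulate.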
Issue Analytics
- Created: 3 years ago
- Comments: 16 (7 by maintainers)
Top GitHub Comments
Thanks for the detailed report; great, maybe we can proceed with the hot fix.

I also reproduced the issue and verified the fix by building sqlite3 from source with a lower SQLITE_MAX_VARIABLE_NUMBER (see https://gist.github.com/hvy/3bf60580d810597fcf15bda3f5e6447a). You are also right that the issue is Linux-distribution dependent, cf. https://bugzilla.redhat.com/show_bug.cgi?id=1798134. You can in fact reproduce it on, e.g., a Mac by running 500,000 trials (or by creating a query with that many variables in some other way). So the cause was rather simple: it really just boiled down to how sqlite3 was compiled. I'll see if I can come up with decent unit tests for the fix.

Note: sqlite3 is part of the Python standard library, but when imported it dynamically links to a sqlite3 shared object, which can be swapped out for any other version.

The hot fix seems to be working; it's already on trial 4490... thanks!
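Rebuilding sqlite3 is not the only way to reproduce the compile-time dependence described above. Since Python 3.11, sqlite3.Connection.setlimit() can lower the per-connection variable limit at run time, which triggers the same error on any platform. The sketch below is my own illustration (the `trials` table and ids are made up), not the maintainer's actual test:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trials (trial_id INTEGER)")

# Emulate a build compiled with the old default SQLITE_MAX_VARIABLE_NUMBER=999.
# Connection.setlimit() requires Python 3.11+; skip the emulation otherwise.
if hasattr(conn, "setlimit"):
    conn.setlimit(sqlite3.SQLITE_LIMIT_VARIABLE_NUMBER, 999)

ids = list(range(1, 2001))  # more excluded trial ids than the lowered limit allows
try:
    conn.execute(
        "SELECT trial_id FROM trials WHERE trial_id NOT IN (%s)"
        % ",".join(["?"] * len(ids)),
        ids,
    )
    print("query accepted (limit not lowered, or compiled with a high limit)")
except sqlite3.OperationalError as err:
    print(err)  # "too many SQL variables" once the limit is exceeded
```

This mirrors the failing `NOT IN (?, ?, ...)` query from the traceback, so it can double as a portable regression test for the fix.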