StorageInternalError when using pruning
See original GitHub issue
I am having trouble running a study: after a couple of iterations (10 to 20) it stops and raises a StorageInternalError. I have already run another similar study on the same file with no issues. I saw that there are a couple of closed issues regarding this same error, but they don’t seem to be the same problem and I couldn’t figure out how to fix it. If anyone could give some pointers on understanding the raised error and how to fix it, I would greatly appreciate it.
Edit: It seems the source of the error is an OperationalError, described in the SQLAlchemy docs as:
Exception raised for errors that are related to the database’s operation and not necessarily under the control of the programmer, e.g. an unexpected disconnect occurs, the data source name is not found, a transaction could not be processed, a memory allocation error occurred during processing, etc.
but I am working on a local file and the disk still has plenty of space. What could be causing this?
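One way to narrow this down is to bypass SQLAlchemy entirely and try opening the file with the standard-library sqlite3 module. A minimal diagnostic sketch (assuming the db.sqlite3 file from the storage URL in the code below):

import sqlite3

# Open the study database read-only, bypassing SQLAlchemy/Optuna entirely.
# mode=ro avoids silently creating a new file if the path is wrong.
# If this also raises sqlite3.OperationalError, the problem is at the file
# level (permissions, or another process holding the file), not in Optuna.
conn = sqlite3.connect("file:db.sqlite3?mode=ro", uri=True)
print(conn.execute("PRAGMA integrity_check;").fetchone())
conn.close()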
Environment
- Optuna version: 2.8.0
- Python version: 3.9.5
- OS: Windows-10-10.0.19043-SP0
Error messages, stack traces, or logs
Log
---------------------------------------------------------------------------
OperationalError Traceback (most recent call last)
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\engine\base.py in _wrap_pool_connect(self, fn, connection)
3211 try:
-> 3212 return fn()
3213 except dialect.dbapi.Error as e:
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\pool\base.py in connect(self)
306 """
--> 307 return _ConnectionFairy._checkout(self)
308
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\pool\base.py in _checkout(cls, pool, threadconns, fairy)
766 if not fairy:
--> 767 fairy = _ConnectionRecord.checkout(pool)
768
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\pool\base.py in checkout(cls, pool)
424 def checkout(cls, pool):
--> 425 rec = pool._do_get()
426 try:
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\pool\impl.py in _do_get(self)
255 def _do_get(self):
--> 256 return self._create_connection()
257
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\pool\base.py in _create_connection(self)
252
--> 253 return _ConnectionRecord(self)
254
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\pool\base.py in __init__(self, pool, connect)
367 if connect:
--> 368 self.__connect()
369 self.finalize_callback = deque()
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\pool\base.py in __connect(self)
610 with util.safe_reraise():
--> 611 pool.logger.debug("Error on connect(): %s", e)
612 else:
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\util\langhelpers.py in __exit__(self, type_, value, traceback)
69 if not self.warn_only:
---> 70 compat.raise_(
71 exc_value,
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\util\compat.py in raise_(***failed resolving arguments***)
206 try:
--> 207 raise exception
208 finally:
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\pool\base.py in __connect(self)
604 self.starttime = time.time()
--> 605 connection = pool._invoke_creator(self)
606 pool.logger.debug("Created new connection %r", connection)
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\engine\create.py in connect(connection_record)
577 return connection
--> 578 return dialect.connect(*cargs, **cparams)
579
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\engine\default.py in connect(self, *cargs, **cparams)
583 # inherits the docstring from interfaces.Dialect.connect
--> 584 return self.dbapi.connect(*cargs, **cparams)
585
OperationalError: unable to open database file
The above exception was the direct cause of the following exception:
OperationalError Traceback (most recent call last)
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\optuna\storages\_rdb\storage.py in _create_scoped_session(scoped_session, ignore_integrity_error)
52 try:
---> 53 yield session
54 session.commit()
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\optuna\storages\_rdb\storage.py in _update_trial(self, trial_id, state, values, intermediate_values, params, distributions_, user_attrs, system_attrs, datetime_start, datetime_complete)
667 with _create_scoped_session(self.scoped_session) as session:
--> 668 trial_model = models.TrialModel.find_or_raise_by_id(trial_id, session)
669 if trial_model.state.is_finished():
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\optuna\storages\_rdb\models.py in find_or_raise_by_id(cls, trial_id, session, for_update)
240
--> 241 trial = query.one_or_none()
242 if trial is None:
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\orm\query.py in one_or_none(self)
2788 """
-> 2789 return self._iter().one_or_none()
2790
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\orm\query.py in _iter(self)
2846 statement = self._statement_20()
-> 2847 result = self.session.execute(
2848 statement,
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\orm\session.py in execute(self, statement, params, execution_options, bind_arguments, _parent_execute_state, _add_event, **kw)
1687 else:
-> 1688 conn = self._connection_for_bind(bind)
1689 result = conn._execute_20(statement, params or {}, execution_options)
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\orm\session.py in _connection_for_bind(self, engine, execution_options, **kw)
1528 if self._transaction is not None or self._autobegin():
-> 1529 return self._transaction._connection_for_bind(
1530 engine, execution_options
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\orm\session.py in _connection_for_bind(self, bind, execution_options)
746 else:
--> 747 conn = bind.connect()
748 local_connect = True
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\engine\base.py in connect(self, close_with_result)
3165
-> 3166 return self._connection_cls(self, close_with_result=close_with_result)
3167
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\engine\base.py in __init__(self, engine, connection, close_with_result, _branch_from, _execution_options, _dispatch, _has_events, _allow_revalidate)
95 if connection is not None
---> 96 else engine.raw_connection()
97 )
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\engine\base.py in raw_connection(self, _connection)
3244 """
-> 3245 return self._wrap_pool_connect(self.pool.connect, _connection)
3246
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\engine\base.py in _wrap_pool_connect(self, fn, connection)
3214 if connection is None:
-> 3215 Connection._handle_dbapi_exception_noconnection(
3216 e, dialect, self
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\engine\base.py in _handle_dbapi_exception_noconnection(cls, e, dialect, engine)
2068 elif should_wrap:
-> 2069 util.raise_(
2070 sqlalchemy_exception, with_traceback=exc_info[2], from_=e
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\util\compat.py in raise_(***failed resolving arguments***)
206 try:
--> 207 raise exception
208 finally:
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\engine\base.py in _wrap_pool_connect(self, fn, connection)
3211 try:
-> 3212 return fn()
3213 except dialect.dbapi.Error as e:
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\pool\base.py in connect(self)
306 """
--> 307 return _ConnectionFairy._checkout(self)
308
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\pool\base.py in _checkout(cls, pool, threadconns, fairy)
766 if not fairy:
--> 767 fairy = _ConnectionRecord.checkout(pool)
768
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\pool\base.py in checkout(cls, pool)
424 def checkout(cls, pool):
--> 425 rec = pool._do_get()
426 try:
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\pool\impl.py in _do_get(self)
255 def _do_get(self):
--> 256 return self._create_connection()
257
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\pool\base.py in _create_connection(self)
252
--> 253 return _ConnectionRecord(self)
254
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\pool\base.py in __init__(self, pool, connect)
367 if connect:
--> 368 self.__connect()
369 self.finalize_callback = deque()
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\pool\base.py in __connect(self)
610 with util.safe_reraise():
--> 611 pool.logger.debug("Error on connect(): %s", e)
612 else:
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\util\langhelpers.py in __exit__(self, type_, value, traceback)
69 if not self.warn_only:
---> 70 compat.raise_(
71 exc_value,
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\util\compat.py in raise_(***failed resolving arguments***)
206 try:
--> 207 raise exception
208 finally:
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\pool\base.py in __connect(self)
604 self.starttime = time.time()
--> 605 connection = pool._invoke_creator(self)
606 pool.logger.debug("Created new connection %r", connection)
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\engine\create.py in connect(connection_record)
577 return connection
--> 578 return dialect.connect(*cargs, **cparams)
579
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\engine\default.py in connect(self, *cargs, **cparams)
583 # inherits the docstring from interfaces.Dialect.connect
--> 584 return self.dbapi.connect(*cargs, **cparams)
585
OperationalError: (sqlite3.OperationalError) unable to open database file
(Background on this error at: https://sqlalche.me/e/14/e3q8)
The above exception was the direct cause of the following exception:
StorageInternalError Traceback (most recent call last)
<ipython-input-5-b47477278440> in <module>
2 study = optuna.create_study(study_name='Conv_serial', storage='sqlite:///db.sqlite3', direction='minimize',
3 pruner=pruner, load_if_exists=True)
----> 4 study.optimize(objective, n_trials=100)
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\optuna\study.py in optimize(self, func, n_trials, timeout, n_jobs, catch, callbacks, gc_after_trial, show_progress_bar)
399 )
400
--> 401 _optimize(
402 study=self,
403 func=func,
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\optuna\_optimize.py in _optimize(study, func, n_trials, timeout, n_jobs, catch, callbacks, gc_after_trial, show_progress_bar)
63 try:
64 if n_jobs == 1:
---> 65 _optimize_sequential(
66 study,
67 func,
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\optuna\_optimize.py in _optimize_sequential(study, func, n_trials, timeout, catch, callbacks, gc_after_trial, reseed_sampler_rng, time_start, progress_bar)
160
161 try:
--> 162 trial = _run_trial(study, func, catch)
163 except Exception:
164 raise
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\optuna\_optimize.py in _run_trial(study, func, catch)
241 # `Study.tell` may raise during trial post-processing.
242 try:
--> 243 study.tell(trial, values=values, state=state)
244 except Exception:
245 raise
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\optuna\study.py in tell(self, trial, values, state)
663 self._storage.set_trial_values(trial_id, values)
664
--> 665 self._storage.set_trial_state(trial_id, state)
666
667 def set_user_attr(self, key: str, value: Any) -> None:
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\optuna\storages\_cached_storage.py in set_trial_state(self, trial_id, state)
212 updates.datetime_complete = datetime.datetime.now()
213 cached_trial.datetime_complete = datetime.datetime.now()
--> 214 return self._flush_trial(trial_id)
215
216 ret = self._backend.set_trial_state(trial_id, state)
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\optuna\storages\_cached_storage.py in _flush_trial(self, trial_id)
425 return True
426 del study.updates[number]
--> 427 return self._backend._update_trial(
428 trial_id=trial_id,
429 values=updates.values,
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\optuna\storages\_rdb\storage.py in _update_trial(self, trial_id, state, values, intermediate_values, params, distributions_, user_attrs, system_attrs, datetime_start, datetime_complete)
765 )
766
--> 767 session.add(trial_model)
768
769 return True
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\contextlib.py in __exit__(self, type, value, traceback)
133 value = type()
134 try:
--> 135 self.gen.throw(type, value, traceback)
136 except StopIteration as exc:
137 # Suppress StopIteration *unless* it's the same exception that
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\optuna\storages\_rdb\storage.py in _create_scoped_session(scoped_session, ignore_integrity_error)
69 "e.g. exceeding max length. "
70 )
---> 71 raise optuna.exceptions.StorageInternalError(message) from e
72 except Exception:
73 session.rollback()
StorageInternalError: An exception is raised during the commit. This typically happens due to invalid data in the commit, e.g. exceeding max length.
Code
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Dropout, Reshape, Flatten, Conv1D, InputLayer
from tensorflow.keras import regularizers
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.optimizers import Adam
import optuna
from optuna.integration import TFKerasPruningCallback

earlystop = EarlyStopping(patience=1000, mode='min', restore_best_weights=True)

def create_model(trial):
    model = Sequential()
    model.add(InputLayer(input_shape=vars.N))
    model.add(Reshape(target_shape=(-1, 1)))  # add channel dimension
    for i in range(trial.suggest_int('num_layers', 2, 20, log=True)):
        model.add(Conv1D(filters=trial.suggest_int('units_conv_' + str(i),
                                                   low=2,
                                                   high=64,
                                                   step=2),
                         kernel_size=trial.suggest_int('conv_size_' + str(i),
                                                       low=32,
                                                       high=256,
                                                       step=32),
                         padding="same",
                         activation='relu'))
    model.add(Flatten())
    model.add(Dense(trial.suggest_int('dense_1', low=32, high=512, step=32), activation='relu'))
    model.add(Dense(trial.suggest_int('dense_2', low=32, high=512, step=32), activation='relu'))
    model.add(Dense(vars.Q))
    model.compile(
        optimizer=Adam(trial.suggest_categorical('learning_rate', [1e-2, 1e-3, 1e-4])),
        loss='MSE',
        metrics=['mean_absolute_error'])
    return model

def objective(trial):
    model = create_model(trial)
    model.fit(X_train, y_train, validation_data=(X_test, y_test), batch_size=1000,
              epochs=5000, callbacks=[earlystop, TFKerasPruningCallback(trial, 'val_loss')], verbose=0)
    score = model.evaluate(X_test, y_test, verbose=0)
    return score[1]

pruner = optuna.pruners.MedianPruner(n_startup_trials=10, n_warmup_steps=1000, interval_steps=100)
study = optuna.create_study(study_name='Conv_serial', storage='sqlite:///db.sqlite3', direction='minimize',
                            pruner=pruner, load_if_exists=True)
study.optimize(objective, n_trials=100)
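If the database file is only intermittently unavailable (for example because a file-sync client or optuna-dashboard briefly holds it), one mitigation worth trying is to construct the storage explicitly and raise SQLite's lock timeout. This is a sketch of a possible workaround, not a confirmed fix: engine_kwargs is forwarded to SQLAlchemy's create_engine, timeout is the standard sqlite3.connect argument (seconds to wait on a locked database), and the value 60 is arbitrary.

import optuna

# Build the storage object explicitly instead of passing a bare URL,
# so connection arguments can be customized.
storage = optuna.storages.RDBStorage(
    url='sqlite:///db.sqlite3',
    engine_kwargs={'connect_args': {'timeout': 60}},  # wait up to 60 s on a lock
)
study = optuna.create_study(study_name='Conv_serial', storage=storage, direction='minimize',
                            pruner=pruner, load_if_exists=True)

Note that a longer timeout only helps with transient locks; it cannot fix a file that another process keeps open exclusively.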
Data
- data: link
- load data:

import pickle

X_train, y_train, X_test, y_test = pickle.load(open("save.p", "rb"))
@nzw0301 No, it isn’t possible to randomly generate the data. But it seems the issue had to do with the file being synced while a study was running, which led to the file being locked. I deactivated syncing on *.sqlite3 files and until now it is working fine. I will let you know if this is definitely the issue once the study finishes. Thanks for your help!

Some additional info. I was running optuna-dashboard when this error occurred, then I erased the study through the dashboard. Afterward I started the study again and ran into the same problem, but now I can’t access the database even with optuna-dashboard, also getting an OperationalError. I don’t want to erase the database, since it holds a previous study that I would like to keep. I tried sqlite3’s “.dump”, but it also fails to open the database. I will try with a brand-new database file to see if the problem persists.
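If “.dump” cannot open the file either, Python's sqlite3 online backup API is another salvage option worth trying: it copies the database page by page into a fresh file. A hedged sketch (db_recovered.sqlite3 is a placeholder name; this only works if the process can open the source file at all):

import sqlite3

# Copy the old study database into a fresh file using SQLite's
# online backup API (available in Python 3.7+).
src = sqlite3.connect("db.sqlite3")
dst = sqlite3.connect("db_recovered.sqlite3")
src.backup(dst)  # page-by-page copy; raises if the source cannot be read
dst.close()
src.close()

If the backup succeeds, pointing the storage URL at the recovered file (sqlite:///db_recovered.sqlite3) should preserve the earlier study.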