
StorageInternalError when using pruning

See original GitHub issue

I am having trouble running a study: after a couple of iterations (10 to 20) it stops and raises a StorageInternalError. I have already run another, similar study on the same file with no issues. I saw that there are a couple of closed issues regarding this same error, but they don't seem to be the same problem and I couldn't figure out how to fix it. If anyone could give me some pointers on understanding the raised error and how to fix it, I would greatly appreciate it.

Edit: It seems the source of the error is an OperationalError, described by the SQLAlchemy docs as:

Exception raised for errors that are related to the database’s operation and not necessarily under the control of the programmer, e.g. an unexpected disconnect occurs, the data source name is not found, a transaction could not be processed, a memory allocation error occurred during processing, etc.

But I am working with a local file, and the disk still has plenty of space. What could be causing this?
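
For context, the "unable to open database file" message at the bottom of the traceback below comes from the sqlite3 driver itself, not from Optuna. A minimal sketch (not the original code) of one situation in which SQLite raises exactly this error:

import sqlite3

# sqlite3 raises OperationalError("unable to open database file") whenever
# it cannot open the file at all: a missing parent directory, missing
# permissions, or another process holding the file exclusively (as a
# file-sync client might on Windows).
try:
    sqlite3.connect("no_such_dir/db.sqlite3")
except sqlite3.OperationalError as e:
    print(e)  # unable to open database file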

Environment

  • Optuna version: 2.8.0
  • Python version: 3.9.5
  • OS: Windows-10-10.0.19043-SP0

Error messages, stack traces, or logs

Log
---------------------------------------------------------------------------
OperationalError                          Traceback (most recent call last)
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\engine\base.py in _wrap_pool_connect(self, fn, connection)
 3211         try:
-> 3212             return fn()
 3213         except dialect.dbapi.Error as e:

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\pool\base.py in connect(self)
  306         """
--> 307         return _ConnectionFairy._checkout(self)
  308 

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\pool\base.py in _checkout(cls, pool, threadconns, fairy)
  766         if not fairy:
--> 767             fairy = _ConnectionRecord.checkout(pool)
  768 

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\pool\base.py in checkout(cls, pool)
  424     def checkout(cls, pool):
--> 425         rec = pool._do_get()
  426         try:

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\pool\impl.py in _do_get(self)
  255     def _do_get(self):
--> 256         return self._create_connection()
  257 

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\pool\base.py in _create_connection(self)
  252 
--> 253         return _ConnectionRecord(self)
  254 

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\pool\base.py in __init__(self, pool, connect)
  367         if connect:
--> 368             self.__connect()
  369         self.finalize_callback = deque()

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\pool\base.py in __connect(self)
  610             with util.safe_reraise():
--> 611                 pool.logger.debug("Error on connect(): %s", e)
  612         else:

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\util\langhelpers.py in __exit__(self, type_, value, traceback)
   69             if not self.warn_only:
---> 70                 compat.raise_(
   71                     exc_value,

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\util\compat.py in raise_(***failed resolving arguments***)
  206         try:
--> 207             raise exception
  208         finally:

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\pool\base.py in __connect(self)
  604             self.starttime = time.time()
--> 605             connection = pool._invoke_creator(self)
  606             pool.logger.debug("Created new connection %r", connection)

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\engine\create.py in connect(connection_record)
  577                         return connection
--> 578             return dialect.connect(*cargs, **cparams)
  579 

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\engine\default.py in connect(self, *cargs, **cparams)
  583         # inherits the docstring from interfaces.Dialect.connect
--> 584         return self.dbapi.connect(*cargs, **cparams)
  585 

OperationalError: unable to open database file

The above exception was the direct cause of the following exception:

OperationalError                          Traceback (most recent call last)
C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\optuna\storages\_rdb\storage.py in _create_scoped_session(scoped_session, ignore_integrity_error)
   52     try:
---> 53         yield session
   54         session.commit()

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\optuna\storages\_rdb\storage.py in _update_trial(self, trial_id, state, values, intermediate_values, params, distributions_, user_attrs, system_attrs, datetime_start, datetime_complete)
  667         with _create_scoped_session(self.scoped_session) as session:
--> 668             trial_model = models.TrialModel.find_or_raise_by_id(trial_id, session)
  669             if trial_model.state.is_finished():

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\optuna\storages\_rdb\models.py in find_or_raise_by_id(cls, trial_id, session, for_update)
  240 
--> 241         trial = query.one_or_none()
  242         if trial is None:

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\orm\query.py in one_or_none(self)
 2788         """
-> 2789         return self._iter().one_or_none()
 2790 

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\orm\query.py in _iter(self)
 2846         statement = self._statement_20()
-> 2847         result = self.session.execute(
 2848             statement,

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\orm\session.py in execute(self, statement, params, execution_options, bind_arguments, _parent_execute_state, _add_event, **kw)
 1687         else:
-> 1688             conn = self._connection_for_bind(bind)
 1689         result = conn._execute_20(statement, params or {}, execution_options)

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\orm\session.py in _connection_for_bind(self, engine, execution_options, **kw)
 1528         if self._transaction is not None or self._autobegin():
-> 1529             return self._transaction._connection_for_bind(
 1530                 engine, execution_options

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\orm\session.py in _connection_for_bind(self, bind, execution_options)
  746             else:
--> 747                 conn = bind.connect()
  748                 local_connect = True

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\engine\base.py in connect(self, close_with_result)
 3165 
-> 3166         return self._connection_cls(self, close_with_result=close_with_result)
 3167 

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\engine\base.py in __init__(self, engine, connection, close_with_result, _branch_from, _execution_options, _dispatch, _has_events, _allow_revalidate)
   95                 if connection is not None
---> 96                 else engine.raw_connection()
   97             )

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\engine\base.py in raw_connection(self, _connection)
 3244         """
-> 3245         return self._wrap_pool_connect(self.pool.connect, _connection)
 3246 

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\engine\base.py in _wrap_pool_connect(self, fn, connection)
 3214             if connection is None:
-> 3215                 Connection._handle_dbapi_exception_noconnection(
 3216                     e, dialect, self

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\engine\base.py in _handle_dbapi_exception_noconnection(cls, e, dialect, engine)
 2068         elif should_wrap:
-> 2069             util.raise_(
 2070                 sqlalchemy_exception, with_traceback=exc_info[2], from_=e

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\util\compat.py in raise_(***failed resolving arguments***)
  206         try:
--> 207             raise exception
  208         finally:

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\engine\base.py in _wrap_pool_connect(self, fn, connection)
 3211         try:
-> 3212             return fn()
 3213         except dialect.dbapi.Error as e:

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\pool\base.py in connect(self)
  306         """
--> 307         return _ConnectionFairy._checkout(self)
  308 

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\pool\base.py in _checkout(cls, pool, threadconns, fairy)
  766         if not fairy:
--> 767             fairy = _ConnectionRecord.checkout(pool)
  768 

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\pool\base.py in checkout(cls, pool)
  424     def checkout(cls, pool):
--> 425         rec = pool._do_get()
  426         try:

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\pool\impl.py in _do_get(self)
  255     def _do_get(self):
--> 256         return self._create_connection()
  257 

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\pool\base.py in _create_connection(self)
  252 
--> 253         return _ConnectionRecord(self)
  254 

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\pool\base.py in __init__(self, pool, connect)
  367         if connect:
--> 368             self.__connect()
  369         self.finalize_callback = deque()

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\pool\base.py in __connect(self)
  610             with util.safe_reraise():
--> 611                 pool.logger.debug("Error on connect(): %s", e)
  612         else:

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\util\langhelpers.py in __exit__(self, type_, value, traceback)
   69             if not self.warn_only:
---> 70                 compat.raise_(
   71                     exc_value,

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\util\compat.py in raise_(***failed resolving arguments***)
  206         try:
--> 207             raise exception
  208         finally:

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\pool\base.py in __connect(self)
  604             self.starttime = time.time()
--> 605             connection = pool._invoke_creator(self)
  606             pool.logger.debug("Created new connection %r", connection)

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\engine\create.py in connect(connection_record)
  577                         return connection
--> 578             return dialect.connect(*cargs, **cparams)
  579 

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\sqlalchemy\engine\default.py in connect(self, *cargs, **cparams)
  583         # inherits the docstring from interfaces.Dialect.connect
--> 584         return self.dbapi.connect(*cargs, **cparams)
  585 

OperationalError: (sqlite3.OperationalError) unable to open database file
(Background on this error at: https://sqlalche.me/e/14/e3q8)

The above exception was the direct cause of the following exception:

StorageInternalError                      Traceback (most recent call last)
<ipython-input-5-b47477278440> in <module>
    2 study = optuna.create_study(study_name='Conv_serial', storage='sqlite:///db.sqlite3', direction='minimize',
    3                             pruner=pruner, load_if_exists=True)
----> 4 study.optimize(objective, n_trials=100)

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\optuna\study.py in optimize(self, func, n_trials, timeout, n_jobs, catch, callbacks, gc_after_trial, show_progress_bar)
  399             )
  400 
--> 401         _optimize(
  402             study=self,
  403             func=func,

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\optuna\_optimize.py in _optimize(study, func, n_trials, timeout, n_jobs, catch, callbacks, gc_after_trial, show_progress_bar)
   63     try:
   64         if n_jobs == 1:
---> 65             _optimize_sequential(
   66                 study,
   67                 func,

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\optuna\_optimize.py in _optimize_sequential(study, func, n_trials, timeout, catch, callbacks, gc_after_trial, reseed_sampler_rng, time_start, progress_bar)
  160 
  161         try:
--> 162             trial = _run_trial(study, func, catch)
  163         except Exception:
  164             raise

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\optuna\_optimize.py in _run_trial(study, func, catch)
  241     # `Study.tell` may raise during trial post-processing.
  242     try:
--> 243         study.tell(trial, values=values, state=state)
  244     except Exception:
  245         raise

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\optuna\study.py in tell(self, trial, values, state)
  663                 self._storage.set_trial_values(trial_id, values)
  664 
--> 665             self._storage.set_trial_state(trial_id, state)
  666 
  667     def set_user_attr(self, key: str, value: Any) -> None:

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\optuna\storages\_cached_storage.py in set_trial_state(self, trial_id, state)
  212                     updates.datetime_complete = datetime.datetime.now()
  213                     cached_trial.datetime_complete = datetime.datetime.now()
--> 214                 return self._flush_trial(trial_id)
  215 
  216         ret = self._backend.set_trial_state(trial_id, state)

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\optuna\storages\_cached_storage.py in _flush_trial(self, trial_id)
  425             return True
  426         del study.updates[number]
--> 427         return self._backend._update_trial(
  428             trial_id=trial_id,
  429             values=updates.values,

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\optuna\storages\_rdb\storage.py in _update_trial(self, trial_id, state, values, intermediate_values, params, distributions_, user_attrs, system_attrs, datetime_start, datetime_complete)
  765                 )
  766 
--> 767             session.add(trial_model)
  768 
  769         return True

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\contextlib.py in __exit__(self, type, value, traceback)
  133                 value = type()
  134             try:
--> 135                 self.gen.throw(type, value, traceback)
  136             except StopIteration as exc:
  137                 # Suppress StopIteration *unless* it's the same exception that

C:\ProgramData\Miniconda3\envs\tf-gpu\lib\site-packages\optuna\storages\_rdb\storage.py in _create_scoped_session(scoped_session, ignore_integrity_error)
   69             "e.g. exceeding max length. "
   70         )
---> 71         raise optuna.exceptions.StorageInternalError(message) from e
   72     except Exception:
   73         session.rollback()

StorageInternalError: An exception is raised during the commit. This typically happens due to invalid data in the commit, e.g. exceeding max length. 

Code

from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Dropout, Reshape, Flatten, Conv1D, InputLayer
from tensorflow.keras import regularizers
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.optimizers import Adam
import optuna
from optuna.integration import TFKerasPruningCallback

earlystop = EarlyStopping(patience=1000, mode='min', restore_best_weights=True)

def create_model(trial):
    model = Sequential()
    model.add(InputLayer(input_shape=vars.N))  # vars.N: input length, defined elsewhere by the author
    model.add(Reshape(target_shape=(-1, 1))) #add channel dimension
    for i in range(trial.suggest_int('num_layers', 2, 20, log=True)):
        model.add(Conv1D(filters=trial.suggest_int('units_conv_' + str(i),
                                            low=2,
                                            high=64,
                                            step=2),
                          kernel_size=trial.suggest_int('conv_size_' + str(i),
                                            low=32,
                                            high=256,
                                            step=32),
                          padding="same",
                          activation='relu'))
    model.add(Flatten())
    model.add(Dense(trial.suggest_int('dense_1', low=32, high=512, step=32), activation='relu'))
    model.add(Dense(trial.suggest_int('dense_2', low=32, high=512, step=32), activation='relu'))
    model.add(Dense(vars.Q))  # vars.Q: output dimension, defined elsewhere
    model.compile(
        optimizer=Adam(trial.suggest_categorical('learning_rate', [1e-2, 1e-3, 1e-4])),
        loss='MSE',
        metrics=['mean_absolute_error'])
    
    return model

def objective(trial):
    model = create_model(trial)
    model.fit(X_train, y_train, validation_data=(X_test, y_test), batch_size=1000,
          epochs=5000, callbacks=[earlystop, TFKerasPruningCallback(trial, 'val_loss')], verbose=0)

    score = model.evaluate(X_test, y_test, verbose=0)
    return score[1]

pruner = optuna.pruners.MedianPruner(n_startup_trials=10, n_warmup_steps=1000, interval_steps=100)
study = optuna.create_study(study_name='Conv_serial', storage='sqlite:///db.sqlite3', direction='minimize',
                            pruner=pruner, load_if_exists=True)
study.optimize(objective, n_trials=100)

Data

  • data: link

  • load data:


import pickle

X_train, y_train, X_test, y_test = pickle.load(open("save.p", "rb"))

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 9

Top GitHub Comments

grudloff commented on Aug 3, 2021 (2 reactions)

@nzw0301 No, it isn't possible to randomly generate the data. But it seems the issue had to do with a file-sync client syncing the database file while a study was running, which left the file locked. I deactivated syncing for *.sqlite3 files, and so far it is working fine. I will confirm whether this was definitely the issue once the study finishes. Thanks for your help!
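
If deactivating syncing is not an option, a possible mitigation (a sketch, assuming the failure is a transient lock rather than a corrupted file) is to give SQLite a longer busy timeout via Optuna's RDBStorage; engine_kwargs is forwarded to sqlalchemy.create_engine(), and "timeout" is the sqlite3 DBAPI busy timeout in seconds:

import optuna

# Mitigation sketch, not a confirmed fix: let connections wait up to 60 s
# for a transient lock to clear instead of failing immediately. This mainly
# helps with "database is locked" errors and may not apply if the file
# cannot be opened at all.
storage = optuna.storages.RDBStorage(
    url="sqlite:///db.sqlite3",
    engine_kwargs={"connect_args": {"timeout": 60}},
)

# Pass the storage object in place of the URL string; other arguments as in
# the original snippet.
study = optuna.create_study(study_name="Conv_serial", storage=storage,
                            direction="minimize", load_if_exists=True)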

grudloff commented on Aug 3, 2021 (1 reaction)

Some additional info: I was running optuna-dashboard when this error occurred, and I then erased the study through the dashboard. Afterwards I started the study again and ran into the same problem, but now I can't access the database even with optuna-dashboard; it also raises an OperationalError. I don't want to erase the database, since it holds a previous study that I would like to keep. I tried sqlite3's ".dump" command, but it also fails to open the database. I will try with a brand-new database file to see whether the problem persists.
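
Before recreating the database, one quick check (a diagnostic sketch using only the standard library) is whether the file opens and passes SQLite's own integrity check outside of Optuna:

import sqlite3

# If this opens and prints ('ok',), the file is healthy and the problem is
# locking/access; if even this fails, the file itself is inaccessible or
# corrupted.
con = sqlite3.connect("db.sqlite3")
print(con.execute("PRAGMA integrity_check").fetchone())
con.close()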

