sqlite3.OperationalError: database is locked


I’m deploying optuna on a single machine through a script that looks like this:

optuna create-study […]

for i in $(seq "$1")
do
    optuna study optimize […] &
done

wait

# (copy trials.db from scratch to shared storage)
[…]

I got some sqlite3.OperationalError: database is locked errors on start (on 7 processes out of 32).

I think that launching all processes at the same time causes them to send requests to the database at the same time, so they collide. I’m lucky that my hyperparameters also control the computation time, so the processes are very unlikely to collide after starting, but I guess this could also be a source of problems in other, massively parallel workflows.

I believe this could be fixed by increasing the timeout of the sqlite3 backend. There should be an option to do that from the command line and the API of optuna.
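To illustrate the mechanism behind this suggestion, here is a minimal sketch using only Python's standard sqlite3 module. It is not optuna code: the function name, table name, and file name are made up for the example. One connection holds the write lock while a second one, opened with a given timeout, tries to start its own write transaction.

```python
import os
import sqlite3
import tempfile


def try_write_while_locked(timeout_seconds: float) -> str:
    """Attempt a write while another connection holds the write lock.

    Returns "ok" on success, or the error message raised by sqlite3.
    (Hypothetical helper, for illustration only.)
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")

        # isolation_level=None means autocommit mode, so the transactions
        # below are controlled explicitly with BEGIN/ROLLBACK.
        holder = sqlite3.connect(path, isolation_level=None)
        holder.execute("CREATE TABLE t (x INTEGER)")
        holder.execute("BEGIN IMMEDIATE")  # take the write lock and keep it
        holder.execute("INSERT INTO t VALUES (1)")

        # `timeout` is how long (in seconds) this connection waits for the
        # lock to be released before giving up.
        waiter = sqlite3.connect(path, timeout=timeout_seconds, isolation_level=None)
        try:
            waiter.execute("BEGIN IMMEDIATE")
            waiter.execute("ROLLBACK")
            return "ok"
        except sqlite3.OperationalError as exc:
            return str(exc)
        finally:
            waiter.close()
            holder.execute("ROLLBACK")
            holder.close()


if __name__ == "__main__":
    # The lock is never released during the wait, so this reports the error.
    print(try_write_while_locked(0.1))
```

In this sketch the lock is held for the whole wait, so the second connection always gives up. In a real workload the first writer would finish at some point, and a longer timeout simply gives the waiting process more time for that to happen.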

Alternatives

I added a sleep in my loop.

for i in $(seq "$1")
do
    optuna study optimize […] &
    sleep 5
done

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Reactions: 6
  • Comments: 16 (3 by maintainers)

Top GitHub Comments

2 reactions
hvy commented, Nov 28, 2021

Let me close this issue, since it hasn’t been observed as frequently as before within use cases suitable for SQLite, that is, with a moderate level of concurrency and without NFS, where file locking is non-trivial (this may be a result of https://github.com/optuna/optuna/pull/1628). If it is still an issue for you, please feel free to reopen or comment. For instance, if the timeout is a common pitfall, we could consider setting a large timeout by default, similar to how we set pool_pre_ping for MySQL RDBs in https://github.com/optuna/optuna/blob/release-v2.10.0/optuna/storages/_rdb/storage.py#L1134-L1149. Please note that you can also configure the RDB connection yourself using RDBStorage(..., engine_kwargs=...) in user-land.

import optuna

# Relax the timeout to circumvent the error. A suitable value depends on the environment and, e.g., trial/process parallelism. (With my local MacBook Pro and a trial parallelism of 64, a timeout of 100 seemed stable.)
# Note that the keys/values of `engine_kwargs` depend on the actual RDB backend.
storage = optuna.storages.RDBStorage(url="sqlite:///mystorage.db", engine_kwargs={"connect_args": {"timeout": 100}})
study = optuna.create_study(storage=storage)

And just for the record, rephrasing the documentation, we suggest actually using MySQL or other backends for distributed optimization if possible.

2 reactions
louisabraham commented, Dec 28, 2019

Note: I was using sqlite3 as backend.

I think this feature can be implemented through the connect_args of sqlalchemy.create_engine.

The parameter to set for sqlite3.connect is timeout (in seconds).

~Another way to implement it would be to have the optuna study optimize command take a n_jobs parameter. Thus, it would handle a multiprocessing.Pool (which would be slightly more efficient than giving the same fixed number of jobs to every process when the computation time varies a lot).~ The advantage is that it would be possible to adapt the timeout linearly: timeout = n_jobs * 5.0. ~However, this is a major design change.~

EDIT: n_jobs is already implemented, I just think it could be improved if it adapted the timeout.

Thus, I suggest just adding a --sqlite-timeout parameter.
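The linear scaling idea in this comment (timeout = n_jobs * 5.0) could be sketched as follows. Both helper names are hypothetical and not part of optuna; the shape of the returned dictionary follows the engine_kwargs={"connect_args": {"timeout": ...}} form used in the RDBStorage example in this thread, and the factor of 5.0 seconds per job is the one proposed in the comment, not a tuned value.

```python
def scaled_sqlite_timeout(n_jobs: int, seconds_per_job: float = 5.0) -> float:
    """Return a SQLite timeout (in seconds) that grows linearly with n_jobs.

    Hypothetical helper implementing timeout = n_jobs * 5.0 from the comment.
    """
    if n_jobs < 1:
        raise ValueError("n_jobs must be at least 1")
    return n_jobs * seconds_per_job


def sqlite_engine_kwargs(n_jobs: int) -> dict:
    """Build an engine_kwargs dictionary carrying the scaled timeout.

    Hypothetical helper; the keys apply to the SQLite backend only.
    """
    return {"connect_args": {"timeout": scaled_sqlite_timeout(n_jobs)}}


if __name__ == "__main__":
    print(sqlite_engine_kwargs(32))
```

The resulting dictionary would be passed as engine_kwargs to optuna.storages.RDBStorage. Whether a linear rule is adequate depends on how long each process actually holds the lock, so treat the factor as a starting point.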
