
Study does not close all connections to RDB.

See original GitHub issue

Expected behavior

Finished studies do not consume database resources.

Environment

  • Optuna version: 1.3.0
  • Python version: 3.5
  • OS: Ubuntu 18.04
  • MySQL (docker): mysql:8.0.19 (localhost)

Error messages, stack traces, or logs

Traceback (most recent call last):
  File "path/to/venv/lib/python3.5/site-packages/sqlalchemy/engine/base.py", line 2285, in _wrap_pool_connect
    return fn()
  File "path/to/venv/lib/python3.5/site-packages/sqlalchemy/pool/base.py", line 363, in connect
    return _ConnectionFairy._checkout(self)
  File "path/to/venv/lib/python3.5/site-packages/sqlalchemy/pool/base.py", line 773, in _checkout
    fairy = _ConnectionRecord.checkout(pool)
  File "path/to/venv/lib/python3.5/site-packages/sqlalchemy/pool/base.py", line 492, in checkout
    rec = pool._do_get()
  File "path/to/venv/lib/python3.5/site-packages/sqlalchemy/pool/impl.py", line 139, in _do_get
    self._dec_overflow()
  File "path/to/venv/lib/python3.5/site-packages/sqlalchemy/util/langhelpers.py", line 69, in __exit__
    exc_value, with_traceback=exc_tb,
  File "path/to/venv/lib/python3.5/site-packages/sqlalchemy/util/compat.py", line 178, in raise_
    raise exception
  File "path/to/venv/lib/python3.5/site-packages/sqlalchemy/pool/impl.py", line 136, in _do_get
    return self._create_connection()
  File "path/to/venv/lib/python3.5/site-packages/sqlalchemy/pool/base.py", line 308, in _create_connection
    return _ConnectionRecord(self)
  File "path/to/venv/lib/python3.5/site-packages/sqlalchemy/pool/base.py", line 437, in __init__
    self.__connect(first_connect_check=True)
  File "path/to/venv/lib/python3.5/site-packages/sqlalchemy/pool/base.py", line 657, in __connect
    pool.logger.debug("Error on connect(): %s", e)
  File "path/to/venv/lib/python3.5/site-packages/sqlalchemy/util/langhelpers.py", line 69, in __exit__
    exc_value, with_traceback=exc_tb,
  File "path/to/venv/lib/python3.5/site-packages/sqlalchemy/util/compat.py", line 178, in raise_
    raise exception
  File "path/to/venv/lib/python3.5/site-packages/sqlalchemy/pool/base.py", line 652, in __connect
    connection = pool._invoke_creator(self)
  File "path/to/venv/lib/python3.5/site-packages/sqlalchemy/engine/strategies.py", line 114, in connect
    return dialect.connect(*cargs, **cparams)
  File "path/to/venv/lib/python3.5/site-packages/sqlalchemy/engine/default.py", line 488, in connect
    return self.dbapi.connect(*cargs, **cparams)
  File "path/to/venv/lib/python3.5/site-packages/pymysql/__init__.py", line 94, in Connect
    return Connection(*args, **kwargs)
  File "path/to/venv/lib/python3.5/site-packages/pymysql/connections.py", line 325, in __init__
    self.connect()
  File "path/to/venv/lib/python3.5/site-packages/pymysql/connections.py", line 598, in connect
    self._get_server_information()
  File "path/to/venv/lib/python3.5/site-packages/pymysql/connections.py", line 975, in _get_server_information
    packet = self._read_packet()
  File "path/to/venv/lib/python3.5/site-packages/pymysql/connections.py", line 684, in _read_packet
    packet.check_error()
  File "path/to/venv/lib/python3.5/site-packages/pymysql/protocol.py", line 220, in check_error
    err.raise_mysql_exception(self._data)
  File "path/to/venv/lib/python3.5/site-packages/pymysql/err.py", line 109, in raise_mysql_exception
    raise errorclass(errno, errval)
pymysql.err.OperationalError: (1040, 'Too many connections')

Steps to reproduce

  1. Use MySQL backend.
  2. Create multiple studies from a single process.

Reproducible examples (optional)

import argparse
import math
import sys

import sqlalchemy
from sqlalchemy.sql import text

import optuna


N_STUDY = 10000
N_TRIAL = 2


def objective(trial):
    ret = 0.0
    for i in range(10):
        ret += math.sin(
            trial.suggest_float('param-{}'.format(i), 0, math.pi * 2))
    return ret


def run(storage, i):
    study = optuna.create_study(
        storage=storage, study_name="study-{}".format(i), load_if_exists=True)
    study.optimize(objective, n_trials=N_TRIAL, show_progress_bar=False)


def define_flags(parser):
    parser.add_argument('mysql_user', type=str)
    parser.add_argument('mysql_password', type=str)
    parser.add_argument('mysql_host', type=str)
    parser.add_argument('mysql_database', type=str)
    return parser


if __name__ == "__main__":
    parser = define_flags(argparse.ArgumentParser())
    args = parser.parse_args()

    storage = 'mysql+pymysql://{}:{}@{}/{}'.format(
        args.mysql_user,
        args.mysql_password,
        args.mysql_host,
        args.mysql_database,
    )
    engine = sqlalchemy.create_engine(storage)
    conn = engine.connect()
    # mysql specific
    get_connection_cnt = text("show status where `Variable_name` = 'Threads_connected'")

    for t in range(N_STUDY):
        conn_cnt = conn.execute(get_connection_cnt).fetchall()
        sys.stdout.write(
            '\r{:0>5}/{:0>5} studies finished. Current connection: {}'.format(
                t, N_STUDY, conn_cnt
            ))
        sys.stdout.flush()  # flush so the '\r' progress line appears immediately
        run(storage, t)
    print('All studies successfully finished.' + ' ' * 20)

    conn.close()

Additional context (optional)

Calling study._storage.engine.dispose() at the end of the run function prevents the connection count from growing.
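The mechanism behind this workaround can be sketched with plain SQLAlchemy (a minimal example against an in-memory SQLite database rather than MySQL; the pool behavior is the same). Note that `_storage` is a private Optuna attribute, so relying on it may break between versions:

```python
import sqlalchemy
from sqlalchemy.pool import QueuePool

# Engines park returned connections in their pool instead of closing them,
# which is why each finished study keeps holding MySQL connections.
engine = sqlalchemy.create_engine("sqlite://", poolclass=QueuePool)
with engine.connect() as conn:
    conn.execute(sqlalchemy.text("select 1"))

idle_before = engine.pool.checkedin()  # one connection idling in the pool
engine.dispose()  # closes every pooled connection and resets the pool
idle_after = engine.pool.checkedin()   # nothing held anymore
```

Calling `dispose()` after each study mirrors this: the pooled connections are closed outright instead of waiting for garbage collection.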

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 14 (4 by maintainers)

Top GitHub Comments

ottofabian commented, Feb 18, 2022

@wkirgsn One way I could get rid of this problem was to explicitly open a fresh connection for each write/read to the DB. Obviously, this is not ideal if you write to it frequently. You can achieve it by initializing your RDB storage as:

from sqlalchemy.pool import NullPool

storage = optuna.storages.RDBStorage(url, engine_kwargs={"poolclass": NullPool})
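The difference this makes can be sketched with plain SQLAlchemy (again against an in-memory SQLite database for illustration; MySQL and PostgreSQL behave the same at the pool level):

```python
import sqlalchemy
from sqlalchemy.pool import NullPool, QueuePool

# With the default-style QueuePool, a returned connection stays open,
# held for reuse, and keeps counting against the server's limit:
pooled = sqlalchemy.create_engine("sqlite://", poolclass=QueuePool)
with pooled.connect() as conn:
    conn.execute(sqlalchemy.text("select 1"))
pooled_idle = pooled.pool.checkedin()  # 1: connection held for reuse

# With NullPool, every checkin closes the connection outright, so nothing
# is held between operations (at the cost of reconnecting each time):
unpooled = sqlalchemy.create_engine("sqlite://", poolclass=NullPool)
with unpooled.connect() as conn:
    conn.execute(sqlalchemy.text("select 1"))
unpooled_status = unpooled.pool.status()  # NullPool maintains no connections
```

The trade-off is per-operation reconnect overhead, which is why this approach is discouraged above for write-heavy workloads.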
wkirgsn commented, Feb 2, 2022

Still seeing this problem, in my case with PostgreSQL 12.8 (Ubuntu 12.8-1.pgdg18.04+1) and Optuna 2.10.0. study.optimize() keeps a session open for the full duration of the optimization and only ends it when optimize() returns. That congests my pool, where only 100 connections are allowed (and Optuna uses 2+ connections per optimize(), for whatever reason). As a result I can only run around 50 optimize() processes in parallel, which is unnecessarily low. The study only needs a session briefly, to retrieve some info, and doesn't need one while the objective is running, or am I mistaken?
