A possible race condition between dialect initialization and query execution
Describe the bug
Hi,
As the title says, we face a possible race condition between dialect initialization and query execution.
We’re using SQLAlchemy 1.4.8 with the psycopg2 driver (version 2.8.6) and a PostgreSQL 12.4 database.
We have a web server responsible for querying this database. To do so, it uses a session factory (a sessionmaker object) attached to an engine that we create with a pool_size=20 limit.
In one of the queries the server performs, we use the ILIKE operator on a text column and set the backslash (\) as the escape character:
column.ilike(f"%{value}%", escape="\\")
and when we execute this query it raises the following exception:
DataError: (psycopg2.errors.InvalidEscapeSequence) invalid escape string
HINT: Escape string must be empty or one character.
The compiled query that’s attached to the exception contains this condition:
WHERE my_column ILIKE %(value1_1)s ESCAPE '\\'
(Obviously, the escape string is supposed to contain only a single backslash character.)
While debugging this issue with breakpoints and debug prints, we noticed the render_literal_value method of the PGCompiler class, which looks related to our issue.
A direct suspect in this method is the dialect object attached to the compiler, and specifically its _backslash_escapes attribute.
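The suspected mechanism can be modelled with a small stand-in. This is only a sketch: `render_literal` and its `backslash_escapes` parameter are hypothetical names, not SQLAlchemy's API, and the sketch simply assumes that backslashes in a rendered literal are doubled whenever the flag is on.

```python
def render_literal(value: str, backslash_escapes: bool) -> str:
    """Render a Python string as a quoted SQL literal (toy model only)."""
    # Standard SQL quoting: double any embedded single quotes.
    rendered = "'" + value.replace("'", "''") + "'"
    if backslash_escapes:
        # Assumed behaviour: backslashes are doubled when the flag is set.
        rendered = rendered.replace("\\", "\\\\")
    return rendered


escape_char = "\\"  # a single backslash, as passed to ilike(..., escape="\\")
print("ESCAPE", render_literal(escape_char, backslash_escapes=True))
print("ESCAPE", render_literal(escape_char, backslash_escapes=False))
```

With the flag on, the toy renders a two-character escape string, which is the shape PostgreSQL rejects in the error above; with the flag off, it renders the single backslash we intended.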
After digging into it, we noticed that this attribute is set to True by default on the PGDialect class, and that its value is determined again when the PGDialect instance is initialized (in its initialize method).
By adding debug prints to the initialize method of PGDialect and the render_literal_value method of PGCompiler, we observed that the first rendering is sometimes called before the initialization takes place.
In this state, the SQL Compilation Caching (https://docs.sqlalchemy.org/en/14/core/connections.html#sql-compilation-caching) introduced in SQLAlchemy 1.4 keeps the ILIKE condition from ever being rendered differently, so all of these queries fail.
To test our suspicion, we disabled the compilation caching, and from that point on the behavior was as “expected”:
- The first query fails with the same error (and we still see the initialization take place after the first ILIKE rendering).
- All other queries from this moment on succeed.
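The difference between the cached and uncached runs can be reproduced with a toy model. Everything here is hypothetical (`ToyDialect`, `ToyCompiler`, `run`); it is not SQLAlchemy code. It only assumes a flag that defaults to True, is corrected by a late `initialize()` call, and is read at compile time.

```python
class ToyDialect:
    # Class-level default, mirroring the "True by default" observation.
    backslash_escapes = True

    def initialize(self, detected_backslash_escapes: bool) -> None:
        # Assumption: initialization replaces the default with the value
        # detected from the server.
        self.backslash_escapes = detected_backslash_escapes


class ToyCompiler:
    def __init__(self, dialect: ToyDialect, use_cache: bool) -> None:
        self.dialect = dialect
        self.use_cache = use_cache
        self.cache = {}

    def compile(self, column: str, escape: str) -> str:
        key = (column, escape)
        if self.use_cache and key in self.cache:
            return self.cache[key]  # reuse SQL rendered earlier
        if self.dialect.backslash_escapes:
            escape = escape.replace("\\", "\\\\")
        sql = f"WHERE {column} ILIKE %(value)s ESCAPE '{escape}'"
        if self.use_cache:
            self.cache[key] = sql
        return sql


def run(use_cache: bool):
    dialect = ToyDialect()
    compiler = ToyCompiler(dialect, use_cache)
    first = compiler.compile("my_column", "\\")  # rendered before initialize()
    dialect.initialize(False)                    # initialization arrives late
    later = compiler.compile("my_column", "\\")
    return first, later


print(run(use_cache=True))
print(run(use_cache=False))
```

In this model, the cached run keeps returning the statement rendered before initialization, while the uncached run produces the bad statement only once.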
Expected behavior
The query is expected to succeed the first time it is executed.
To Reproduce
We’ve done everything we can to reproduce this error in IPython, but couldn’t (there, the debug prints show that the initialization happens before the first rendering). Here is the session creation mechanism we use:
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

engine = create_engine(
    <connection_string>,
    executemany_mode="values",
    executemany_values_page_size=10000,
    executemany_batch_page_size=500,
    pool_size=20,
    pool_timeout=30,
    max_overflow=0,
    connect_args={"application_name": <some application name>},
)
session_factory = sessionmaker(bind=engine, autoflush=False)
session = session_factory()
Versions.
- OS: CentOS 7
- Python: 3.8.5
- SQLAlchemy: 1.4.8
- Database: PostgreSQL 12.4
We’d really appreciate your attention to this issue. Thanks!
Have a nice day!
Well, pg has a bad symptom here, but none of the dialects should be used to compile until they are initialized.
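As an illustration of that point only, and not a description of the actual SQLAlchemy fix, one way to keep a compiler from reading dialect settings too early is to block readers until initialization has finished. The names `GuardedDialect` and `render_escape` are hypothetical, and the sketch uses only the standard library.

```python
import threading


class GuardedDialect:
    """Toy dialect whose settings can only be read after initialize()."""

    def __init__(self) -> None:
        self._backslash_escapes = True         # default before detection
        self._initialized = threading.Event()

    def initialize(self, detected_backslash_escapes: bool) -> None:
        self._backslash_escapes = detected_backslash_escapes
        self._initialized.set()                # unblock waiting readers

    def backslash_escapes(self, timeout: float = 5.0) -> bool:
        # Wait for initialize() instead of returning the class default.
        if not self._initialized.wait(timeout):
            raise RuntimeError("dialect was not initialized in time")
        return self._backslash_escapes


def render_escape(dialect: GuardedDialect, escape: str) -> str:
    if dialect.backslash_escapes():
        escape = escape.replace("\\", "\\\\")
    return f"ESCAPE '{escape}'"


dialect = GuardedDialect()
results = []
worker = threading.Thread(
    target=lambda: results.append(render_escape(dialect, "\\"))
)
worker.start()             # the "query" thread may start first...
dialect.initialize(False)  # ...but it cannot render until this has run
worker.join()
print(results[0])
```

Whichever thread is scheduled first, the worker in this sketch renders only after `initialize()` has set the detected value, so the default is never baked into the output.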
@zzzeek @CaselIT Thank you so much for the quick response and fix!