
A possible race condition between dialect initialization and query execution


Describe the bug

Hi, as the title says, we are facing a possible race condition between dialect initialization and query execution. We’re using SQLAlchemy 1.4.8 with the psycopg2 driver, version 2.8.6, and a PostgreSQL 12.4 database. We have a web server responsible for querying this database; it does so through a session factory (sessionmaker) bound to an engine that we create with pool_size=20.

One of the queries the server performs uses the ILIKE operator on a text column, with a backslash (\) set as the escape character:

column.ilike(f"%{value}%", escape="\\")

and when we execute this query it raises the following exception:

DataError: (psycopg2.errors.InvalidEscapeSequence) invalid escape string
HINT:  Escape string must be empty or one character.

The compiled query that’s attached to the exception contains this condition:

WHERE my_column ILIKE %(value1_1)s ESCAPE '\\' 

(Obviously, the escape string is supposed to contain only one backslash character.)

While debugging this issue with breakpoints and debug prints, we noticed the render_literal_value method of the PGCompiler class, which looks related to our issue (shown as a screenshot in the original issue).

The direct suspect in this method is the dialect object attached to the compiler, specifically its _backslash_escapes attribute.

After digging into it, we noticed that this attribute is set to True by default on the PGDialect class, and that its value is determined again when the PGDialect instance is initialized, in the initialize method (shown as screenshots in the original issue).
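
The interaction described above can be sketched without SQLAlchemy at all. The classes below are hypothetical stand-ins (SketchDialect and SketchCompiler are not SQLAlchemy names); they only mirror what the report describes: a class-level default of True that initialize later overwrites, and a literal renderer that doubles backslashes while the flag is set. How the real initialize method derives the value is not shown in the report, so here it is simply passed in as a parameter.

```python
class SketchDialect:
    """Hypothetical stand-in for the dialect described in the report."""

    # Class-level default: in effect until initialize() has run.
    _backslash_escapes = True

    def initialize(self, server_uses_backslash_escapes):
        # Assumption: the real value comes from the server on first connect;
        # in this sketch the caller supplies it directly.
        self._backslash_escapes = server_uses_backslash_escapes


class SketchCompiler:
    """Hypothetical stand-in for the compiler's literal rendering."""

    def __init__(self, dialect):
        self.dialect = dialect

    def render_literal_value(self, value):
        # Double every backslash while the dialect flag is set.
        if self.dialect._backslash_escapes:
            value = value.replace("\\", "\\\\")
        return "'" + value + "'"


dialect = SketchDialect()
compiler = SketchCompiler(dialect)

# Rendered before initialization: two backslashes between the quotes.
before_init = compiler.render_literal_value("\\")

dialect.initialize(server_uses_backslash_escapes=False)

# Rendered after initialization: a single backslash between the quotes.
after_init = compiler.render_literal_value("\\")

print("ESCAPE", before_init)
print("ESCAPE", after_init)
```

The first rendering corresponds to the two-character escape string that PostgreSQL rejects in the error shown earlier; the second is the one-character form the query was meant to use.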

By planting debug prints in the initialize method of PGDialect and the render_literal_value method of PGCompiler, we observed that the first rendering sometimes happens before the initialization takes place. In this state, the SQL Compilation Caching (https://docs.sqlalchemy.org/en/14/core/connections.html#sql-compilation-caching) introduced in SQLAlchemy 1.4 keeps the ILIKE operator from being rendered again with the corrected setting, so all subsequent queries fail.

To test our suspicion, we disabled the compilation caching, and from that point the behavior was as “expected”:

  • The first query fails with the same error (and we still see the initialization take place after the first ILIKE rendering).
  • All other queries from this moment on succeed.
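
The two observations above are consistent with a compiled-statement cache that keeps whatever was rendered first. The following is a minimal simulation, not SQLAlchemy's actual cache: render_escape, run_queries and the dict-based cache are hypothetical names, and a query "succeeds" here simply when the rendered escape string is exactly one character long, as the PostgreSQL hint requires.

```python
def render_escape(backslash_escapes):
    """Render the ESCAPE literal for a single backslash."""
    value = "\\"
    if backslash_escapes:
        value = value.replace("\\", "\\\\")
    return value


def run_queries(count, use_cache):
    """Return one boolean per simulated query: True where it succeeds.

    The dialect flag starts at its default (True) and is corrected to False
    only after the first rendering, which is the ordering the report observed.
    """
    backslash_escapes = True  # default before initialization
    cache = {}
    results = []
    for i in range(count):
        if use_cache and "ilike" in cache:
            escape = cache["ilike"]  # stale rendering is reused
        else:
            escape = render_escape(backslash_escapes)
            if use_cache:
                cache["ilike"] = escape
        results.append(len(escape) == 1)  # server accepts one character only
        if i == 0:
            backslash_escapes = False  # initialization finishes late
    return results


with_cache = run_queries(3, use_cache=True)
without_cache = run_queries(3, use_cache=False)
print(with_cache)     # every query fails
print(without_cache)  # only the first query fails
```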

Expected behavior

The query is expected to succeed the first time it’s executed.

To Reproduce

We’ve done everything we can to reproduce this error in IPython, but couldn’t (the debug prints show that the initialization happens before the first rendering). Here is the session creation mechanism we use:

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

engine = create_engine(
    <connection_string>,
    executemany_mode="values",
    executemany_values_page_size=10000,
    executemany_batch_page_size=500,
    pool_size=20,
    pool_timeout=30,
    max_overflow=0,
    connect_args={"application_name": <some application name>},
)
session_factory = sessionmaker(bind=engine, autoflush=False)
session = session_factory()
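
One way to take the race out of the picture, offered here only as an assumption and not as the fix the maintainers applied, is to force the first connection at startup, before any request thread can compile a statement. warm_up is a hypothetical helper; it relies only on engine.connect() being usable as a context manager, which holds for SQLAlchemy's Engine.

```python
def warm_up(engine):
    """Open and close one connection so first-connect setup runs up front.

    Assumption: the dialect is initialized as part of the engine's first
    connection. The report does not show this directly; it only shows that
    initialization can land after the first rendering.
    """
    with engine.connect():
        pass
    return engine
```

With the setup above, this would be called as warm_up(engine) right after create_engine(...) and before the server starts accepting requests.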

Versions.

  • OS: CentOS 7
  • Python: 3.8.5
  • SQLAlchemy: 1.4.8
  • Database: PostgreSQL 12.4

We’d really appreciate your attention to this issue. Thanks!

Have a nice day!

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Reactions: 6
  • Comments:10 (8 by maintainers)

Top GitHub Comments

1 reaction
zzzeek commented, Apr 21, 2021

Well, PG has a bad symptom here, but none of the dialects should be used to compile until they are initialized.

0 reactions
yuvalmarciano commented, Apr 22, 2021

@zzzeek @CaselIT Thank you so much for the quick response and fix!
