Postgres lock is not released in specific multi-threaded scenarios

See original GitHub issue

Hello, I started to use this library in a project that I am working on and encountered a strange issue. The issue seems to be somewhat similar to another open issue - #115, but they might be different.

So a little bit about my app: it runs on .NET 6 and uses EF Core as its ORM to invoke certain actions against a Postgres DB. It receives messages concurrently on different threads, and for each message the app creates a connection to the DB, begins a transaction, and then tries to acquire a Postgres advisory lock via this library’s API using that DB connection. If the lock is acquired, some business logic runs and then the lock is released. If the lock is not acquired, the thread retries a few times, every few milliseconds, and throws an exception if the lock still cannot be acquired. Everything described here is done fully asynchronously.

The code regarding the lock looks like this (I simplified it):

var dataContext = new DataContext();
var transaction = await dataContext.Database.BeginTransactionAsync(token);
var dbConnection = dataContext.Database.GetDbConnection();

var lockKey = GetLockKey(); // This can be any long
var postgresAdvisoryLockKey = new PostgresAdvisoryLockKey(lockKey);
var postgresDistributedLock = new PostgresDistributedLock(postgresAdvisoryLockKey, dbConnection);

await using (var distributedLockHandle = await postgresDistributedLock.TryAcquireAsync(timeout, token))
{
    // TryAcquireAsync returns null if the lock was not acquired within the timeout
    if (distributedLockHandle != null)
    {
        // Some business logic...

        // Transaction is committed here
    }
}

// Transaction and Data Context are disposed here

It actually works, but only under specific multi-threaded scenarios. I ran some simple, short load tests on my app by sending it messages. The app is configured with a varying number of threads, which receive messages and then try to acquire the same lock simultaneously. Apparently, when there are only 4 threads (or fewer) trying to acquire the lock, it is held and then released correctly. However, if the app runs with more than 4 threads (let's say 8), then after a few dozen seconds one of the threads supposedly releases the lock, and then no other thread can acquire it anymore, as if the lock was never actually released. It happened on every run that I did. I also tried to use the sync dispose function of the lock handle, but it did not change anything.

Now, when I look into the pg_locks table in Postgres using this query: SELECT * FROM pg_locks WHERE locktype = 'advisory'

I can see the following while the threads hang on the lock (8 threads); it stays the same until I kill the app: [image]

It does seem like the lock was not released all of a sudden, and I do not understand why. I added logs around everything, and it seems like the app is working correctly. Then I found the issue that I mentioned at the start, and started to wonder whether there is a bug in the library.

Am I using the library in the wrong way? Could it be that the Dispose/DisposeAsync or TryAcquire functions are throwing exceptions and swallowing them, returning null values, or failing to release the lock? Is there any way to check it? I will be glad to hear your thoughts and answer any questions.

Thanks.

Issue Analytics

  • State: closed
  • Created 10 months ago
  • Comments: 13 (7 by maintainers)

Top GitHub Comments

1 reaction
madelson commented, Nov 12, 2022

Got confirmation that this is Postgres behavior: https://www.postgresql.org/message-id/17686-fb1fa3870138e394%40postgresql.org

Working on a simple fix which is to just re-check whether the lock is acquired after a timeout.

1 reaction
madelson commented, Nov 10, 2022

Glad that this workaround seems effective. Yes, I was referring to looping with retries and sleeps vs. a single wait with a longer timeout. I think in general you’d want the single wait, since then you get better fairness (threads stay in line vs. repeatedly giving up) and less resource usage due to fewer DB round trips.
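
To make the comparison concrete, here is a minimal sketch of the two waiting strategies. It is not the library's code: the `TryAcquire` delegate is a hypothetical stand-in for a call such as `postgresDistributedLock.TryAcquireAsync(timeout, token)` (assumed to return null when the lock is not acquired in time), and the class and method names are made up for illustration.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;

public static class LockWaitSketch
{
    // Hypothetical stand-in for a try-acquire call on a distributed lock.
    // Assumed contract: returns a handle on success, null on timeout.
    public delegate ValueTask<IAsyncDisposable?> TryAcquire(TimeSpan timeout, CancellationToken token);

    // Pattern from the original report: several short attempts with a sleep
    // in between. Each attempt is a separate request, and a waiter gives up
    // its place every time an attempt times out.
    public static async Task<IAsyncDisposable?> AcquireWithRetriesAsync(
        TryAcquire tryAcquire, int attempts, TimeSpan perAttemptTimeout,
        TimeSpan delayBetweenAttempts, CancellationToken token)
    {
        for (var i = 0; i < attempts; i++)
        {
            var handle = await tryAcquire(perAttemptTimeout, token);
            if (handle != null)
            {
                return handle;
            }

            // Sleep before the next attempt, but not after the last one.
            if (i < attempts - 1)
            {
                await Task.Delay(delayBetweenAttempts, token);
            }
        }

        return null;
    }

    // Suggested alternative: a single request that waits for the whole
    // time budget, so the waiter stays queued until it succeeds or times out.
    public static async Task<IAsyncDisposable?> AcquireWithSingleWaitAsync(
        TryAcquire tryAcquire, TimeSpan totalTimeout, CancellationToken token)
    {
        return await tryAcquire(totalTimeout, token);
    }
}
```

With a lock that is never available, the retry version calls the delegate once per attempt, while the single-wait version calls it exactly once. That difference is the "fewer DB round trips" point above.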
