
Bug - Deadlock when using multithreaded & multiprocessing Environment

See original GitHub issue

Hi,

Python version: 3.6.8
loguru version: 0.4.1
OS: Linux
Dev environment: Terminal

Bug Description: Creating a new process while another thread is logging will cause a deadlock inside the new process whenever it calls the logger. I assume this happens because the handler's _lock is already held at the moment the process is created, so from the new process's perspective _lock stays held forever [see 1st comment in here].

Reproduce code:

import multiprocessing as mp
import sys
import threading
from random import uniform
from time import sleep

from loguru import logger

logger.remove()

logger.add(
    sys.stdout,
    colorize=False,
    enqueue=True,  # This bug also reproduces with enqueue=False
    serialize=False,
    level="DEBUG",
)


def sub_worker():
    # Background thread that logs continuously, so the handler lock is often
    # held at the moment the main loop forks a new process.
    while True:
        logger.trace("This is the inner thread")
        sleep(0.01)


def worker():
    # Runs in the child process; it deadlocks if the process was forked while
    # the handler lock was held.
    sleep(uniform(0, 0.1))
    logger.debug("Printing some logs")


if __name__ == "__main__":
    threading.Thread(target=sub_worker).start()

    while True:
        w = mp.Process(target=worker)
        w.start()
        w.join()

        logger.debug("Loop ok!")

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 13 (6 by maintainers)

Top GitHub Comments

6 reactions
Delgan commented, May 2, 2020

What a journey! There was not one, not two, but three bugs here! 😄

1. Deadlock while mixing threading with multiprocessing (1122e828d3c244b7b834f8446cdfb3c78659e2a3)

This was the easy one. It happens if a process is created while the internal lock of one handler (used for thread-safety) is already acquired. This is only possible if the sink is used by another thread while the child process is created. The child process inherits the locked lock (because fork() copies the entire process state) and hence deadlocks as soon as it tries to log a message. This problem is extensively discussed in the issues related to the commits linked above; it's very instructive.

It can happen even if enqueue=False and is easily reproducible by using a “slow” sink:

import multiprocessing
import sys
import threading
import time

from loguru import logger

def slow_sink(msg):
    # Holds the handler lock for a full second per message, making it very
    # likely that the fork happens while the lock is acquired.
    time.sleep(1)
    sys.stderr.write(msg)

def worker():
    logger.info("Working...")

if __name__ == "__main__":
    logger.remove()
    logger.add(slow_sink, enqueue=False)

    # The thread grabs the handler lock inside slow_sink(); the process forked
    # meanwhile inherits that lock already held and deadlocks in its own call
    # to the logger.
    thread = threading.Thread(target=worker)
    thread.start()

    process = multiprocessing.Process(target=worker)
    process.start()

    thread.join()
    process.join()

The fix for this relies on the os.register_at_fork() function. We simply need to acquire all locks in the main process before a fork occurs, and release them in both main and child processes once the fork is finished.

Contrary to the logging library, loguru does not use re-entrant locks and does not expose an API to acquire them. This eases the process of protecting handlers during a fork and avoids other possible kinds of deadlocks that the standard library faced.
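To give a rough idea of the mechanism, here is a minimal sketch of the approach (the lock registry and helper names are purely illustrative, not loguru's actual internals; os.register_at_fork() requires Python 3.7+ on Unix):

import os
import threading

# Hypothetical registry of the per-handler locks used for thread-safety.
_handler_locks = []

def _acquire_locks():
    for lock in _handler_locks:
        lock.acquire()

def _release_locks():
    for lock in _handler_locks:
        lock.release()

# Acquire every handler lock just before fork() and release it again in both
# the parent and the child, so the child never inherits a lock in the "held"
# state.
os.register_at_fork(
    before=_acquire_locks,
    after_in_parent=_release_locks,
    after_in_child=_release_locks,
)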

2. Deadlock while using enqueue=True with multiprocessing (8c425111390f7c2e8cec3185d18ac9ab111dc0b0)

So, after I fixed the first issue, I thought I could call it a day and move on, but there was still more trouble to come. 🙂

While using enqueue=True, a thread is internally started to continuously consume log messages put in the queue. Unfortunately, threading and multiprocessing don’t interoperate very well. There was another possible deadlock, due to the fact that sys.stderr (and sys.stdout) itself internally uses a lock which is not protected in case of a fork.

Consequently, if a fork occurs while the handler's internal thread is printing a log message, the child process will run into a deadlock as soon as sys.stderr is manipulated. This happens beyond the scope of loguru, as child processes do not use the sink (they simply add log messages to the queue).

import multiprocessing
import sys
from loguru import logger

if __name__ == "__main__":
    logger.remove()
    logger.add(sys.stderr, enqueue=True)

    while True:
        logger.info("No deadlock yet...")
        # If the fork happens while the internal worker thread is writing to
        # sys.stderr, the child inherits the stream's internal lock already
        # held and hangs as soon as the stream is touched (e.g. when it is
        # flushed at exit), even though the child itself never logs.
        process = multiprocessing.Process(target=lambda: None)
        process.start()
        process.join()

The fix is quite straightforward using the new locking mechanism previously introduced: we have to protect the sink with a lock even if enqueue=True, so it’s acquired and released during a fork.
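In other words, the internal thread that drains the queue now writes to the sink while holding a fork-protected lock. A minimal sketch of the idea, again with illustrative names rather than the actual implementation:

import os
import sys
import threading
from multiprocessing import SimpleQueue

# Lock protecting the sink; registered so it can never be held across a fork.
_sink_lock = threading.Lock()
os.register_at_fork(
    before=_sink_lock.acquire,
    after_in_parent=_sink_lock.release,
    after_in_child=_sink_lock.release,
)

queue = SimpleQueue()

def writer():
    # Internal thread consuming queued messages; holding the lock around the
    # write means a fork can never happen in the middle of a sys.stderr call.
    while True:
        message = queue.get()
        if message is None:
            break
        with _sink_lock:
            sys.stderr.write(message)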

3. Deadlock while joining child process with enqueue=True (be597bdc4cc9c38d5ada0a93561eefb6d1f6a7fa)

About this one, I was not able to make a reproducible example, but the multiprocessing documentation is quite explicit about it:

Warning: As mentioned above, if a child process has put items on a queue (and it has not used JoinableQueue.cancel_join_thread), then that process will not terminate until all buffered items have been flushed to the pipe.

This means that if you try joining that process you may get a deadlock unless you are sure that all items which have been put on the queue have been consumed. Similarly, if the child process is non-daemonic then the parent process may hang on exit when it tries to join all its non-daemonic children.

So, it seems possible for a child process to deadlock if it exits while items added to the queue have not been fully handled. I'm not sure this applies to the SimpleQueue used by loguru, but to be safe, I slightly modified the logger.complete() function. This method can now be called outside of asynchronous functions; it will block until all log messages added to the queue (at the time of the call) have been processed.
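For example, a child process that logs through an enqueue=True handler can call logger.complete() before returning, so that joining it cannot hang on unflushed messages. A small sketch of that usage, assuming the default fork start method on Linux:

import multiprocessing
import sys
from loguru import logger

def worker():
    logger.info("Working in the child process...")
    # Blocks until every message this process has put on the queue has been
    # processed, so the join() below cannot deadlock on unflushed items.
    logger.complete()

if __name__ == "__main__":
    logger.remove()
    logger.add(sys.stderr, enqueue=True)

    process = multiprocessing.Process(target=worker)
    process.start()
    process.join()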

0 reactions
Delgan commented, Oct 22, 2022

@Helicopt There have been some recent improvements regarding deadlocks in #712; you can install the master branch of loguru to perhaps get more debugging information.

pip install git+https://github.com/Delgan/loguru.git@master
