Properly working function that uses pythonnet imported function stalls as a celery task.

Environment

  • pythonnet 2.4.0
  • Python 3.7.4
  • OS: Void Linux

Details

I am working on a program that needs to use another department’s .NET Core DLLs, and it works swimmingly when executed normally. For example, consider this code:

tasks.py:

import os
import sys
sys.path.append("./dll_lib")
import clr
clr.AddReference("System.IO")
import System.IO

from celery import Celery
app = Celery('Test', broker='amqp://guest:guest@localhost:5672//', backend="rpc://")

@app.task
def test(json_file):
    json_content = System.IO.File.ReadAllText(json_file)
    return 1

start.sh:

#!/bin/bash

CELERY_ALWAYS_EAGER=True
CELERYD_MAX_TASKS_PER_CHILD=1
CELERYD_CONCURRENCY=1
CELERYD_PREFETCH_MULTIPLIER=1
CELERY_RESULT_BACKEND="rpc://"
CELERY_RESULT_PERSISTENT=False

celery -A tasks worker --loglevel=INFO --max-tasks-per-child=1

Called as a plain function, the task body runs fine. As a Celery task, the behavior is harder to pin down: if I submit a task with the path to a small JSON file (<1 MB), it succeeds:

$ ipython -i
Python 3.7.4 (default, Aug 13 2019, 20:35:49)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.19.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import tasks

In [2]: res = tasks.test.delay("/path/small.json")

In [3]: res.ready()
Out[3]: True

But when I submit a task with the path to a bigger JSON file (~7 MB), it stalls indefinitely:

In [4]: res_big = tasks.test.delay("/path/bigger.json")

In [5]: res_big.ready()
Out[5]: False
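
A single `ready()` check cannot distinguish a slow task from a stalled one. As a minimal sketch, polling with a deadline makes the stall explicit. The helper name `wait_until_ready` and its default values are hypothetical and not part of Celery; the only assumption is that the result object has a `ready()` method, as `res_big` does above:

```python
import time


def wait_until_ready(result, timeout=30.0, poll_interval=0.5):
    """Poll result.ready() until it returns True or `timeout` seconds pass.

    Returns True if the result became ready in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if result.ready():
            return True
        time.sleep(poll_interval)
    # One last check so a result that finished during the final sleep counts.
    return result.ready()
```

For example, `wait_until_ready(res_big, timeout=60)` returning `False` would mean the task did not finish within a minute. Celery's own `res_big.get(timeout=60)` is an alternative that raises an exception on timeout instead of returning a flag.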

I have posted the same question to the Celery community, and it is awfully quiet over there. Does anybody have an idea what could be related to this issue? It looks connected to the amount of memory used. The problem is not I/O related: I tested this by reading the JSON file as a string in plain Python and using Newtonsoft.Json.dll to deserialize it. That test behaves the same way with the same files: it works with the small JSON and stalls on the bigger one.

Does anybody have a clue where the culprit may be hiding? Thanks. S

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

3 reactions
sanderboer commented, Jan 26, 2021

This issue can be closed. I should have realised that this kind of unexpected, irrational behavior is a tell-tale sign of a lack of thread safety. I don’t know why I assumed Celery would account for this; I should give it thread-safe functions.

When I move the imports into the threaded function:

import os
import sys
sys.path.append("./dll_lib")

from celery import Celery
app = Celery('Test', broker='amqp://guest:guest@localhost:5672//', backend="rpc://")

@app.task
def test(json_file):
    import clr
    clr.AddReference("System.IO")
    import System.IO
    json_content = System.IO.File.ReadAllText(json_file)
    return 1

It works as expected…
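
If one worker process handles many tasks, the deferred imports can be wrapped so that they run once per process rather than on every call. The sketch below is one way to do that, assuming the same `clr` and `System.IO` usage as in the task above. The helper names `dotnet_file` and `read_json_text` are hypothetical, and whether this variant avoids the stall in a given setup has not been verified in the thread:

```python
import functools


@functools.lru_cache(maxsize=None)
def dotnet_file():
    """Load the CLR on first use in this process and return System.IO.File.

    lru_cache makes the imports below run once per process, so later
    calls in the same worker process reuse the already-loaded runtime.
    """
    import clr  # deferred: pythonnet is imported by the calling process
    clr.AddReference("System.IO")
    import System.IO
    return System.IO.File


def read_json_text(json_file):
    """Read the file through .NET, as the task body in the issue does."""
    return dotnet_file().ReadAllText(json_file)
```

Inside the task, `json_content = read_json_text(json_file)` would replace the three import lines and the `ReadAllText` call. With `--max-tasks-per-child=1` as in start.sh, each child process runs only one task, so the caching only matters if that limit is raised.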

1 reaction
lostmsu commented, Jan 26, 2021

This is weird. It sounds like there’s a problem when somebody initializes CLR from one thread and then calls File.ReadAllText from another. It should work.
