Assertion errors from promise when using graphene-django.
I’m looking for advice as I haven’t succeeded in creating a small test case to demonstrate what I’m seeing.
We have a large Django 1.9/Python 2.7 application that we recently added graphene to. Graphene generally works well. However, in development, a front-end engineer wrote some code that sent 8 separate GraphQL requests simultaneously to Django’s development runserver (threading is on).
Sometimes this works, but sometimes we see one of two problems:
- A request is started in graphene/graphql/promise but never returns. We think we may have seen a dead request or two finish when we terminated Django runserver with Ctrl-C, but that may have been our imagination.
- Occasionally we see this exception:
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/graphql/execution/executor.py", line 337, in complete_value_catching
    return completed.catch(handle_error)
  File "/usr/local/lib/python2.7/dist-packages/promise/promise.py", line 476, in catch
    return self.then(None, on_rejection)
  File "/usr/local/lib/python2.7/dist-packages/promise/promise.py", line 537, in then
    return self._then(did_fulfill, did_reject)
  File "/usr/local/lib/python2.7/dist-packages/promise/promise.py", line 484, in _then
    target._add_callbacks(did_fulfill, did_reject, promise)
  File "/usr/local/lib/python2.7/dist-packages/promise/promise.py", line 334, in _add_callbacks
    assert not self._rejection_handler0
AssertionError
We never see this if we turn off threading.
How can we proceed with finding out what is going on? Any advice on narrowing this down? So far I have not been able to produce a small test which reproduces this.
This is a cross post from my original here: https://github.com/graphql-python/graphene-django/issues/421
Please feel free to delete this issue if my cross post isn’t appropriate.
Top GitHub Comments
Dear @darrint, I’ve spent a whole day & night digging in the code of graphene-django, graphql-core, and promise, to understand why the multithreaded execution strategy on my GraphQL Websocket server eventually leads to hanging of all the worker threads! Now I am happy to find out I am not the only one who faced such an issue!
My conclusion is that the promise library (at least version 2.1) is indeed not thread-safe! In particular, I see that something goes wrong in the global instance async_instance (file promise.py) of the class Async. I am not 100% sure I understand the exact reason correctly, but I see some suspicious pieces of code inside. Consider two threads with the same tracebacks in the following snippet from promise.py:
It looks like THREAD1 will not invoke self.schedule.call(self.drain_queues) even though it has added a message to self.normal_queue in the _async_invoke invocation. So it looks like we end up with a hanging message/event as a result. I suppose that the check and assignment of self.is_tick_used should be done atomically to avoid this.
I see there are modifications to the related files in the master branch, but my test hangs anyway, so I conclude the issue is still there.
Anyway, the @melancholy workaround works well, thank you very much for it!
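The point above about checking and assigning is_tick_used atomically can be illustrated with a small toy model. This is only a sketch under stated assumptions: ToyAsync, invoke, and the _lock field are hypothetical names chosen to mirror the attributes mentioned in the comment. It is not the promise library's code and not a patch for it.

```python
import threading
from collections import deque


class ToyAsync(object):
    """Toy stand-in for a trampoline-style scheduler (hypothetical, not promise's code)."""

    def __init__(self):
        self.is_tick_used = False
        self.normal_queue = deque()
        self._lock = threading.Lock()  # hypothetical addition guarding flag + queue

    def invoke(self, fn):
        # Enqueue and decide whether to start a drain inside ONE critical
        # section, so no thread can observe a stale is_tick_used value.
        with self._lock:
            self.normal_queue.append(fn)
            need_drain = not self.is_tick_used
            if need_drain:
                self.is_tick_used = True
        if need_drain:
            self.drain_queues()

    def drain_queues(self):
        while True:
            with self._lock:
                if not self.normal_queue:
                    # Reset the flag while still holding the lock: a concurrent
                    # invoke() either had its item drained already or will see
                    # the flag cleared and start a new drain itself.
                    self.is_tick_used = False
                    return
                fn = self.normal_queue.popleft()
            fn()  # run the callback outside the lock to allow re-entrant invoke()
```

With the lock removed, one thread can append to normal_queue just after another thread found the queue empty but before it cleared is_tick_used, which leaves the new item with nobody scheduled to drain it. That matches the kind of hang described above, although whether it is the exact failure in the library is not confirmed here.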
Upon further reflection and investigation, this library appears not to be thread-safe when the trampoline is enabled. It appends to and drains a single queue, gated only by checks of other variables on a global object, without locking any of those sections.
It seems that either monkey patching, as I mentioned above, or using the following will work. I haven’t investigated how this affects performance.
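As a generic illustration of one way to avoid sharing a single queue between threads (not necessarily the workaround referred to above), the scheduler state can be made thread-local so that each thread appends to and drains only its own queue. The class name ThreadLocalToyAsync and its methods are hypothetical and only mirror the attribute names discussed in this thread.

```python
import threading
from collections import deque


class ThreadLocalToyAsync(threading.local):
    """Hypothetical per-thread scheduler state (not the promise library's code)."""

    def __init__(self):
        # threading.local runs __init__ once per thread on first access,
        # so every thread gets its own flag and its own queue.
        self.is_tick_used = False
        self.normal_queue = deque()

    def invoke(self, fn):
        self.normal_queue.append(fn)
        if not self.is_tick_used:
            self.is_tick_used = True
            self.drain_queues()

    def drain_queues(self):
        try:
            # Callbacks may call invoke() again; those items are picked up
            # by this same loop because the flag is still set.
            while self.normal_queue:
                self.normal_queue.popleft()()
        finally:
            self.is_tick_used = False
```

No lock is needed in this sketch because no state is shared. The trade-off is that work queued on one thread is never drained by another.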