[BUG] A lot of threads create and destroy when training cascade rcnn

Describe the bug: Hi there. I found that a lot of threads are created and destroyed when training Cascade R-CNN. Training Faster R-CNN, however, behaves normally.

Reproduce Procedure:

MXNET_ENGINE_TYPE=NaiveEngine gdb python3
r detection_train.py --config config/cascade_rcnn/cascade_r101v1_fpn_1x.py

Which config are you using: config/cascade_rcnn/cascade_r101v1_fpn_1x.py

Which dataset are you using: MSCOCO

Software info: Linux, CUDA 9, Python 3.6.6. MXNet: installed by pip (https://github.com/TuSimple/simpledet/blob/master/doc/INSTALL.md)

How did you set up your MXNet for SimpleDet

GDB prints a large number of thread creation and destruction messages.

Additional context: I set MXNet to NaiveEngine mode in order to stop it from creating extra threads.
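As a rough illustration, the same engine setting can be applied from inside a Python script instead of on the command line, together with a simple way to watch the process's thread count. The helper names `configure_single_threaded_env` and `count_os_threads` are hypothetical and not part of MXNet or SimpleDet. `OMP_NUM_THREADS` is the standard OpenMP variable; that it has any effect on this particular issue is an assumption. The variables should be set before `import mxnet`.

```python
import os
import threading


def configure_single_threaded_env(env=os.environ):
    """Set thread-related variables. Call this before `import mxnet`."""
    # Same setting the issue passes on the command line.
    env["MXNET_ENGINE_TYPE"] = "NaiveEngine"
    # Assumption: caps the OpenMP worker pool in OpenMP-enabled builds.
    env["OMP_NUM_THREADS"] = "1"
    return env


def count_os_threads():
    """Return the number of OS-level threads in the current process."""
    task_dir = "/proc/self/task"  # Linux: one entry per thread
    if os.path.isdir(task_dir):
        return len(os.listdir(task_dir))
    # Fallback for other platforms: counts Python-level threads only.
    return threading.active_count()


if __name__ == "__main__":
    configure_single_threaded_env()
    print("OS threads before training:", count_os_threads())
```

A steady count only shows that the number of live threads is not growing; it does not reveal threads that are created and destroyed between two samples, which is what the GDB output in this report points to.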

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Comments: 7 (3 by maintainers)

Top GitHub Comments

1 reaction
RogerChern commented, Dec 13, 2019

Glad to see the issue solved. I will bump up the prebuilt wheel version shortly.

1 reaction
RogerChern commented, Dec 12, 2019

After setting a breakpoint on pthread_create, I got:

(gdb) bt
#0  __pthread_create_2_1 (newthread=0x7fffffff7578, attr=0x7ffff1434580, start_routine=0x7ffff12222e0, arg=0x7fffffff6bd0) at pthread_create.c:505
#1  0x00007ffff12229a0 in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
#2  0x00007fff3f09ac1f in ?? () from /usr/local/lib/python3.5/dist-packages/mxnet/libmxnet.so
#3  0x00007fff3f0a42a9 in ?? () from /usr/local/lib/python3.5/dist-packages/mxnet/libmxnet.so
#4  0x00007fff3ec4d377 in mxnet::imperative::PushFCompute(std::function<void (nnvm::NodeAttrs const&, mxnet::OpContext const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&)> const&, nnvm::Op const*, nnvm::NodeAttrs const&, mxnet::Context const&, std::vector<mxnet::engine::Var*, std::allocator<mxnet::engine::Var*> > const&, std::vector<mxnet::engine::Var*, std::allocator<mxnet::engine::Var*> > const&, std::vector<mxnet::Resource, std::allocator<mxnet::Resource> > const&, std::vector<mxnet::NDArray*, std::allocator<mxnet::NDArray*> > const&, std::vector<mxnet::NDArray*, std::allocator<mxnet::NDArray*> > const&, std::vector<unsigned int, std::allocator<unsigned int> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&)::{lambda(mxnet::RunContext)#1}::operator()(mxnet::RunContext) const () from /usr/local/lib/python3.5/dist-packages/mxnet/libmxnet.so
#5  0x00007fff3eb8b454 in ?? () from /usr/local/lib/python3.5/dist-packages/mxnet/libmxnet.so
#6  0x00007fff3eb90d5b in ?? () from /usr/local/lib/python3.5/dist-packages/mxnet/libmxnet.so
#7  0x00007fff3eb8df9f in ?? () from /usr/local/lib/python3.5/dist-packages/mxnet/libmxnet.so
#8  0x00007fff3ec4c7dd in mxnet::imperative::PushFCompute(std::function<void (nnvm::NodeAttrs const&, mxnet::OpContext const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&)> const&, nnvm::Op const*, nnvm::NodeAttrs const&, mxnet::Context const&, std::vector<mxnet::engine::Var*, std::allocator<mxnet::engine::Var*> > const&, std::vector<mxnet::engine::Var*, std::allocator<mxnet::engine::Var*> > const&, std::vector<mxnet::Resource, std::allocator<mxnet::Resource> > const&, std::vector<mxnet::NDArray*, std::allocator<mxnet::NDArray*> > const&, std::vector<mxnet::NDArray*, std::allocator<mxnet::NDArray*> > const&, std::vector<unsigned int, std::allocator<unsigned int> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&) ()
   from /usr/local/lib/python3.5/dist-packages/mxnet/libmxnet.so
#9  0x00007fff3ec50ffc in mxnet::Imperative::InvokeOp(mxnet::Context const&, nnvm::NodeAttrs const&, std::vector<mxnet::NDArray*, std::allocator<mxnet::NDArray*> > const&, std::vector<mxnet::NDArray*, std::allocator<mxnet::NDArray*> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&, mxnet::DispatchMode, mxnet::OpStatePtr) () from /usr/local/lib/python3.5/dist-packages/mxnet/libmxnet.so
#10 0x00007fff3ec51deb in mxnet::Imperative::Invoke(mxnet::Context const&, nnvm::NodeAttrs const&, std::vector<mxnet::NDArray*, std::allocator<mxnet::NDArray*> > const&, std::vector<mxnet::NDArray*, std::allocator<mxnet::NDArray*> > const&) () from /usr/local/lib/python3.5/dist-packages/mxnet/libmxnet.so
#11 0x00007fff3eb3efb9 in ?? () from /usr/local/lib/python3.5/dist-packages/mxnet/libmxnet.so
#12 0x00007fff3eb3f5af in MXImperativeInvokeEx () from /usr/local/lib/python3.5/dist-packages/mxnet/libmxnet.so
#13 0x00007ffff6911e20 in ffi_call_unix64 () from /usr/lib/python3.5/lib-dynload/_ctypes.cpython-35m-x86_64-linux-gnu.so
#14 0x00007ffff691188b in ffi_call () from /usr/lib/python3.5/lib-dynload/_ctypes.cpython-35m-x86_64-linux-gnu.so
#15 0x00007ffff690c01a in _ctypes_callproc () from /usr/lib/python3.5/lib-dynload/_ctypes.cpython-35m-x86_64-linux-gnu.so
#16 0x00007ffff68fffcb in ?? () from /usr/lib/python3.5/lib-dynload/_ctypes.cpython-35m-x86_64-linux-gnu.so
#17 0x00000000005c20e7 in PyObject_Call ()
#18 0x000000000053b656 in PyEval_EvalFrameEx ()
#19 0x00000000005401ef in ?? ()
#20 0x000000000053bc93 in PyEval_EvalFrameEx ()
#21 0x0000000000540b0b in PyEval_EvalCodeEx ()
#22 0x00000000004ec3f7 in ?? ()
#23 0x00000000005c20e7 in PyObject_Call ()
#24 0x0000000000538cab in PyEval_EvalFrameEx ()
#25 0x000000000053fc97 in ?? ()
#26 0x000000000053b83f in PyEval_EvalFrameEx ()
#27 0x000000000053b294 in PyEval_EvalFrameEx ()
#28 0x0000000000540b0b in PyEval_EvalCodeEx ()
#29 0x00000000004ec2e3 in ?? ()
#30 0x00000000005c20e7 in PyObject_Call ()
#31 0x00000000004fbfce in ?? ()
#32 0x00000000005c20e7 in PyObject_Call ()
#33 0x0000000000574db6 in ?? ()
#34 0x00000000005c20e7 in PyObject_Call ()
#35 0x000000000053b656 in PyEval_EvalFrameEx ()
#36 0x000000000054124a in PyEval_EvalCodeEx ()
#37 0x00000000004ec2e3 in ?? ()
#38 0x00000000005c20e7 in PyObject_Call ()
#39 0x0000000000534870 in PyEval_CallObjectWithKeywords ()
#40 0x00007ffff69063fd in ?? () from /usr/lib/python3.5/lib-dynload/_ctypes.cpython-35m-x86_64-linux-gnu.so

Nothing very intuitive comes up. I will try with a debug build later today.
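With so many unresolved `??` frames, one way to get a quick overview is to list which shared library each frame comes from. The sketch below is a hypothetical helper (`libraries_by_frame` is not part of GDB or MXNet) that parses GDB `bt` output; it assumes each frame sits on a single line ending in `from <path>`, so frames wrapped across lines, like #8 above, are skipped. The sample input is the first three frames of the backtrace above, with the argument list of frame #0 shortened.

```python
import re

# Matches lines such as: "#1  0x... in ?? () from /usr/lib/.../libgomp.so.1"
FRAME_RE = re.compile(r"^#(\d+)\s+.*?\bfrom\s+(\S+)\s*$")


def libraries_by_frame(backtrace):
    """Return (frame_number, library_file_name) for frames that name a library."""
    frames = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            library = match.group(2).rsplit("/", 1)[-1]
            frames.append((int(match.group(1)), library))
    return frames


SAMPLE = """\
#0  __pthread_create_2_1 (newthread=0x7fffffff7578) at pthread_create.c:505
#1  0x00007ffff12229a0 in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
#2  0x00007fff3f09ac1f in ?? () from /usr/local/lib/python3.5/dist-packages/mxnet/libmxnet.so
"""

if __name__ == "__main__":
    for number, library in libraries_by_frame(SAMPLE):
        print(number, library)
```

On this sample, the immediate caller of `pthread_create` is in `libgomp.so.1`, the GNU OpenMP runtime, and it is called from `libmxnet.so`. That is only an observation from the frames shown and does not by itself identify the operator responsible.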
