Slower without `OMP_NUM_THREADS=1` than with `OMP_NUM_THREADS=1`

I tried `with threadpool_limits(1, user_api=None):` on a not-so-simple case, https://gitlab.com/paugier/tsp-pythran (branch threadpoolctl), on Debian.
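For readers unfamiliar with the call, here is a minimal sketch of how the context manager is typically used. It is not taken from the linked repository: `heavy_work` is a hypothetical stand-in for the Pythran-accelerated search, and the `limited` helper with its no-op fallback exists only so the snippet still runs when the third-party `threadpoolctl` package is not installed.

```python
import contextlib

try:
    from threadpoolctl import threadpool_limits
except ImportError:  # threadpoolctl not installed: fall back to a no-op
    threadpool_limits = None


def limited(limits=1):
    """Return a context manager capping all supported thread pools."""
    if threadpool_limits is None:
        return contextlib.nullcontext()
    # user_api=None applies the limit to every supported API (BLAS and OpenMP).
    return threadpool_limits(limits=limits, user_api=None)


def heavy_work(n=100_000):
    # Hypothetical placeholder for the real computation.
    return sum(i * i for i in range(n))


# The limit is active only inside the block and is restored on exit.
with limited(1):
    result = heavy_work()
print(result)
```

The limit only affects libraries that are already loaded when the block is entered, so the extension and NumPy should be imported before it.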

The case uses Pythran (through Transonic, but I don’t see how that could change anything here) to get an extension accelerated with OpenMP. Pythran uses the system OpenBLAS library.

To reproduce (sorry, I don’t use Git):

hg clone https://gitlab.com/paugier/tsp-pythran.git
cd tsp-pythran
hg up threadpoolctl
# compile the extension with openmp
transonic tsp.py -pf "-march=native -DUSE_XSIMD -fopenmp"
# wait to get the extension ready
python run-test-omp.py
OMP_NUM_THREADS=1 python run-test-omp.py

The good news is that threadpoolctl manages to reduce the number of threads used with OpenMP. However, I get something strange that I don’t understand:

I’m not sure it’s an issue, but `python run-test-omp.py` (or `OMP_NUM_THREADS=2 python run-test-omp.py`) runs slower than `OMP_NUM_THREADS=1 python run-test-omp.py`.

I actually get the same behavior if the extension is built without OpenMP, i.e. just with `transonic tsp.py`.

OMP_NUM_THREADS=1 python run-test-omp.py
[{'filename_prefixes': ('libopenblas',),
  'internal_api': 'openblas',
  'module_path': '/home/users/augier3pi/.pyenv/versions/3.7.2/lib/python3.7/site-packages/numpy/.libs/libopenblasp-r0-382c8f3a.3.5.dev.so',
  'n_thread': 1,
  'prefix': 'libopenblas',
  'user_api': 'blas',
  'version': '0.3.5.dev'},
 {'filename_prefixes': ('libiomp', 'libgomp', 'libomp', 'vcomp'),
  'internal_api': 'openmp',
  'module_path': '/usr/lib/x86_64-linux-gnu/libgomp.so.1',
  'n_thread': 1,
  'prefix': 'libgomp',
  'user_api': 'openmp',
  'version': None},
 {'filename_prefixes': ('libopenblas',),
  'internal_api': 'openblas',
  'module_path': '/home/users/augier3pi/.pyenv/versions/3.7.2/lib/python3.7/site-packages/scipy/.libs/libopenblasp-r0-8dca6697.3.0.dev.so',
  'n_thread': 1,
  'prefix': 'libopenblas',
  'user_api': 'blas',
  'version': None},
 {'filename_prefixes': ('libopenblas',),
  'internal_api': 'openblas',
  'module_path': '/usr/lib/libopenblas.so.0',
  'n_thread': 1,
  'prefix': 'libopenblas',
  'user_api': 'blas',
  'version': None}]
start search
run time = 0.43 s
start search
run time = 0.43 s
start search
run time = 0.44 s
start search
run time = 0.46 s


python run-test-omp.py
[{'filename_prefixes': ('libopenblas',),
  'internal_api': 'openblas',
  'module_path': '/home/users/augier3pi/.pyenv/versions/3.7.2/lib/python3.7/site-packages/numpy/.libs/libopenblasp-r0-382c8f3a.3.5.dev.so',
  'n_thread': 4,
  'prefix': 'libopenblas',
  'user_api': 'blas',
  'version': '0.3.5.dev'},
 {'filename_prefixes': ('libiomp', 'libgomp', 'libomp', 'vcomp'),
  'internal_api': 'openmp',
  'module_path': '/usr/lib/x86_64-linux-gnu/libgomp.so.1',
  'n_thread': 4,
  'prefix': 'libgomp',
  'user_api': 'openmp',
  'version': None},
 {'filename_prefixes': ('libopenblas',),
  'internal_api': 'openblas',
  'module_path': '/home/users/augier3pi/.pyenv/versions/3.7.2/lib/python3.7/site-packages/scipy/.libs/libopenblasp-r0-8dca6697.3.0.dev.so',
  'n_thread': 4,
  'prefix': 'libopenblas',
  'user_api': 'blas',
  'version': None},
 {'filename_prefixes': ('libopenblas',),
  'internal_api': 'openblas',
  'module_path': '/usr/lib/libopenblas.so.0',
  'n_thread': 4,
  'prefix': 'libopenblas',
  'user_api': 'blas',
  'version': None}]
start search
run time = 0.57 s
start search
run time = 0.59 s
start search
run time = 0.58 s
start search
run time = 0.58 s
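
The comparison between the two runs above could be scripted roughly as follows. This is a sketch under assumptions, not part of the original report: `time_command` is a hypothetical helper, and it measures the whole process including interpreter start-up, so its numbers will not match the `run time` lines printed by the script itself. The placeholder command `python -c pass` should be replaced by `run-test-omp.py` inside the repository.

```python
import os
import subprocess
import sys
import time


def time_command(cmd, omp_num_threads=None, repeats=3):
    """Return the best wall-clock time (s) of `cmd` over `repeats` runs."""
    env = os.environ.copy()
    if omp_num_threads is None:
        env.pop("OMP_NUM_THREADS", None)  # let the runtime pick its default
    else:
        env["OMP_NUM_THREADS"] = str(omp_num_threads)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(cmd, env=env, check=True, stdout=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


# Placeholder command; use [sys.executable, "run-test-omp.py"] in the repo.
cmd = [sys.executable, "-c", "pass"]
for setting in (None, 1, 2):
    elapsed = time_command(cmd, setting)
    print(f"OMP_NUM_THREADS={setting}: {elapsed:.3f} s")
```

Setting the variable in the child's environment matters because OpenMP and OpenBLAS typically read it once, when the library is loaded.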

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 20 (3 by maintainers)

Top GitHub Comments

1 reaction
tomMoral commented, Mar 28, 2019

But I don’t believe threadpoolctl can do anything in this case.

I tend to agree, but I am not sure. It looks like OpenBLAS relies on `OMP_NUM_THREADS=1` here, but it does not seem to check `omp_get_max_threads` to disable the mapping. I did not investigate enough to see whether it could be set programmatically.
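
As a point of reference for the OpenMP side only, the standard runtime functions `omp_set_num_threads` and `omp_get_max_threads` can be called from Python through `ctypes`. The sketch below assumes the GNU runtime (`libgomp`) is installed and reports when it is not; the helper names are hypothetical. It does not show whether OpenBLAS honours the value, which is the open question in the comment above.

```python
import ctypes
import ctypes.util


def load_openmp_runtime():
    """Try to load GNU libgomp; return None when it is unavailable."""
    name = ctypes.util.find_library("gomp") or "libgomp.so.1"
    try:
        return ctypes.CDLL(name)
    except OSError:
        return None


def set_and_get_max_threads(lib, n):
    """Set the OpenMP thread count, then read back the maximum."""
    lib.omp_set_num_threads.argtypes = [ctypes.c_int]
    lib.omp_set_num_threads.restype = None
    lib.omp_get_max_threads.restype = ctypes.c_int
    lib.omp_set_num_threads(n)
    return lib.omp_get_max_threads()


omp = load_openmp_runtime()
if omp is None:
    print("libgomp not found on this system")
else:
    print("max threads after limiting:", set_and_get_max_threads(omp, 1))
```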

1 reaction
tomMoral commented, Mar 27, 2019

So this seems to be an issue unrelated to this library, no? Note that we just reduced the overhead of the context manager, so the results with and without `threadpool_limits` should now be even closer.

Let us know if you feel like there is still some issue.
