
Update default n_jobs used by XGBoost to not be -1


https://github.com/alteryx/evalml/pull/2410 updates XGBoost by exposing the nthread parameter passed to XGBoost as n_jobs. However, while profiling I noticed that the default value of n_jobs=-1 (use all threads) is slower than using 2 threads. Upon further testing, it seems that past a certain number of threads, performance drops significantly. In my case, performance dropped after 16 threads (probably because I have 8 cores with 2 threads per core).

The XGBoost docs mention that thread contention can significantly slow down the algorithm: https://xgboost.readthedocs.io/en/latest/python/python_api.html#module-xgboost.core

This issue tracks investigating the slowdown and determining whether we should change our default value of -1 to something that generally performs better. It alarms me that in my example, using just two threads cut the runtime of fit in half.
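The kind of comparison described above can be sketched with a small timing harness. This is only an illustration, not the profiling setup used in the issue: time_fit and fastest are hypothetical helper names, and the XGBoost usage in the trailing comment assumes xgboost is installed and that a dataset X, y already exists.

```python
import time
from typing import Callable, Dict, Iterable


def time_fit(
    fit: Callable[[int], None],
    n_jobs_values: Iterable[int],
    repeats: int = 3,
) -> Dict[int, float]:
    """Return the best wall-clock time in seconds of fit(n_jobs) per n_jobs value.

    `fit` is any callable that trains a model with the given thread count.
    Taking the best of several repeats reduces the effect of run-to-run variance.
    """
    results: Dict[int, float] = {}
    for n_jobs in n_jobs_values:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            fit(n_jobs)
            best = min(best, time.perf_counter() - start)
        results[n_jobs] = best
    return results


def fastest(results: Dict[int, float]) -> int:
    """Return the n_jobs value with the smallest measured time."""
    return min(results, key=results.get)


# Hypothetical usage with XGBoost (assumes xgboost is installed and X, y exist):
#
#   from xgboost import XGBClassifier
#
#   def fit(n_jobs):
#       XGBClassifier(n_jobs=n_jobs).fit(X, y)
#
#   timings = time_fit(fit, [1, 2, 4, 8, 16, -1])
#   print(timings, fastest(timings))
```

Results from a harness like this depend heavily on the machine and the dataset size, so the fastest value on one box is not necessarily a good default elsewhere.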

I initially had this issue tracking CatBoost too, but after running a few more tests, I think the CatBoost differences are just due to variance, and not too concerning.

Issue Analytics

Created 2 years ago
Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction

bchen1116 commented, Sep 7, 2021

@freddyaboulton @chukarsten @angela97lin here are the perf test results that I collected with XGBoost n_jobs. My conclusion is to use n_jobs=12 as the default for XGBoost. Let me know your thoughts, and we can close this issue out!
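One way a fixed default like this could be applied is sketched below. This is an assumption for illustration, not evalml's actual implementation: resolve_n_jobs is a hypothetical helper, and capping the default at the machine's CPU count is an added choice that the comment above does not state.

```python
import os
from typing import Optional


def resolve_n_jobs(requested: Optional[int] = None, default_cap: int = 12) -> int:
    """Turn a requested n_jobs into a concrete thread count.

    Assumptions (not from the issue): the default of 12 is capped at the
    number of CPUs reported by the OS, and -1 keeps its usual meaning of
    "use all available CPUs".
    """
    available = os.cpu_count() or 1  # os.cpu_count() can return None
    if requested is None:
        return min(default_cap, available)
    if requested == -1:
        return available
    if requested < 1:
        raise ValueError(f"unsupported n_jobs value: {requested}")
    return min(requested, available)
```

With this sketch, a caller who passes nothing gets at most 12 threads, while an explicit n_jobs=-1 still requests every core.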

0 reactions

bchen1116 commented, Jul 15, 2021

Did some initial perf tests on looking glass, which I put here. I believe we need to address this looking glass issue before we can move forward with this.