Update default n_jobs used by XGBoost to not be -1
https://github.com/alteryx/evalml/pull/2410 updates XGBoost by exposing the nthread parameter passed to XGBoost as n_jobs. However, while profiling I noticed that the default value of n_jobs=-1 (use all threads) is slower than using 2 threads. Upon further testing, it seems that beyond a certain number of threads, performance drops significantly. In my case, performance dropped after 16 threads (probably because I have 8 cores with 2 threads per core).
XGBoost docs mention that thread contention could significantly slow down performance of the algorithm: https://xgboost.readthedocs.io/en/latest/python/python_api.html#module-xgboost.core
This issue tracks investigating this and determining if we should change our default value of -1 to something generally more performant. It alarms me that in my example, having just two threads cut the runtime of fit in half.
I initially had this issue tracking CatBoost too, but after running a few more tests, I think the CatBoost differences are just due to variance, and not too concerning.
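For anyone who wants to reproduce the comparison, here is a minimal sketch of the kind of timing loop described above. It is not the profiling script I actually used: the helper names (time_fit_by_n_jobs, benchmark_xgboost), the synthetic dataset, and the list of n_jobs values are all assumptions chosen for illustration. It assumes xgboost and scikit-learn are installed.

```python
import time
from statistics import median


def time_fit_by_n_jobs(make_estimator, X, y, n_jobs_values, repeats=3):
    """Return {n_jobs: median fit time in seconds} for each candidate value.

    make_estimator is any callable that takes n_jobs and returns an object
    with a fit(X, y) method (hypothetical helper, not part of evalml).
    """
    results = {}
    for n_jobs in n_jobs_values:
        timings = []
        for _ in range(repeats):
            estimator = make_estimator(n_jobs)  # fresh model for every run
            start = time.perf_counter()
            estimator.fit(X, y)
            timings.append(time.perf_counter() - start)
        # The median is less sensitive to a single slow run than the mean.
        results[n_jobs] = median(timings)
    return results


def benchmark_xgboost():
    """Time XGBClassifier.fit on synthetic data (illustrative settings only)."""
    from sklearn.datasets import make_classification
    from xgboost import XGBClassifier

    X, y = make_classification(n_samples=20_000, n_features=50, random_state=0)
    timings = time_fit_by_n_jobs(
        lambda n: XGBClassifier(n_jobs=n, random_state=0),
        X,
        y,
        n_jobs_values=[-1, 1, 2, 4, 8, 16],
    )
    for n_jobs, seconds in timings.items():
        print(f"n_jobs={n_jobs:>3}: {seconds:.2f}s")
```

Calling benchmark_xgboost() prints one line per n_jobs value. Results will depend heavily on the machine and dataset, so treat any single run as a rough indication rather than a recommendation for a default.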
Top GitHub Comments
@freddyaboulton @chukarsten @angela97lin here are the perf test results that I collected with XGBoost n_jobs. The ultimate conclusion that I came up with was to use n_jobs=12 as the default for XGBoost. Let me know what your thoughts are, and we can get to closing this issue out!
Did some initial perf tests on looking glass, which I put here. I believe we need to address this looking glass issue before we can move forward with this.