In Boosting Assembler wrapping each estimator into a subroutine causes a performance degradation
I've recalled the real motivation behind not wrapping every individual estimator into its own subroutine: generating many nested function calls leads to a performance degradation in Java. The observed difference reaches 4x for larger models (e.g. XGBoost with 1000 estimators). The basic test I created (sorry about Scala):
@ import com.github.m2cgen.ModelOld
import com.github.m2cgen.ModelOld
@ import com.github.m2cgen.ModelNew
import com.github.m2cgen.ModelNew
@ def nextRandomData(): Array[Double] = (0 until 4).map(_ => Random.nextDouble).toArray
defined function nextRandomData
@ def testScore: Unit = {
val start = System.currentTimeMillis()
(0 until 100000).foreach(_ => <ModelNew|ModelOld>.score(nextRandomData))
println("Runtime: " + (System.currentTimeMillis() - start).toString)
}
Results for ModelOld (runtime in milliseconds):
@ testScore
Runtime: 2973
For ModelNew:
@ testScore
Runtime: 10747
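To make the two layouts being compared more concrete, here is a minimal Python sketch of a toy code generator that emits Java-like source in both shapes: everything inlined into one `score` method, and one subroutine per estimator. This is not m2cgen's actual assembler or output. The function names `generate_inlined` and `generate_wrapped`, the `subroutineN` naming, and the two toy estimator expressions are all assumptions made for illustration.

```python
from typing import List


def generate_inlined(estimators: List[str]) -> str:
    """Emit a single score() method that sums all estimator expressions inline."""
    body = " + ".join(f"({expr})" for expr in estimators)
    return (
        "public static double score(double[] input) {\n"
        f"    return {body};\n"
        "}\n"
    )


def generate_wrapped(estimators: List[str]) -> str:
    """Emit one subroutine per estimator, plus a score() method that calls each."""
    parts = []
    for i, expr in enumerate(estimators):
        parts.append(
            f"public static double subroutine{i}(double[] input) {{\n"
            f"    return {expr};\n"
            "}\n"
        )
    calls = " + ".join(f"subroutine{i}(input)" for i in range(len(estimators)))
    parts.append(
        "public static double score(double[] input) {\n"
        f"    return {calls};\n"
        "}\n"
    )
    return "".join(parts)


# Two hypothetical single-split trees written as Java ternary expressions.
toy_estimators = [
    "input[2] >= 2.45 ? 0.3 : -0.1",
    "input[3] >= 1.75 ? 0.2 : -0.2",
]

inlined_source = generate_inlined(toy_estimators)
wrapped_source = generate_wrapped(toy_estimators)
print(wrapped_source)
```

With 1000 estimators the wrapped layout produces at least 1000 extra method calls per `score` invocation, which is the kind of overhead the timings above are meant to expose.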
The test model was trained on the sklearn.datasets.load_iris() dataset. The classifier was created as follows:
model = XGBClassifier(n_estimators=1000)
In the attached archive I included the following:
- ModelNew.java - Java code generated with the most recent master.
- ModelOld.java - Java code generated with the release 0.5.0 version.
- Models.jar - the jar containing both compiled sources.
- xgboost_model2 - the trained estimator in Pickle format.
CC: @StrikerRUS FYI
Top GitHub Comments
Can we close this?
Yes, absolutely. Thank you!