incompatible with XGBoost 1.5.0 due to breaking model format changes
In the XGBoost 1.5.0 release, a breaking change was introduced to the model format, which causes this error when calling convert:
ValueError: invalid literal for int() with base 10: '<some_feature_name>'
The error happens because of this line: https://github.com/microsoft/hummingbird/blob/main/hummingbird/ml/operator_converters/xgb.py#L33
Previously, XGBoost seemed to store each feature as an integer index, but it now stores the feature's string name instead. I tried changing the conversion to use another base, say 36, to get a one-to-one integer mapping for each feature name, but that causes another issue further down in the indexing.
It seems the library may need to build its own index from feature names to column positions.
Dataset used: sklearn.datasets.load_boston
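The indexing idea above could be sketched roughly as follows. This is not Hummingbird's actual fix; the helper names `build_feature_index` and `resolve_feature_id` are hypothetical, and the sketch assumes the ordered list of training column names is available (for example from the booster's `feature_names` attribute) and that each split feature arrives as a string, either a real column name or XGBoost's default `f<N>` form.

```python
import re

# XGBoost's default feature names look like "f0", "f1", ...
_DEFAULT_NAME = re.compile(r"f(\d+)")


def build_feature_index(feature_names):
    """Map each feature name to its column position (hypothetical helper)."""
    index = {}
    for position, name in enumerate(feature_names):
        if name in index:
            raise ValueError(f"duplicate feature name: {name!r}")
        index[name] = position
    return index


def resolve_feature_id(raw_id, feature_index):
    """Turn a split-feature string into an integer column index.

    Real column names are looked up first, so a column that happens to be
    called "f3" resolves to its true position rather than to 3.
    """
    if raw_id in feature_index:
        return feature_index[raw_id]
    match = _DEFAULT_NAME.fullmatch(raw_id)
    if match:
        # Default-style name: the digits are already the column index.
        return int(match.group(1))
    raise KeyError(f"unknown feature: {raw_id!r}")


if __name__ == "__main__":
    # Assumed column order, using a few of the Boston dataset's column names.
    index = build_feature_index(["CRIM", "ZN", "INDUS"])
    print(resolve_feature_id("INDUS", index))
    print(resolve_feature_id("f1", index))
```

A lookup table avoids the problem with the base-36 attempt: `int(name, 36)` does yield an integer for an alphanumeric name, but that integer has no relation to the column's position, so any later indexing into the input array breaks.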

Still, we will try to fix the pandas problem for training, because forcing users to train XGBoost models on numpy arrays is not very convenient.
Sure, here's the repro notebook: https://github.com/meta/notebooks/blob/main/hummingbird_xgboost.ipynb