incompatible with XGBoost 1.5.0 due to breaking model format changes

The XGBoost 1.5.0 release introduces a breaking change to the model format, which causes this error when calling convert:

ValueError: invalid literal for int() with base 10: '<some_feature_name>'

The error happens because of this line: https://github.com/microsoft/hummingbird/blob/main/hummingbird/ml/operator_converters/xgb.py#L33

Previously, XGBoost seemed to store each feature as an integer index, but now it stores the feature's name as a string. I tried changing the conversion to use another base, say 36, to get a 1-1 integer mapping from the feature name, but that causes another issue further down in the indexing. It seems the library may need to build its own index of the features.
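To illustrate the kind of indexing meant here, this is a minimal sketch and not Hummingbird's actual code. It assumes the ordered list of training column names is available (for example from the booster's `feature_names`), and the helpers `build_feature_index` and `resolve_feature` are hypothetical names made up for this example. The three column names are taken from the Boston dataset. It also shows why the base-36 attempt breaks indexing: the parsed value is a large integer unrelated to the column position.

```python
from typing import Dict, Sequence


def build_feature_index(feature_names: Sequence[str]) -> Dict[str, int]:
    """Map each feature name to its column position (hypothetical helper)."""
    return {name: position for position, name in enumerate(feature_names)}


def resolve_feature(feature_id: str, index: Dict[str, int]) -> int:
    """Return the column position for a feature id found in a dumped tree.

    Named features are looked up in the index. Ids such as "f3" or "3"
    are treated as positional (an assumption about the older format).
    """
    if feature_id in index:
        return index[feature_id]
    digits = feature_id[1:] if feature_id.startswith("f") else feature_id
    if digits.isdigit():
        return int(digits)
    raise ValueError(f"unknown feature id: {feature_id!r}")


# A few column names from the Boston dataset, in training order.
index = build_feature_index(["CRIM", "ZN", "INDUS"])

column = resolve_feature("INDUS", index)   # position 2
legacy = resolve_feature("f1", index)      # position 1

# Base 36 parses without error, but the result is far larger than the
# number of columns, so it cannot be used to index into the input.
base36_value = int("CRIM", 36)
```

A lookup like this keeps the rest of the conversion working with integer column positions, whichever naming scheme the dumped model uses.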

Dataset used: sklearn.datasets.load_boston

Issue Analytics

Created: 2 years ago
Comments: 7

Top GitHub Comments

1 reaction

interesaaat commented, Jan 8, 2022

Still, we will try to fix the pandas problem for training, because it is not convenient to force users to train XGBoost models with numpy.

1 reaction

meta commented, Jan 8, 2022

sure, here’s the repro notebook: https://github.com/meta/notebooks/blob/main/hummingbird_xgboost.ipynb
