Fitting an XGBoost model results in InternalHashError: unhashable type: 'bytearray'
Summary
My app trains and predicts with an XGBoost model from the xgboost package. I fit the model inside a cached function (decorated with @st.cache) and return it. However, Streamlit now reports that it cannot hash a bytearray object inside one of my builtins.dict objects. This is strange because it worked before I updated Streamlit to 0.59. The full error reads:
If you don’t know where the object of type builtins.dict is coming from, try looking at the hash chain below for an object that you do recognize, then pass that to hash_funcs instead:
Object of type builtins.dict: {'feature_names': ['per_capita_crime_rate_by_town', 'proportion_of_residential_land_zoned_for_lots_over_25,000_sq.ft.', 'proportion_of_non-retail_business_acres_per_town.', 'Charles_River_dummy_variable_(1_if_tract_bounds river;_0_otherwise)', 'nitric_oxides_concentration_(parts_per_10_million)', 'average_number_of_rooms_per_dwelling', 'proportion_of_owner-occupied_units_built_prior_to_1940', 'weighted_distances_to_five_Boston_employment_centres', 'index_of_accessibility_to_radial_highways', 'full-value_property-tax_rate_per_$10,000', 'pupil-teacher_ratio_by_town', '1000(Bk-0.63)^2_where_Bk_is_the_proportion_of_blacks_by_town', '%_lower_status_of_the_population'], 'feature_types': ['float', 'float', 'float', 'int', 'float', 'float', 'float', 'float', 'int', 'int', 'float', 'float', 'float'], 'handle': bytearray(b'\x00\x00\x00?\r\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x.....'), 'booster': 'gbtree', 'best_iteration': 99, 'best_ntree_limit': 100} Object of type xgboost.core.Booster: <xgboost.core.Booster object at 0x1a1eb79690> Object of type builtins.tuple: ('_Booster', <xgboost.core.Booster object at 0x1a1eb79690>) Object of type builtins.dict: {'max_depth': 3, 'learning_rate': 0.1, 'n_estimators': 100, 'verbosity': 1, 'silent': None, 'objective': 'reg:linear', 'booster': 'gbtree', 'gamma': 0, 'min_child_weight': 1, 'max_delta_step': 0, 'subsample': 1, 'colsample_bytree': 1, 'colsample_bylevel': 1, 'colsample_bynode': 1, 'reg_alpha': 0, 'reg_lambda': 1, 'scale_pos_weight': 1, 'base_score': 0.5, 'missing': nan, 'kwargs': {}, '_Booster': <xgboost.core.Booster object at 0x1a1eb79690>, 'seed': None, 'random_state': 0, 'nthread': None, 'n_jobs': 1, 'importance_type': 'gain'} Object of type xgboost.sklearn.XGBRegressor: XGBRegressor(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, gamma=0, importance_type='gain', learning_rate=0.1, max_delta_step=0, max_depth=3, min_child_weight=1, missing=None, n_estimators=100, n_jobs=1, nthread=None, objective='reg:linear', random_state=0, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, seed=None, silent=None, subsample=1, verbosity=1) Object of type builtins.tuple: (array([23.016954 , 31.42364 , 16.173046 , 23.580927 , 17.46015 , 22.1714 , 18.314796 , 14.029961 , 20.737488 , 21.180895 , 20.44529 , 18.690483 , 8.321284 , 21.453217 , 20.421919 , 24.553173 , 19.685305 , 10.205381 , 44.475704 , 15.940252 , 23.858517 , 23.737234 , 13.884621 , 20.765696 , 15.456101 , 16.24305 , 21.799377 , 13.161657 , 19.93968 , 21.674849 , 19.766438 , 23.370852 , 23.209932 , 19.655743 , 15.145709 , 16.75448 , 32.84218 , 20.021385 , 20.638344 , 23.61842 , 17.877428 , 30.510242 , 43.739815 , 20.179007 , 22.488018 , 14.906468 , 16.279074 , 23.69828 , 18.070068 , 26.881145 , 20.835695 , 35.763424 , 16.517195 , 25.812237 , 47.97466 , 21.505997 , 16.060717 , 31.166424 , 21.966013 , 18.112715 , 22.984049 , 34.817833 , 30.661045 , 19.36766 , 25.49301 , 18.369967 , 14.297357 , 23.17898 , 28.338715 , 14.903171 , 21.480898 , 28.35976 , 10.96798 , 21.158417 , 22.444817 , 7.4538136, 20.583168 , 44.752457 , 
11.438277 , 13.292285 , 21.415586 , 11.619457 , 19.344286 , 10.624224 , 19.948153 , 27.053463 , 16.849163 , 23.626413 , 25.075293 , 16.859615 , 21.492283 , 9.272746 , 19.522285 , 19.510963 , 23.251637 , 19.985184 , 37.08692 , 11.051673 , 12.942635 , 10.689882 , 20.378206 , 22.951574 , 13.608057 , 20.544075 , 19.422127 , 12.785345 , 19.35001 , 25.687586 , 20.213074 , 23.25188 , 8.632252 , 13.273444 , 22.108204 , 23.423246 , 32.223713 , 14.496059 , 42.171318 , 15.669648 , 21.15128 , 23.506771 , 19.632486 , 23.542662 , 6.652353 , 20.856237 , 24.006516 , 22.47102 , 21.927805 ], dtype=float32), XGBRegressor(base_score=0.5, booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1, gamma=0, importance_type='gain', learning_rate=0.1, max_delta_step=0, max_depth=3, min_child_weight=1, missing=None, n_estimators=100, n_jobs=1, nthread=None, objective='reg:linear', random_state=0, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, seed=None, silent=None, subsample=1, verbosity=1))
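As the message itself suggests, one way out (a sketch on my part, not a fix confirmed in this issue) is to tell st.cache how to hash the first recognizable type in the chain, here xgboost.core.Booster, via hash_funcs:

import streamlit as st
import xgboost
from xgboost import XGBRegressor

# Hash Booster objects by identity so Streamlit never walks their internals
# and never reaches the unhashable bytearray handle.
@st.cache(hash_funcs={xgboost.core.Booster: id})
def train_and_predict_regression(xtrain, ytrain, xtest):
    """Trains the model on the training data and predicts on the test set."""
    model = XGBRegressor(objective='reg:squarederror')
    model.fit(xtrain, ytrain)
    return model.predict(xtest), model

Hashing by id is the simplest option here because the model only appears in the return value; if a Booster were also passed in as an argument, a content-based surrogate would make a better cache key.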
Steps to reproduce
Here is a sample code snippet to run and reproduce the error:
import streamlit as st
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
from xgboost import XGBRegressor

@st.cache(suppress_st_warning=True)
def train_and_predict_regression(xtrain, ytrain, xtest):
    """Trains the model on the training data and predicts on the test set."""
    try:
        model = XGBRegressor(objective='reg:squarederror')
        model.fit(xtrain, ytrain)
        ypred = model.predict(xtest)
        return ypred, model
    except ValueError as er:
        st.error(er)

x, y = load_boston(return_X_y=True)
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.3, random_state=42)
ypred, model = train_and_predict_regression(xtrain, ytrain, xtest)
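For completeness, another workaround (again my assumption, not something stated in this report) is to skip hashing of the return value entirely with allow_output_mutation=True:

import streamlit as st
from xgboost import XGBRegressor

# allow_output_mutation=True tells st.cache not to hash the returned
# (predictions, model) tuple, so the Booster's bytearray handle is never touched.
@st.cache(suppress_st_warning=True, allow_output_mutation=True)
def train_and_predict_regression(xtrain, ytrain, xtest):
    model = XGBRegressor(objective='reg:squarederror')
    model.fit(xtrain, ytrain)
    return model.predict(xtest), model

The trade-off is that Streamlit no longer warns if the cached (predictions, model) tuple is mutated between runs.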
Expected behavior:
To be able to fit the model inside the cached function and get back the predictions and the fitted model without a hashing error.
Actual behavior:
The app raises InternalHashError: unhashable type: 'bytearray' as soon as the cached function returns the fitted model (full hash chain shown above).
Is this a regression?
Yes. This worked as expected before I updated to Streamlit 0.59.
Debug info
- Streamlit version: 0.59
- Python version:
- Using Conda? PipEnv? PyEnv? Pex?
- OS version:
- Browser version:
Top GitHub Comments
We’re failing to hash bytearray.
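For context (an addition of mine, not part of the original comment): bytearray is a mutable built-in, so Python refuses to hash it at all, which is what Streamlit's hasher ultimately trips over when it recurses into the Booster's handle:

# bytearray is mutable and therefore unhashable by design in Python.
hash(bytearray(b'\x00'))  # TypeError: unhashable type: 'bytearray'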
Awesome thank you!