AttributeError: 'SuperLearner' object has no attribute 'scores_'
See original GitHub issue
Hi, I have found an issue with SuperLearner.scores_. I fitted a SuperLearner ensemble and wanted to check the CV scores of the base learners by typing pd.DataFrame(ensemble.scores_). However, an error occurs:
AttributeError: 'SuperLearner' object has no attribute 'scores_'
This is weird. 1) I have checked my instantiated and fitted ensemble: there is indeed no scores_ attribute. 2) I’ve never seen this issue before.
(And what really frustrates me is that I spent a long time fitting this ensemble, only to find I can’t see how my base learners behave…)
Anyway, here is my code:
# (Imports inferred from the estimators used below; the metric could equally
#  come from mlens.metrics instead of sklearn.)
import pandas as pd
from sklearn.metrics import mean_absolute_error
from sklearn.ensemble import ExtraTreesRegressor, RandomForestRegressor
from sklearn.linear_model import Ridge, Lasso
from sklearn.neural_network import MLPRegressor
from xgboost import XGBRegressor
from lightgbm import LGBMRegressor
from mlens.ensemble import SuperLearner

ensemble = SuperLearner(scorer=mean_absolute_error, folds=5, random_state=seed, n_jobs=-1, shuffle=True)
ensemble.add([('et01', ExtraTreesRegressor(n_estimators=..., max_depth=..., n_jobs=-1)),
              ('et02', ExtraTreesRegressor(n_estimators=..., max_depth=..., n_jobs=-1)),
              ('et03', ExtraTreesRegressor(n_estimators=..., max_depth=..., n_jobs=-1)),
              ('xgb01', XGBRegressor(n_estimators=..., max_depth=..., learning_rate=..., nthread=20)),
              ('xgb02', XGBRegressor(n_estimators=..., max_depth=..., learning_rate=..., nthread=20)),
              ('xgb03', XGBRegressor(n_estimators=..., learning_rate=..., max_depth=..., gamma=..., nthread=20)),
              ('rf01', RandomForestRegressor(n_estimators=..., n_jobs=-1)),
              ('rf02', RandomForestRegressor(n_estimators=..., n_jobs=-1)),
              ('rf03', RandomForestRegressor(n_estimators=..., n_jobs=-1)),
              ('ridge01', Ridge(alpha=...)),
              ('ridge02', Ridge(alpha=...)),
              ('ridge03', Ridge(alpha=...)),
              ('lasso01', Lasso(alpha=...)),
              ('lasso02', Lasso(alpha=...)),
              ('lasso03', Lasso(alpha=...)),
              ('lgbm01', LGBMRegressor(n_estimators=..., learning_rate=...)),
              ('lgbm02', LGBMRegressor(n_estimators=..., learning_rate=...)),
              ('lgbm03', LGBMRegressor(n_estimators=..., learning_rate=...)),
              ('mlp01', MLPRegressor(hidden_layer_sizes=(...,))),
              ('mlp02', MLPRegressor(hidden_layer_sizes=(...,)))])
ensemble.add_meta(Ridge(alpha=..., fit_intercept=False))
ensemble.fit(X, y)

# This is the line that raises the AttributeError:
print(pd.DataFrame(ensemble.scores_))
So, my questions are:
Q1. Is this a problem with my code or with mlens? I’m using version 0.1.6.
Q2. If it is a problem with my code, what should I change?
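In the meantime, a minimal defensive check around the reporter’s own ensemble object (a sketch only, not a fix for the underlying issue) avoids losing a long fit to the hard AttributeError:

# Guard the lookup: scores_ is only present when the fitted ensemble actually
# recorded CV scores (a scorer was passed and the installed mlens supports it).
if hasattr(ensemble, 'scores_'):
    print(pd.DataFrame(ensemble.scores_))
else:
    print("No scores_ attribute on this fitted SuperLearner; "
          "check the installed mlens version and that a scorer was passed.")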
Issue Analytics
- State: Closed
- Created: 6 years ago
- Reactions: 1
- Comments: 5 (4 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Btw, if you’re prototyping the ensemble architecture, for large enough datasets (say over 10k samples) it won’t make much of a difference whether you use 2-fold or 5-fold CV (if your data is i.i.d.). So you can save yourself some time by using 2-fold CV when experimenting, and push up the fold count for the final ensemble.
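A rough sketch of that workflow (the constructor arguments mirror the reporter’s code; the fold counts are the only change):

# Prototype with cheap 2-fold CV, then refit the final ensemble with 5 folds.
proto_ensemble = SuperLearner(scorer=mean_absolute_error, folds=2,
                              random_state=seed, n_jobs=-1, shuffle=True)
final_ensemble = SuperLearner(scorer=mean_absolute_error, folds=5,
                              random_state=seed, n_jobs=-1, shuffle=True)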
Also, give the BlendEnsemble or Subsemble classes a try. They are much faster, and for large enough datasets they perform similarly to the Super Learner; the Subsemble can actually do better. (Final note: if you’re running the ensemble with n_jobs > 1, setting n_jobs != 1 on the base learners won’t make a difference, except for the meta layer.)
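A rough sketch of trying those alternatives, assuming both classes follow the same add/add_meta pattern as SuperLearner (the partition/fold values and the two base learners here are illustrative placeholders):

from mlens.ensemble import BlendEnsemble, Subsemble

# BlendEnsemble: base learners fit on one split, the meta learner on the holdout.
blend = BlendEnsemble(scorer=mean_absolute_error, random_state=seed, n_jobs=-1)
blend.add([('rf', RandomForestRegressor(n_estimators=..., n_jobs=1)),   # n_jobs=1: the ensemble already parallelises
           ('ridge', Ridge(alpha=...))])
blend.add_meta(Ridge(alpha=..., fit_intercept=False))

# Subsemble: the data is partitioned and a small stack is fit on each partition.
sub = Subsemble(partitions=3, folds=2, scorer=mean_absolute_error,
                random_state=seed, n_jobs=-1)
sub.add([('rf', RandomForestRegressor(n_estimators=..., n_jobs=1)),
         ('ridge', Ridge(alpha=...))])
sub.add_meta(Ridge(alpha=..., fit_intercept=False))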
Closed as of #67.