Multiindex Lightgbm problem
I'm not sure whether the problem arises on the evalml side or in LightGBM, but I have a problem when passing an X with MultiIndex columns to AutoMLSearch.search()
Batch 1: (4/9) LightGBM Regressor w/ Imputer Elapsed:00:01
Starting cross validation
Fold 0: Encountered an error.
Fold 0: All scores will be replaced with nan.
Fold 0: Please check ...\evalml_debug.log for the current hyperparameters and stack trace.
Fold 0: Exception during automl search: logical_types contains columns that are not present in dataframe: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]
Fold 1: Encountered an error...
Sorry, I can’t give you a proper example. I still think this is better than nothing.
On my side, I fixed the problem with
X_train.pipe(lambda df: df.set_axis(['_'.join(col).strip() for col in df.columns.values], axis=1))
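For reference, here is a minimal, self-contained sketch of the same flattening idea. The helper name `flatten_columns` and the toy `X_train` frame are assumptions made for illustration, not part of evalml or of my real data. It also casts each level to `str` first, since `'_'.join(col)` raises a TypeError when a column level is not a string.

```python
import pandas as pd


def flatten_columns(df: pd.DataFrame, sep: str = "_") -> pd.DataFrame:
    """Return a copy of df with MultiIndex columns joined into flat strings.

    Hypothetical helper for illustration; frames without MultiIndex
    columns are returned as an unchanged copy.
    """
    out = df.copy()
    if isinstance(out.columns, pd.MultiIndex):
        # Each column label is a tuple such as ("price", "mean").
        # Cast every level to str so that integer levels also work.
        out.columns = [
            sep.join(str(level) for level in col).strip(sep)
            for col in out.columns.to_flat_index()
        ]
    return out


# Toy data standing in for the real X_train (assumed shape and names).
X_train = pd.DataFrame(
    [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
    columns=pd.MultiIndex.from_tuples(
        [("price", "mean"), ("price", "max"), ("volume", "sum")]
    ),
)

X_flat = flatten_columns(X_train)
print(list(X_flat.columns))  # ['price_mean', 'price_max', 'volume_sum']
```

The flattened frame can then be passed to the search in place of the original X_train.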
In addition, I am still a novice with the library and can't find a proper way to see error logs. evalml_debug.log shows the same information, without a clear traceback of the error, which is why I don't know the exact cause of the problem.
Issue Analytics
- State:
- Created 3 years ago
- Comments:6 (4 by maintainers)
Top GitHub Comments
@grayskripko We recently released a new version of EvalML (0.18.2) which introduces a fix for MultiIndex issues. Please try it out and let us know if this fixes your original issue! 😁
Hi @grayskripko, thank you for filing! RE a proper way to surface the traceback of the error, try:
Having that traceback could be helpful for us to understand what is happening and better assist you 😄