Multiindex Lightgbm problem

I’m not sure whether the problem arises on the EvalML side or in LightGBM, but I get an error when passing an X with MultiIndex columns to AutoMLSearch.search():

Batch 1: (4/9) LightGBM Regressor w/ Imputer            Elapsed:00:01
	Starting cross validation
			Fold 0: Encountered an error.
			Fold 0: All scores will be replaced with nan.
			Fold 0: Please check ...\evalml_debug.log for the current hyperparameters and stack trace.
			Fold 0: Exception during automl search: logical_types contains columns that are not present in dataframe: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]
			Fold 1: Encountered an error...

Sorry, I can’t give you a proper example. I still think this is better than nothing.

On my side, I fixed the problem by flattening the MultiIndex column names into plain strings:

X_train.pipe(lambda df: df.set_axis(['_'.join(col).strip() for col in df.columns.values], axis=1))
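The one-liner above assumes every column level is a string, since '_'.join() raises a TypeError on integers. A slightly more defensive variant might look like the sketch below. The helper name flatten_labels and its sep parameter are hypothetical, not part of EvalML or pandas. It works on plain tuples, which is what iterating over MultiIndex columns yields.

```python
def flatten_labels(labels, sep="_"):
    """Join each tuple label into one string; scalar labels are just stringified."""
    flat = []
    for label in labels:
        if isinstance(label, tuple):
            # str() each level so non-string levels (ints, dates) do not break join();
            # empty levels are skipped to avoid trailing separators.
            parts = [str(level) for level in label if str(level) != ""]
            flat.append(sep.join(parts))
        else:
            flat.append(str(label))
    return flat


# Tuples like these are what iterating over MultiIndex columns produces.
example = [("price", "mean"), ("price", "max"), ("volume", 2020), "id"]
flat = flatten_labels(example)
print(flat)
```

On a DataFrame this could be applied as X_train.columns = flatten_labels(X_train.columns) before calling search(). Flattening can produce duplicate names if two tuples join to the same string, so it is probably worth checking for duplicates afterwards.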

In addition, I’m still a novice with the library, and I can’t find a proper way to see the error logs. evalml_debug.log shows the same information as above, without a clear traceback of the error. This is why I don’t know the exact cause of the problem.

Issue Analytics

Created 3 years ago
Comments: 6 (4 by maintainers)

Top GitHub Comments

angela97lin commented, Feb 11, 2021

@grayskripko We recently released a new version of EvalML (0.18.2) which introduces a fix for MultiIndex issues. Please try it out and let us know if this fixes your original issue! 😁

angela97lin commented, Jan 20, 2021

Hi @grayskripko, thank you for filing! RE a proper way to surface the traceback of the error, try:

from evalml.automl import AutoMLSearch
from evalml.automl.callbacks import raise_error_callback
automl = AutoMLSearch(..., error_callback=raise_error_callback)
automl.search()

Having that traceback could be helpful for us to understand what is happening and better assist you 😄
