Categories should be non-negative numbers ERROR

Hello everyone, I have been using Auto-Sklearn for about a month and ran into a problem while starting phase 2 of my research. With a small time_left_for_this_task no problem occurs, but after about 30 minutes of running/searching the space of possibilities, it crashes. I am aware that some issues about this have been opened and closed, but none of them helped me even though they present the same kind of problem; I have tried all of the suggested solutions without success. The following is the error:

raise ValueError('Categories should be non-negative numbers. '
ValueError: Categories should be non-negative numbers. NOTE: floats will be casted to integers.

To check the column types in my dataset, I print the dtype of each column before every call to fit(). All of them are correctly converted to category, float64, or int64, which Auto-Sklearn then treats as “Categorical” and “Numerical”. However, the problem persists, and I do not know where to dig deeper in the framework to find a solution. Thank you very much for any suggestions.
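Since the message complains about negative categories rather than about dtypes, one extra sanity check is to look at the integer codes behind each categorical column. The sketch below is only an illustration and not part of Auto-Sklearn: the helper name and the column names are hypothetical, and it assumes the codes have already been extracted into plain lists. Whether negative codes are what triggered the crash in this report is not established.

```python
from typing import Dict, List, Sequence


def columns_with_negative_codes(codes_by_column: Dict[str, Sequence[int]]) -> List[str]:
    """Return the names of columns that contain at least one negative code."""
    return [
        name
        for name, codes in codes_by_column.items()
        if any(code < 0 for code in codes)
    ]


# Hypothetical data: "col_b" contains -1, a value often used to mark "missing".
example_codes = {
    "col_a": [0, 1, 1, 0],
    "col_b": [2, -1, 0, 1],
}

print(columns_with_negative_codes(example_codes))
```

With pandas, the codes of a category column can be read from `df[column].cat.codes`, where missing values appear as -1, so a column can have the right dtype and still contain a negative code.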

Steps to reproduce the behavior:

Using Auto-Sklearn version 0.12.6.
Using these parameters for the classifier:

time_left_for_this_task 180
per_run_time_limit 1
n_jobs 4
memory_limit 5000
seed 85
resampling_strategy holdout
ensemble_size 50

Using this dataset: https://archive.ics.uci.edu/ml/datasets/Estimation+of+obesity+levels+based+on+eating+habits+and+physical+condition+
Splitting the dataset into 10 folds (randomly or not; I guess it does not matter for the bug).
Looping over the folds: one fold is dropped and kept as the test set for later use (not relevant for the bug), and the remaining 9 folds are used for training.
The loop thus produces 10 classifiers from the Auto-ML pipeline, but one of them (at the end) crashes. Note: it is even weirder that it does not crash at the beginning.
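The steps above could be sketched roughly as follows. The reporter's actual script is not shown, so this is a reconstruction under assumptions: the function names are hypothetical, the fold assignment is one simple way to get 10 folds, and reusing seed 85 for the shuffle is a guess. The settings dictionary mirrors the parameter list above, and auto-sklearn is imported lazily so the splitting part runs on its own.

```python
import random
from typing import Iterator, List, Tuple

# Settings listed in the report, collected as keyword arguments.
CLASSIFIER_SETTINGS = {
    "time_left_for_this_task": 180,
    "per_run_time_limit": 1,
    "n_jobs": 4,
    "memory_limit": 5000,
    "seed": 85,
    "resampling_strategy": "holdout",
    "ensemble_size": 50,
}


def ten_fold_splits(
    n_samples: int, n_folds: int = 10, seed: int = 85
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices), holding out one fold per iteration."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin into n_folds groups.
    folds = [indices[start::n_folds] for start in range(n_folds)]
    for held_out in range(n_folds):
        test_idx = folds[held_out]
        train_idx = [
            index
            for fold_number, fold in enumerate(folds)
            if fold_number != held_out
            for index in fold
        ]
        yield train_idx, test_idx


def build_classifier():
    """Create one classifier per fold (requires auto-sklearn to be installed)."""
    from autosklearn.classification import AutoSklearnClassifier

    return AutoSklearnClassifier(**CLASSIFIER_SETTINGS)


# Show the fold sizes for a small hypothetical dataset of 50 rows.
for fold_number, (train_idx, test_idx) in enumerate(ten_fold_splits(50)):
    print(fold_number, len(train_idx), len(test_idx))
```

In the real loop, each iteration would call `build_classifier()` and fit it on the rows selected by `train_idx`, keeping `test_idx` aside for later evaluation.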

About the stack trace and log file, I do not know exactly where to get them from.
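One generic way to get hold of a full stack trace, independent of any Auto-Sklearn logging, is to wrap the failing call and write the traceback to a file with the standard library. This is a minimal sketch: `run_and_capture`, `failing_step`, and the log file name are hypothetical, and the failing step below only imitates the error from the report.

```python
import os
import tempfile
import traceback
from typing import Callable, Optional


def run_and_capture(step: Callable[[], object], log_path: str) -> Optional[str]:
    """Run step(); on failure, append the traceback to log_path and return it."""
    try:
        step()
    except Exception:
        trace = traceback.format_exc()
        with open(log_path, "a", encoding="utf-8") as handle:
            handle.write(trace)
        return trace
    return None


def failing_step() -> None:
    # Stand-in for the call that crashes, e.g. fitting the classifier on one fold.
    raise ValueError("Categories should be non-negative numbers.")


log_file = os.path.join(tempfile.mkdtemp(), "fit_traceback.log")
captured = run_and_capture(failing_step, log_file)
print(captured)
```

In the reproduction loop, the step would be something like `lambda: automl.fit(X_train, y_train)`, where `automl`, `X_train`, and `y_train` are whatever names the script uses; the resulting file can then be attached to the issue.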

Please give details about your installation:

OS: macOS Big Sur or Ubuntu 18 LTS.
Installed natively on macOS or on an Ubuntu virtual machine.
Python version 3.8.8 for both.
Auto-sklearn version: 0.12.6.

Top GitHub Comments

simonprovost commented, Aug 8, 2021

Everything has been working very well since updating to 0.13.0 as recommended. I tested on a large number of datasets over many hours of runs (50 hours in total) and everything is handled very well. Cheers all!

eddiebergman commented, Aug 8, 2021

@felidsche thank you for another fix for other OSX users!

Seeing as this was addressed in release 0.13.0, I will close this issue, but please feel free to re-use this thread if the problem occurs on any version greater than 0.13.0.
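For anyone landing on this thread, a quick way to tell whether an environment predates the release mentioned above is to compare version strings. The sketch below is an illustration with hypothetical helper names; it assumes plain dotted numeric versions such as 0.12.6 and does not handle suffixes like release candidates.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a plain dotted version such as '0.12.6' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def has_fix(installed: str, fixed_in: str = "0.13.0") -> bool:
    """Return True if the installed version is the fixed release or newer."""
    return parse_version(installed) >= parse_version(fixed_in)


print(has_fix("0.12.6"))  # False
print(has_fix("0.13.0"))  # True
```

The installed version string can be read from `autosklearn.__version__`.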