Crash under specific input example

Describe the bug

I was trying to create a minimal working example for an issue we have on real data (KDDCup). Along the way I found this (different) error raised when producing predictions.

I’m fine with a "won't fix", but I figured I would share it so you can check whether it points to a more serious underlying issue.

To Reproduce

Installed from development branch.

import numpy as np
from autosklearn.experimental.askl2 import AutoSklearn2Classifier

x = np.random.random(size=(150, 4))
y = np.asarray([1]*75 + [2]*74 + [3])

aml = AutoSklearn2Classifier(time_left_for_this_task=60)
aml.fit(x, y)
predictions = aml.predict(x)

The single sample for class 3 seems to be crucial: I tried other configurations, but none of them produced the error.
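A quick way to check whether a label vector has this property is to count the samples per class. This is a minimal sketch using only NumPy; the helper name `rare_classes` and its `min_count` threshold are hypothetical and are not part of auto-sklearn.

```python
import numpy as np


def rare_classes(y, min_count=2):
    """Return {label: count} for every label with fewer than min_count samples.

    Hypothetical helper for illustration only; not an auto-sklearn API.
    """
    labels, counts = np.unique(y, return_counts=True)
    return {
        label.item(): count.item()
        for label, count in zip(labels, counts)
        if count < min_count
    }


# Same label vector as in the reproduction script above.
y = np.asarray([1] * 75 + [2] * 74 + [3])
print(rare_classes(y))
```

A class that appears only once cannot be present in both the training and the validation part of any split, which may be why this configuration behaves differently from the others.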

Expected behavior

Predictions to be produced.

Actual behavior, stacktrace or logfile

(venv) root@486c0ae472af:/bench# python mwe.py
/bench/frameworks/autosklearn/venv/lib/python3.7/site-packages/smac/intensification/parallel_scheduling.py:152: UserWarning: SuccessiveHalving is intended to be used with more than 1 worker but num_workers=1
  num_workers
[WARNING] [2021-07-27 15:07:04,115:Client-EnsembleBuilder] No models better than random - using Dummy loss!Number of models besides current dummy model: 1. Number of dummy models: 1
Traceback (most recent call last):
  File "mwe.py", line 9, in <module>
    predictions = aml.predict(x)
  File "/bench/frameworks/autosklearn/lib/auto-sklearn/autosklearn/estimators.py", line 695, in predict
    return super().predict(X, batch_size=batch_size, n_jobs=n_jobs)
  File "/bench/frameworks/autosklearn/lib/auto-sklearn/autosklearn/estimators.py", line 494, in predict
    return self.automl_.predict(X, batch_size=batch_size, n_jobs=n_jobs)
  File "/bench/frameworks/autosklearn/lib/auto-sklearn/autosklearn/automl.py", line 1703, in predict
    n_jobs=n_jobs)
  File "/bench/frameworks/autosklearn/lib/auto-sklearn/autosklearn/automl.py", line 1230, in predict
    for identifier in self.ensemble_.get_selected_model_identifiers()
  File "/bench/frameworks/autosklearn/venv/lib/python3.7/site-packages/joblib/parallel.py", line 1041, in __call__
    if self.dispatch_one_batch(iterator):
  File "/bench/frameworks/autosklearn/venv/lib/python3.7/site-packages/joblib/parallel.py", line 859, in dispatch_one_batch
    self._dispatch(tasks)
  File "/bench/frameworks/autosklearn/venv/lib/python3.7/site-packages/joblib/parallel.py", line 777, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/bench/frameworks/autosklearn/venv/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 208, in apply_async
    result = ImmediateResult(func)
  File "/bench/frameworks/autosklearn/venv/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 572, in __init__
    self.results = batch()
  File "/bench/frameworks/autosklearn/venv/lib/python3.7/site-packages/joblib/parallel.py", line 263, in __call__
    for func, args, kwargs in self.items]
  File "/bench/frameworks/autosklearn/venv/lib/python3.7/site-packages/joblib/parallel.py", line 263, in <listcomp>
    for func, args, kwargs in self.items]
  File "/bench/frameworks/autosklearn/lib/auto-sklearn/autosklearn/automl.py", line 96, in _model_predict
    prediction = model.predict_proba(X_)
  File "/bench/frameworks/autosklearn/venv/lib/python3.7/site-packages/sklearn/ensemble/_voting.py", line 329, in _predict_proba
    avg = np.average(self._collect_probas(X), axis=0,
  File "/bench/frameworks/autosklearn/venv/lib/python3.7/site-packages/sklearn/ensemble/_voting.py", line 324, in _collect_probas
    return np.asarray([clf.predict_proba(X) for clf in self.estimators_])
ValueError: could not broadcast input array from shape (150,3) into shape (150,)

Environment and installation:

Please give details about your installation:

  • OS: Debian 10 in docker hosted by Windows 10
  • virtual environment
  • Python version: 3.7.11
  • Auto-sklearn version: development (11afae22b8c9a6309d2b6fcf7cfb9a947711cd1e)

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 7 (7 by maintainers)

Top GitHub Comments

eddiebergman commented, Sep 3, 2021

I meant that np.asarray and np.array should behave identically here. As far as I know, np.asarray is just a thin wrapper around np.array that avoids copying the input when it is already a suitable array.

As for how the different shapes come about, I imagine it’s something specific to askl2, but that’s just a gut feeling; I would have to investigate it properly. If the pipeline differs from the normal AutoML class, then I might guess it’s related to the issue fixed in #1218, but there is no point guessing until it has been looked into.
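To illustrate the failure mode in the traceback, here is a minimal sketch with NumPy only. It assumes, without having confirmed it in auto-sklearn, that the sub-estimators return probability arrays with different numbers of class columns, for example because one of them never saw class 3 during training. The array names and the `stack_probas` helper are hypothetical.

```python
import numpy as np

n_samples = 150

# Assumption: one estimator was trained on all three classes, the other on
# only two, so their predict_proba outputs differ in the number of columns.
proba_three_classes = np.full((n_samples, 3), 1 / 3)
proba_two_classes = np.full((n_samples, 2), 1 / 2)


def stack_probas(probas):
    """Stack per-estimator probabilities the way the traceback does."""
    return np.asarray(probas)


# Matching shapes stack into (n_estimators, n_samples, n_classes).
stacked = stack_probas([proba_three_classes, proba_three_classes])
print(stacked.shape)

# Mismatched shapes cannot form a regular array, so NumPy raises ValueError.
# The exact message depends on the NumPy version.
try:
    stack_probas([proba_three_classes, proba_two_classes])
    error = None
except ValueError as exc:
    error = exc
print(type(error).__name__)
```

If this is what happens, the problem is not in np.asarray itself but in the estimators disagreeing on the set of classes.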

mfeurer commented, Sep 14, 2021

Fixed via #1218 and #1245. @eddiebergman could you please open a new issue if you find that scikit-learn warning again?
