ValueError: Dummy prediction failed with run state StatusType.MEMOUT

Describe the bug

I’m trying to benchmark AutoML algorithms, and the first one I always run is auto-sklearn. The problem is that, depending on which algorithms I import at the top of the Python file, auto-sklearn either runs, fails to run, or runs with lots of warnings. In particular, I don’t understand why, after I call the auto-sklearn function to solve a classification problem, it logs warnings about TensorFlow. That is strange, because auto-sklearn has no dependency on TensorFlow.

To Reproduce

import openml
from sklearn.datasets import fetch_openml  # provides the fetch_openml call used below
from algorithms.auto_sklearn import autoSklearn_class
from algorithms.tpot import tpot_class
from algorithms.auto_keras import autokeras_class
from algorithms.h2o import h2o_class
from algorithms.ludwig import ludwig_class

X, y = fetch_openml(data_id=727, as_frame=True, return_X_y=True, cache=True)
y = y.to_frame()
X[y.columns[0]] = y
df = X


print(autoSklearn_class(df))
print(tpot_class(df))
print(autokeras_class(df))
print(h2o_class(df))
print(ludwig_class(df))

All of the functions run correctly standalone. I have also tried changing the order of the imports, but it made no difference. Each function lives in its own Python file containing the code for that specific algorithm; this is the one for auto-sklearn:

from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
import autosklearn.classification
import pandas as pd

def autoSklearn_class(df):
  # Convert string-like columns to pandas categoricals
  for col in df.columns:
    t = pd.api.types.infer_dtype(df[col])
    if t == "string" or t == 'object':
      df[col] = df[col].astype('category')

  # The last column is the target; all other columns are features
  y = df.iloc[:, -1:]
  X = df.iloc[:, 0:df.shape[1]-1]

  # Hold out 30% of the rows for testing
  X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)
  automl = autosklearn.classification.AutoSklearnClassifier(
        time_left_for_this_task=1*60,  # total budget in seconds
        per_run_time_limit=30,         # budget per model fit in seconds
        n_jobs=-1,                     # use all available cores
  )
  automl.fit(X_train, y_train)
  y_pred = automl.predict(X_test)
  return (accuracy_score(y_test, y_pred))

Actual behavior, stacktrace or logfile

And this is the error log that is printed out:

/home/riccardo/.local/lib/python3.8/site-packages/pyparsing.py:3190: FutureWarning: Possible set intersection at position 3
  self.re = re.compile(self.reString)
2021-04-08 23:23:46.434750: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-04-08 23:23:46.434788: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
/home/riccardo/.local/lib/python3.8/site-packages/typeguard/__init__.py:917: UserWarning: no type annotations present -- not typechecking tensorflow_addons.layers.max_unpooling_2d.MaxUnpooling2D.__init__
  warn('no type annotations present -- not typechecking {}'.format(function_name(func)))
--------------------START--------------------
INFO:root:Starting [get] request for the URL https://www.openml.org/api/v1/xml/data/list/limit/10000/offset/0
INFO:root:1.2244403s taken for [get] request for the URL https://www.openml.org/api/v1/xml/data/list/limit/10000/offset/0
/home/riccardo/.local/lib/python3.8/site-packages/pyparsing.py:3190: FutureWarning: Possible set intersection at position 3
  self.re = re.compile(self.reString)
2021-04-08 23:23:53.128767: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-04-08 23:23:53.128806: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
/home/riccardo/.local/lib/python3.8/site-packages/typeguard/__init__.py:917: UserWarning: no type annotations present -- not typechecking tensorflow_addons.layers.max_unpooling_2d.MaxUnpooling2D.__init__
  warn('no type annotations present -- not typechecking {}'.format(function_name(func)))
2021-04-08 23:23:55.764039: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-04-08 23:23:55.764080: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
/home/riccardo/.local/lib/python3.8/site-packages/typeguard/__init__.py:917: UserWarning: no type annotations present -- not typechecking tensorflow_addons.layers.max_unpooling_2d.MaxUnpooling2D.__init__
  warn('no type annotations present -- not typechecking {}'.format(function_name(func)))
[ERROR] [2021-04-08 23:23:58,132:Client-AutoML(1):b682384b-98b0-11eb-a9e2-b313928a5b4b] Dummy prediction failed with run state StatusType.MEMOUT and additional output: {'error': 'Memout (used more than 3072 MB).', 'configuration_origin': 'DUMMY'}.
Traceback (most recent call last):
  File "start.py", line 143, in <module>
    print(autoSklearn_class(df))
  File "/home/riccardo/Desktop/AutoML-Benchmark/algorithms/auto_sklearn.py", line 30, in autoSklearn_class
    automl.fit(X_train, y_train)
  File "/home/riccardo/.local/lib/python3.8/site-packages/autosklearn/estimators.py", line 592, in fit
    super().fit(
  File "/home/riccardo/.local/lib/python3.8/site-packages/autosklearn/estimators.py", line 357, in fit
    self.automl_.fit(load_models=self.load_models, **kwargs)
  File "/home/riccardo/.local/lib/python3.8/site-packages/autosklearn/automl.py", line 1413, in fit
    return super().fit(
  File "/home/riccardo/.local/lib/python3.8/site-packages/autosklearn/automl.py", line 623, in fit
    self._do_dummy_prediction(datamanager, num_run)
  File "/home/riccardo/.local/lib/python3.8/site-packages/autosklearn/automl.py", line 436, in _do_dummy_prediction
    raise ValueError(
ValueError: Dummy prediction failed with run state StatusType.MEMOUT and additional output: {'error': 'Memout (used more than 3072 MB).', 'configuration_origin': 'DUMMY'}.

After this the process hangs, and pressing CTRL+C produces these logs:

Error in atexit._run_exitfuncs:
Process ForkServerProcess-1:
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 931, in wait
    ready = selector.select(timeout)
Traceback (most recent call last):
  File "/usr/lib/python3.8/selectors.py", line 415, in select
  File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/riccardo/.local/lib/python3.8/site-packages/autosklearn/util/logging_.py", line 295, in start_log_server
    receiver.serve_until_stopped()
  File "/home/riccardo/.local/lib/python3.8/site-packages/autosklearn/util/logging_.py", line 325, in serve_until_stopped
    rd, wr, ex = select.select([self.socket.fileno()],
KeyboardInterrupt
    fd_event_list = self._selector.poll(timeout)
KeyboardInterrupt
^C
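
The TensorFlow warnings appear several times in the log, and the traceback shows a ForkServerProcess. This suggests, though the report does not confirm it, that the script's top-level imports are being executed again in the helper processes auto-sklearn starts, so each one may also load TensorFlow and the other frameworks. One thing that might be worth trying is to import each wrapper module only when its benchmark actually runs. The sketch below is an assumption-laden illustration, not the reporter's code: `RUNNERS`, `load_runner`, and `run_benchmarks` are hypothetical names, and the module paths are copied from the imports in the reproduction script.

```python
import importlib

# Hypothetical registry: benchmark name -> (module path, function name).
# The module paths mirror the imports in the reproduction script above.
RUNNERS = {
    "autosklearn": ("algorithms.auto_sklearn", "autoSklearn_class"),
    "tpot": ("algorithms.tpot", "tpot_class"),
    "autokeras": ("algorithms.auto_keras", "autokeras_class"),
    "h2o": ("algorithms.h2o", "h2o_class"),
    "ludwig": ("algorithms.ludwig", "ludwig_class"),
}


def load_runner(name, registry=RUNNERS):
    """Import the wrapper module only when its benchmark is requested."""
    module_path, func_name = registry[name]
    module = importlib.import_module(module_path)
    return getattr(module, func_name)


def run_benchmarks(df, names, registry=RUNNERS):
    """Run the selected benchmarks one after another and collect the scores."""
    results = {}
    for name in names:
        results[name] = load_runner(name, registry)(df)
    return results
```

With this layout, calling `run_benchmarks(df, ["autosklearn"])` from inside an `if __name__ == "__main__":` block would import only `algorithms.auto_sklearn`, so TensorFlow would not be loaded before auto-sklearn runs. Whether this removes the MEMOUT in this particular setup is untested.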

Environment and installation:

Please give details about your installation:

  • OS: Ubuntu 20.04.2 LTS
  • Is your installation in a virtual environment or conda environment? No
  • Python version: 3.8.5
  • Auto-sklearn version: 0.12.5

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Reactions: 1
  • Comments: 11 (7 by maintainers)

Top GitHub Comments

Rob192 commented, Sep 21, 2021 (6 reactions)

Hello, I ran into the same error myself. With n_jobs=-1, I set memory_limit=None to prevent the error from happening. Best,

NicolasMICAUX commented, Nov 14, 2021 (3 reactions)

Same issue, with a pandas df of 5 rows lol. Solved by memory_limit=None, now works for 1000+ rows.
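
The workaround from both comments can be sketched as follows. `memory_limit` is a documented `AutoSklearnClassifier` parameter given in MB, and the 3072 MB in the error message matches its documented default; passing `None` disables the limit. The helper names `build_automl_kwargs` and `make_classifier` are hypothetical and exist only for this illustration; the other settings are copied from the report.

```python
def build_automl_kwargs(memory_limit=None):
    """Return the settings from the report plus an explicit memory_limit.

    memory_limit is in MB; None means no memory limit is enforced.
    """
    return {
        "time_left_for_this_task": 60,
        "per_run_time_limit": 30,
        "n_jobs": -1,
        "memory_limit": memory_limit,
    }


def make_classifier(memory_limit=None):
    """Build the classifier with the memory limit disabled or raised."""
    # Imported lazily so this sketch can be loaded without auto-sklearn installed.
    import autosklearn.classification

    return autosklearn.classification.AutoSklearnClassifier(
        **build_automl_kwargs(memory_limit)
    )
```

Disabling the limit removes the safeguard against a run exhausting system memory, so a larger explicit value such as `make_classifier(memory_limit=8192)` may be a safer first attempt than `None` on a shared machine.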
