ValueError: Dummy prediction failed with run state StatusType.MEMOUT
Describe the bug
I'm benchmarking AutoML algorithms, and the first one I always run is auto-sklearn. The problem is that, depending on which algorithms I import at the top of the Python file, auto-sklearn either runs, fails to run, or runs with lots of warnings. In particular, I don't understand why, after calling the auto-sklearn function to solve a classification problem, it logs warnings about TensorFlow; this is strange because auto-sklearn has no dependency on TensorFlow.
To Reproduce
import openml
from sklearn.datasets import fetch_openml  # fetch_openml is called below
from algorithms.auto_sklearn import autoSklearn_class
from algorithms.tpot import tpot_class
from algorithms.auto_keras import autokeras_class
from algorithms.h2o import h2o_class
from algorithms.ludwig import ludwig_class
X, y = fetch_openml(data_id=727, as_frame=True, return_X_y=True, cache=True)
y = y.to_frame()
X[y.columns[0]] = y
df = X
print(autoSklearn_class(df))
print(tpot_class(df))
print(autokeras_class(df))
print(h2o_class(df))
print(ludwig_class(df))
All of the functions run correctly on their own. I have also tried changing the order of the imports, but it made no difference. Each of them lives in a separate Python file containing the code for that specific algorithm; this is the one for auto-sklearn:
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
import autosklearn.classification
import pandas as pd

def autoSklearn_class(df):
    # Cast string/object columns to pandas categoricals so auto-sklearn
    # treats them as categorical features.
    for col in df.columns:
        t = pd.api.types.infer_dtype(df[col])
        if t == "string" or t == "object":
            df[col] = df[col].astype("category")
    # The last column is the target; everything else is the feature matrix.
    y = df.iloc[:, -1:]
    X = df.iloc[:, 0:df.shape[1] - 1]
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)
    automl = autosklearn.classification.AutoSklearnClassifier(
        time_left_for_this_task=1 * 60,
        per_run_time_limit=30,
        n_jobs=-1,
    )
    automl.fit(X_train, y_train)
    y_pred = automl.predict(X_test)
    return accuracy_score(y_test, y_pred)
Actual behavior, stacktrace or logfile
This is the error log that is printed out:
/home/riccardo/.local/lib/python3.8/site-packages/pyparsing.py:3190: FutureWarning: Possible set intersection at position 3
self.re = re.compile(self.reString)
2021-04-08 23:23:46.434750: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-04-08 23:23:46.434788: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
/home/riccardo/.local/lib/python3.8/site-packages/typeguard/__init__.py:917: UserWarning: no type annotations present -- not typechecking tensorflow_addons.layers.max_unpooling_2d.MaxUnpooling2D.__init__
warn('no type annotations present -- not typechecking {}'.format(function_name(func)))
--------------------START--------------------
INFO:root:Starting [get] request for the URL https://www.openml.org/api/v1/xml/data/list/limit/10000/offset/0
INFO:root:1.2244403s taken for [get] request for the URL https://www.openml.org/api/v1/xml/data/list/limit/10000/offset/0
/home/riccardo/.local/lib/python3.8/site-packages/pyparsing.py:3190: FutureWarning: Possible set intersection at position 3
self.re = re.compile(self.reString)
2021-04-08 23:23:53.128767: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-04-08 23:23:53.128806: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
/home/riccardo/.local/lib/python3.8/site-packages/typeguard/__init__.py:917: UserWarning: no type annotations present -- not typechecking tensorflow_addons.layers.max_unpooling_2d.MaxUnpooling2D.__init__
warn('no type annotations present -- not typechecking {}'.format(function_name(func)))
2021-04-08 23:23:55.764039: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-04-08 23:23:55.764080: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
/home/riccardo/.local/lib/python3.8/site-packages/typeguard/__init__.py:917: UserWarning: no type annotations present -- not typechecking tensorflow_addons.layers.max_unpooling_2d.MaxUnpooling2D.__init__
warn('no type annotations present -- not typechecking {}'.format(function_name(func)))
[ERROR] [2021-04-08 23:23:58,132:Client-AutoML(1):b682384b-98b0-11eb-a9e2-b313928a5b4b] Dummy prediction failed with run state StatusType.MEMOUT and additional output: {'error': 'Memout (used more than 3072 MB).', 'configuration_origin': 'DUMMY'}.
Traceback (most recent call last):
File "start.py", line 143, in <module>
print(autoSklearn_class(df))
File "/home/riccardo/Desktop/AutoML-Benchmark/algorithms/auto_sklearn.py", line 30, in autoSklearn_class
automl.fit(X_train, y_train)
File "/home/riccardo/.local/lib/python3.8/site-packages/autosklearn/estimators.py", line 592, in fit
super().fit(
File "/home/riccardo/.local/lib/python3.8/site-packages/autosklearn/estimators.py", line 357, in fit
self.automl_.fit(load_models=self.load_models, **kwargs)
File "/home/riccardo/.local/lib/python3.8/site-packages/autosklearn/automl.py", line 1413, in fit
return super().fit(
File "/home/riccardo/.local/lib/python3.8/site-packages/autosklearn/automl.py", line 623, in fit
self._do_dummy_prediction(datamanager, num_run)
File "/home/riccardo/.local/lib/python3.8/site-packages/autosklearn/automl.py", line 436, in _do_dummy_prediction
raise ValueError(
ValueError: Dummy prediction failed with run state StatusType.MEMOUT and additional output: {'error': 'Memout (used more than 3072 MB).', 'configuration_origin': 'DUMMY'}.
After this it hangs, and pressing CTRL+C produces the following logs:
Error in atexit._run_exitfuncs:
Process ForkServerProcess-1:
Traceback (most recent call last):
File "/usr/lib/python3.8/multiprocessing/connection.py", line 931, in wait
ready = selector.select(timeout)
Traceback (most recent call last):
File "/usr/lib/python3.8/selectors.py", line 415, in select
File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/home/riccardo/.local/lib/python3.8/site-packages/autosklearn/util/logging_.py", line 295, in start_log_server
receiver.serve_until_stopped()
File "/home/riccardo/.local/lib/python3.8/site-packages/autosklearn/util/logging_.py", line 325, in serve_until_stopped
rd, wr, ex = select.select([self.socket.fileno()],
KeyboardInterrupt
fd_event_list = self._selector.poll(timeout)
KeyboardInterrupt
^C
Environment and installation:
Please give details about your installation:
- OS: Ubuntu 20.04.2 LTS
- Is your installation in a virtual environment or conda environment? No
- Python version: 3.8.5
- Auto-sklearn version: 0.12.5
Hello, I ran into the same error myself. With n_jobs=-1 I use memory_limit=None to prevent the error from happening. Best,

Same issue, with a pandas df of 5 rows lol. Solved by memory_limit=None, now works for 1000+ rows.
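For reference, a minimal sketch of that workaround applied to the AutoSklearnClassifier call from the question. It relies on auto-sklearn's memory_limit constructor argument, whose default of 3072 MB matches the "used more than 3072 MB" in the error above; passing None disables the per-job limit, while an integer (in MB) would raise it instead.

import autosklearn.classification

# Same settings as in autoSklearn_class, but with the per-job memory cap
# disabled (the default is 3072 MB), as suggested in the comments above.
automl = autosklearn.classification.AutoSklearnClassifier(
    time_left_for_this_task=1 * 60,
    per_run_time_limit=30,
    n_jobs=-1,
    memory_limit=None,
)

# Alternatively, keep a cap but raise it, e.g. memory_limit=8192 for 8 GB per job.

Whether None or a larger cap is the better choice depends on the machine's RAM: with n_jobs=-1 each worker can use up to the limit, so the total memory consumption can reach roughly n_jobs times memory_limit.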