ValueError: Dummy prediction failed with run state StatusType.MEMOUT #1120

@zuliani99

Describe the bug

I'm trying to benchmark AutoML algorithms, and the first one I always run is autosklearn. The problem is that, depending on which algorithms I import at the top of the Python file, autosklearn either runs, fails to run, or runs with lots of warnings. In particular, I don't understand why, after calling the autosklearn function to solve a classification problem, it logs warnings about tensorflow. This is strange because autosklearn has no dependency on tensorflow.

To Reproduce

import openml
from sklearn.datasets import fetch_openml
from algorithms.auto_sklearn import autoSklearn_class
from algorithms.tpot import tpot_class
from algorithms.auto_keras import autokeras_class
from algorithms.h2o import h2o_class
from algorithms.ludwig import ludwig_class

X, y = fetch_openml(data_id=727, as_frame=True, return_X_y=True, cache=True)
y = y.to_frame()
X[y.columns[0]] = y
df = X


print(autoSklearn_class(df))
print(tpot_class(df))
print(autokeras_class(df))
print(h2o_class(df))
print(ludwig_class(df))

All functions run correctly standalone. I've also tried changing the order of the imports, but nothing changed. In addition, each of them refers to a single Python file that contains the code for that specific algorithm. In particular, this is the one for autosklearn:

from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
import autosklearn.classification
import pandas as pd

def autoSklearn_class(df):
  # Cast string-like columns to the pandas category dtype.
  for col in df.columns:
    t = pd.api.types.infer_dtype(df[col])
    if t == "string" or t == 'object':
      df[col] = df[col].astype('category')

  # The last column is the target; all other columns are features.
  y = df.iloc[:, -1:]
  X = df.iloc[:, 0:df.shape[1]-1]

  X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)
  automl = autosklearn.classification.AutoSklearnClassifier(
    time_left_for_this_task=1*60,  # total search budget in seconds
    per_run_time_limit=30,         # budget for a single model fit in seconds
    n_jobs=-1,                     # use all available cores
  )
  automl.fit(X_train, y_train)
  y_pred = automl.predict(X_test)
  return accuracy_score(y_test, y_pred)
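
Since every wrapper works on its own, one thing that might be worth trying is to defer the imports of the other wrappers until they are actually called, so that nothing TensorFlow-related is loaded while autosklearn runs. This is only a sketch under assumptions: the logs do not confirm that the top-level imports cause the failure, and `RUNNERS` and `run_algorithm` are hypothetical names that do not exist in the original script. The module paths simply mirror the imports in the reproduction script.

```python
import importlib

# Hypothetical registry: short name -> (module path, function name).
# The paths mirror the imports in the reproduction script; nothing is
# imported until run_algorithm() is called for that entry.
RUNNERS = {
    "autosklearn": ("algorithms.auto_sklearn", "autoSklearn_class"),
    "tpot": ("algorithms.tpot", "tpot_class"),
    "autokeras": ("algorithms.auto_keras", "autokeras_class"),
    "h2o": ("algorithms.h2o", "h2o_class"),
    "ludwig": ("algorithms.ludwig", "ludwig_class"),
}


def run_algorithm(name, df, runners=RUNNERS):
    """Import the wrapper module on demand, then call its entry function."""
    module_path, func_name = runners[name]
    module = importlib.import_module(module_path)
    return getattr(module, func_name)(df)


# Intended usage in the main script (not executed here):
#
#   if __name__ == "__main__":
#       for name in RUNNERS:
#           print(run_algorithm(name, df))
```

Placing the calls under `if __name__ == "__main__":` follows the general `multiprocessing` guidance for start methods that re-import the main module in child processes. Whether it removes the repeated warnings in this setup is untested.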

Actual behavior, stacktrace or logfile

And this is the error log that is printed:

/home/riccardo/.local/lib/python3.8/site-packages/pyparsing.py:3190: FutureWarning: Possible set intersection at position 3
  self.re = re.compile(self.reString)
2021-04-08 23:23:46.434750: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-04-08 23:23:46.434788: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
/home/riccardo/.local/lib/python3.8/site-packages/typeguard/__init__.py:917: UserWarning: no type annotations present -- not typechecking tensorflow_addons.layers.max_unpooling_2d.MaxUnpooling2D.__init__
  warn('no type annotations present -- not typechecking {}'.format(function_name(func)))
--------------------START--------------------
INFO:root:Starting [get] request for the URL https://www.openml.org/api/v1/xml/data/list/limit/10000/offset/0
INFO:root:1.2244403s taken for [get] request for the URL https://www.openml.org/api/v1/xml/data/list/limit/10000/offset/0
/home/riccardo/.local/lib/python3.8/site-packages/pyparsing.py:3190: FutureWarning: Possible set intersection at position 3
  self.re = re.compile(self.reString)
2021-04-08 23:23:53.128767: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-04-08 23:23:53.128806: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
/home/riccardo/.local/lib/python3.8/site-packages/typeguard/__init__.py:917: UserWarning: no type annotations present -- not typechecking tensorflow_addons.layers.max_unpooling_2d.MaxUnpooling2D.__init__
  warn('no type annotations present -- not typechecking {}'.format(function_name(func)))
2021-04-08 23:23:55.764039: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-04-08 23:23:55.764080: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
/home/riccardo/.local/lib/python3.8/site-packages/typeguard/__init__.py:917: UserWarning: no type annotations present -- not typechecking tensorflow_addons.layers.max_unpooling_2d.MaxUnpooling2D.__init__
  warn('no type annotations present -- not typechecking {}'.format(function_name(func)))
[ERROR] [2021-04-08 23:23:58,132:Client-AutoML(1):b682384b-98b0-11eb-a9e2-b313928a5b4b] Dummy prediction failed with run state StatusType.MEMOUT and additional output: {'error': 'Memout (used more than 3072 MB).', 'configuration_origin': 'DUMMY'}.
Traceback (most recent call last):
  File "start.py", line 143, in <module>
    print(autoSklearn_class(df))
  File "/home/riccardo/Desktop/AutoML-Benchmark/algorithms/auto_sklearn.py", line 30, in autoSklearn_class
    automl.fit(X_train, y_train)
  File "/home/riccardo/.local/lib/python3.8/site-packages/autosklearn/estimators.py", line 592, in fit
    super().fit(
  File "/home/riccardo/.local/lib/python3.8/site-packages/autosklearn/estimators.py", line 357, in fit
    self.automl_.fit(load_models=self.load_models, **kwargs)
  File "/home/riccardo/.local/lib/python3.8/site-packages/autosklearn/automl.py", line 1413, in fit
    return super().fit(
  File "/home/riccardo/.local/lib/python3.8/site-packages/autosklearn/automl.py", line 623, in fit
    self._do_dummy_prediction(datamanager, num_run)
  File "/home/riccardo/.local/lib/python3.8/site-packages/autosklearn/automl.py", line 436, in _do_dummy_prediction
    raise ValueError(
ValueError: Dummy prediction failed with run state StatusType.MEMOUT and additional output: {'error': 'Memout (used more than 3072 MB).', 'configuration_origin': 'DUMMY'}.

After this, the process hangs, and after pressing CTRL+C the logs are these:

Error in atexit._run_exitfuncs:
Process ForkServerProcess-1:
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 931, in wait
    ready = selector.select(timeout)
Traceback (most recent call last):
  File "/usr/lib/python3.8/selectors.py", line 415, in select
  File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/riccardo/.local/lib/python3.8/site-packages/autosklearn/util/logging_.py", line 295, in start_log_server
    receiver.serve_until_stopped()
  File "/home/riccardo/.local/lib/python3.8/site-packages/autosklearn/util/logging_.py", line 325, in serve_until_stopped
    rd, wr, ex = select.select([self.socket.fileno()],
KeyboardInterrupt
    fd_event_list = self._selector.poll(timeout)
KeyboardInterrupt
^C
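
The 3072 MB in the `Memout` message above matches the default value of the `memory_limit` argument of `AutoSklearnClassifier`. To see how close the process already is to that figure before `fit` is called, a Linux-only sketch like the following could be used. It reads `/proc/self/status`; the helper names `parse_status_mb` and `current_memory_mb` are made up for illustration. It also assumes that the virtual size (`VmPeak`) is the relevant number, which is an assumption about how the limit is enforced and is not something the logs confirm.

```python
def parse_status_mb(status_text, field):
    """Return the value of `field` (e.g. "VmPeak") in MB, or None if absent."""
    for line in status_text.splitlines():
        key, _, rest = line.partition(":")
        if key == field:
            # Lines look like "VmPeak:   123456 kB".
            return int(rest.split()[0]) / 1024
    return None


def current_memory_mb(path="/proc/self/status"):
    """Read peak virtual size and resident size of this process (Linux only)."""
    with open(path) as fh:
        text = fh.read()
    return {field: parse_status_mb(text, field) for field in ("VmPeak", "VmRSS")}


# Intended usage, right before automl.fit(...):
#
#   print(current_memory_mb())
```

If the reported values are already near 3072 MB after the imports alone, passing a larger `memory_limit` to `AutoSklearnClassifier` may be worth trying.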

Environment and installation:

Please give details about your installation:

  • OS: Ubuntu 20.04.2 LTS
  • Is your installation in a virtual environment or conda environment? No
  • Python version: 3.8.5
  • Auto-sklearn version: 0.12.5
