Closed
Describe the bug
I was trying to create a minimal working example for an issue we have on real data (KDDCup).
Along the way I found this (different) error raised when producing predictions.
I'm fine with a won't-fix, but I figured I would share it so you can check whether it points to a more serious underlying issue.
To Reproduce
Installed from the development branch.
import numpy as np
from autosklearn.experimental.askl2 import AutoSklearn2Classifier

# 150 samples with 4 random features
x = np.random.random(size=(150, 4))
# Three classes; class 3 has exactly one sample
y = np.asarray([1]*75 + [2]*74 + [3])
aml = AutoSklearn2Classifier(time_left_for_this_task=60)
aml.fit(x, y)
predictions = aml.predict(x)  # this call raises the ValueError
The single sample for class 3 seems to be crucial: the other configurations I tried did not produce the error.
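To make the label distribution concrete, here is a small NumPy-only sketch. It only inspects the labels from the reproduction script. The held-out index (149) is an illustrative assumption about how a resampling split might separate the lone class-3 sample from the training data; it is not taken from auto-sklearn's code.

```python
import numpy as np

# Same labels as in the reproduction script above
y = np.asarray([1] * 75 + [2] * 74 + [3])

# Count how many samples each class has
classes, counts = np.unique(y, return_counts=True)
class_counts = dict(zip(classes.tolist(), counts.tolist()))
print(class_counts)  # {1: 75, 2: 74, 3: 1}

# Assumption for illustration: a split puts the only class-3 sample
# (index 149) into the validation part, so training never sees class 3.
train_mask = np.ones(len(y), dtype=bool)
train_mask[149] = False
train_classes = np.unique(y[train_mask]).tolist()
print(train_classes)  # [1, 2]
```

If something like this happens inside the search, a model trained on that split would know only two classes, while a model trained on the full data would know three.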
Expected behavior
Predictions to be produced.
Actual behavior, stacktrace or logfile
(venv) root@486c0ae472af:/bench# python mwe.py
/bench/frameworks/autosklearn/venv/lib/python3.7/site-packages/smac/intensification/parallel_scheduling.py:152: UserWarning: SuccessiveHalving is intended to be used with more than 1 worker but num_workers=1
num_workers
[WARNING] [2021-07-27 15:07:04,115:Client-EnsembleBuilder] No models better than random - using Dummy loss!Number of models besides current dummy model: 1. Number of dummy models: 1
Traceback (most recent call last):
File "mwe.py", line 9, in <module>
predictions = aml.predict(x)
File "/bench/frameworks/autosklearn/lib/auto-sklearn/autosklearn/estimators.py", line 695, in predict
return super().predict(X, batch_size=batch_size, n_jobs=n_jobs)
File "/bench/frameworks/autosklearn/lib/auto-sklearn/autosklearn/estimators.py", line 494, in predict
return self.automl_.predict(X, batch_size=batch_size, n_jobs=n_jobs)
File "/bench/frameworks/autosklearn/lib/auto-sklearn/autosklearn/automl.py", line 1703, in predict
n_jobs=n_jobs)
File "/bench/frameworks/autosklearn/lib/auto-sklearn/autosklearn/automl.py", line 1230, in predict
for identifier in self.ensemble_.get_selected_model_identifiers()
File "/bench/frameworks/autosklearn/venv/lib/python3.7/site-packages/joblib/parallel.py", line 1041, in __call__
if self.dispatch_one_batch(iterator):
File "/bench/frameworks/autosklearn/venv/lib/python3.7/site-packages/joblib/parallel.py", line 859, in dispatch_one_batch
self._dispatch(tasks)
File "/bench/frameworks/autosklearn/venv/lib/python3.7/site-packages/joblib/parallel.py", line 777, in _dispatch
job = self._backend.apply_async(batch, callback=cb)
File "/bench/frameworks/autosklearn/venv/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 208, in apply_async
result = ImmediateResult(func)
File "/bench/frameworks/autosklearn/venv/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 572, in __init__
self.results = batch()
File "/bench/frameworks/autosklearn/venv/lib/python3.7/site-packages/joblib/parallel.py", line 263, in __call__
for func, args, kwargs in self.items]
File "/bench/frameworks/autosklearn/venv/lib/python3.7/site-packages/joblib/parallel.py", line 263, in <listcomp>
for func, args, kwargs in self.items]
File "/bench/frameworks/autosklearn/lib/auto-sklearn/autosklearn/automl.py", line 96, in _model_predict
prediction = model.predict_proba(X_)
File "/bench/frameworks/autosklearn/venv/lib/python3.7/site-packages/sklearn/ensemble/_voting.py", line 329, in _predict_proba
avg = np.average(self._collect_probas(X), axis=0,
File "/bench/frameworks/autosklearn/venv/lib/python3.7/site-packages/sklearn/ensemble/_voting.py", line 324, in _collect_probas
return np.asarray([clf.predict_proba(X) for clf in self.estimators_])
ValueError: could not broadcast input array from shape (150,3) into shape (150,)
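The last frame fails inside `np.asarray` on a list of `predict_proba` outputs. One possible explanation, which is a guess and not confirmed by the trace, is that the sub-estimators return probability arrays with different numbers of columns (for example 3 classes versus 2). The sketch below uses made-up arrays and a hypothetical helper `try_stack` to show that NumPy refuses to stack such arrays; the exact error message depends on the NumPy version.

```python
import numpy as np

# Made-up stand-ins for predict_proba outputs (not real model output):
# one estimator knows three classes, the other only two.
proba_three_classes = np.full((150, 3), 1 / 3)
proba_two_classes = np.full((150, 2), 1 / 2)


def try_stack(probas):
    """Stack arrays as np.asarray would; return the error message on failure."""
    try:
        return np.asarray(probas)
    except ValueError as exc:
        return str(exc)


# Matching shapes stack into a (n_estimators, n_samples, n_classes) array
stacked = try_stack([proba_three_classes, proba_three_classes])
print(stacked.shape)  # (2, 150, 3)

# Mismatched class counts cannot be stacked and raise a ValueError
mismatch = try_stack([proba_three_classes, proba_two_classes])
print(mismatch)
```

Older NumPy releases report this as a broadcast error like the one in the traceback, while newer ones complain about an inhomogeneous shape.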
Environment and installation:
Please give details about your installation:
- OS: Debian 10 in docker hosted by Windows 10
- virtual environment
- Python version: 3.7.11
- Auto-sklearn version: development (11afae22b8c9a6309d2b6fcf7cfb9a947711cd1e)