Crash when trying to get roc_auc score #875

@alexitkes

Describe the bug

AutoSklearnClassifier.score() raises an IndexError when the classifier was fitted with metric=roc_auc.

To Reproduce

Steps to reproduce the behavior:
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from autosklearn.classification import AutoSklearnClassifier
from autosklearn.metrics import roc_auc

# Load some classification dataset
bc = load_breast_cancer()
(X, y) = (bc['data'], bc['target'])
(X_train, X_test, y_train, y_test) = train_test_split(X, y, random_state=0, stratify=y)
# Make sure it is a binary classification dataset
print(len(set(y)))
# 2 will be printed

# Accuracy metric (the default)
clf = AutoSklearnClassifier(time_left_for_this_task=45, per_run_time_limit=15, n_jobs=-1)
clf.fit(X_train, y_train)
print(clf.score(X_train, y_train))
print(clf.score(X_test, y_test))
# Displayed:
# 0.9929577464788732
# 0.9440559440559441

# And now...
clf = AutoSklearnClassifier(time_left_for_this_task=45, per_run_time_limit=15, n_jobs=-1)
clf.fit(X_train, y_train, metric=roc_auc)
print(clf.score(X_train, y_train))
print(clf.score(X_test, y_test))
# Crash!
IndexError                                Traceback (most recent call last)
<ipython-input-9-54810915749d> in <module>
      1 clf = AutoSklearnClassifier(time_left_for_this_task=45, per_run_time_limit=15, n_jobs=-1)
      2 clf.fit(X_train, y_train, metric=roc_auc)
----> 3 print(clf.score(X_train, y_train))
      4 print(clf.score(X_test, y_test))

~/venv-auto/lib/python3.6/site-packages/autosklearn/estimators.py in score(self, X, y)
    535 
    536     def score(self, X, y):
--> 537         return self._automl[0].score(X, y)
    538 
    539     def show_models(self):

~/venv-auto/lib/python3.6/site-packages/autosklearn/automl.py in score(self, X, y)
    731                                task_type=self._task,
    732                                metric=self._metric,
--> 733                                all_scoring_functions=False)
    734 
    735     @property

~/venv-auto/lib/python3.6/site-packages/autosklearn/metrics/__init__.py in calculate_score(solution, prediction, task_type, metric, all_scoring_functions)
    300             score = metric(solution, cprediction)
    301         else:
--> 302             score = metric(solution, prediction)
    303 
    304     return score

~/venv-auto/lib/python3.6/site-packages/autosklearn/metrics/__init__.py in __call__(self, y_true, y_pred, sample_weight)
    123 
    124         if y_type == "binary":
--> 125             y_pred = y_pred[:, 1]
    126         elif isinstance(y_pred, list):
    127             y_pred = np.vstack([p[:, -1] for p in y_pred]).T
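The failing line in the traceback, `y_pred = y_pred[:, 1]`, only works on a two-dimensional array. One possible explanation, stated here as an assumption and not a confirmed diagnosis, is that `score()` passes the metric the hard labels from `predict()` (shape `(n_samples,)`) instead of class probabilities (shape `(n_samples, 2)`). The sketch below uses plain NumPy arrays with made-up values to show that the same indexing raises `IndexError` on 1-D input and succeeds on 2-D input:

```python
import numpy as np

# Assumption: score() hands the metric hard labels of shape (n_samples,).
# The values are invented for illustration only.
hard_labels = np.array([0, 1, 1, 0])

try:
    hard_labels[:, 1]  # same indexing as the failing line in the traceback
    error_message = None
except IndexError as exc:
    error_message = str(exc)

print(error_message)

# Class probabilities of shape (n_samples, 2) support the same indexing;
# column 1 holds the positive-class scores.
class_probabilities = np.array([[0.9, 0.1], [0.2, 0.8], [0.3, 0.7], [0.6, 0.4]])
positive_scores = class_probabilities[:, 1]
print(positive_scores)
```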

Expected behavior

It should print two numbers (the ROC AUC on the training and test sets), just as it does for accuracy scoring.
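Until `score()` handles this case, a possible workaround is to compute the ROC AUC directly from `predict_proba()` with scikit-learn's `roc_auc_score`. This is a minimal sketch, not a fix for auto-sklearn itself: `binary_roc_auc` is a hypothetical helper name, and a scikit-learn pipeline stands in for the fitted `AutoSklearnClassifier` so the snippet runs without auto-sklearn installed. It assumes the estimator exposes `predict_proba()` with the positive class in column 1.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def binary_roc_auc(estimator, X, y):
    """Hypothetical helper: ROC AUC from the positive-class probability column."""
    positive_scores = estimator.predict_proba(X)[:, 1]
    return roc_auc_score(y, positive_scores)


X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, random_state=0, stratify=y
)

# Stand-in for the fitted AutoSklearnClassifier (assumption: any binary
# classifier with predict_proba can be passed to the helper the same way).
stand_in = make_pipeline(StandardScaler(), LogisticRegression())
stand_in.fit(X_train, y_train)

train_auc = binary_roc_auc(stand_in, X_train, y_train)
test_auc = binary_roc_auc(stand_in, X_test, y_test)
print(train_auc)
print(test_auc)
```

With the classifier from the reproduction script, the calls would be `binary_roc_auc(clf, X_train, y_train)` and `binary_roc_auc(clf, X_test, y_test)`.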

Environment and installation:

Ubuntu 18.04.4
Python 3.6.9
Auto-Sklearn 0.7.0

I can try to fix this myself; I think I know what causes it.
