Bug in _partial_fit_and_predict_iterative in train_evaluator.py? #1547

@nataliafonzo

Hi,

Working with AutoSklearn2Classifier, I recently got a strange error that might be due to a bug. This is what I am running:

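    # Imports assumed; they were not shown in the original report.
    import psutil
    import autosklearn.metrics as autosklearn_metrics
    from autosklearn.experimental.askl2 import AutoSklearn2Classifier

    n_cores = 10
    # Available memory in MB, split evenly across the workers.
    memory_limit = psutil.virtual_memory().available * 10**-6 // n_cores
    time_left_for_this_task = 7200

    predictor = AutoSklearn2Classifier(
        memory_limit=memory_limit,
        metric=autosklearn_metrics.roc_auc,
        n_jobs=n_cores,
        time_left_for_this_task=time_left_for_this_task,
    )

    predictor.fit(x_train, y_train)
    predictor.automl_.runhistory_.data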

where x_train is a pandas DataFrame and y_train is a list. fit completes, but inspecting runhistory_.data shows that every model crashes at line 741 of /autosklearn/evaluation/train_evaluator.py. It seems that self.Y_train in line 741 should be self.X_train instead, since that check decides how self.X_train is indexed:

    if model.estimator_supports_iterative_fit():
        Xt, fit_params = model.fit_transformer(
            self.X_train.iloc[train_indices] if hasattr(
                self.Y_train, 'iloc') else self.X_train[train_indices],
            self.Y_train.iloc[train_indices] if hasattr(
                self.Y_train, 'iloc') else self.Y_train[train_indices],
        )

I found a workaround: passing x_train as a NumPy array. I also tried passing x_train as a list of lists and y_train as a pandas Series, but neither solved the issue. I am using auto-sklearn==0.14.7.
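To make the suspected problem concrete, here is a minimal standalone sketch that does not use auto-sklearn at all. The helpers `select_rows_buggy` and `select_rows_fixed` are hypothetical names that only mirror the conditional quoted above, and the toy data is made up. It assumes `train_indices` is an integer NumPy array. Whether this is exactly what fails inside auto-sklearn is an assumption; in particular, `self.Y_train` may already have been converted internally, which would fit with the pandas Series attempt not helping, but that is a guess.

```python
import numpy as np
import pandas as pd


def select_rows_buggy(X, y, idx):
    # Mirrors the quoted check: how X is indexed depends on the type of y.
    return X.iloc[idx] if hasattr(y, "iloc") else X[idx]


def select_rows_fixed(X, y, idx):
    # Suggested change: decide how to index X by looking at X itself.
    return X.iloc[idx] if hasattr(X, "iloc") else X[idx]


# Toy data (hypothetical): DataFrame features, plain-list labels.
x_train = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0], "b": [5.0, 6.0, 7.0, 8.0]})
y_train = [0, 1, 0, 1]
train_indices = np.array([0, 2])

# y has no .iloc, so the DataFrame is indexed with X[idx]. pandas treats an
# integer array there as column labels, which do not exist in this frame.
try:
    select_rows_buggy(x_train, y_train, train_indices)
    buggy_error = None
except KeyError as exc:
    buggy_error = exc

# Checking X instead selects the rows positionally.
fixed_rows = select_rows_fixed(x_train, y_train, train_indices)

# The workaround: a NumPy array supports X[idx] row selection directly.
numpy_rows = select_rows_buggy(x_train.to_numpy(), y_train, train_indices)
```

In this sketch the original check raises a `KeyError` for the DataFrame input, while both the changed check and the NumPy input select rows 0 and 2. Note that a DataFrame whose column labels happen to be integers could silently return columns instead of raising.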

I am wondering if this is the expected behavior or actually a bug. Thanks!
