Typing fixes and cleanup in estimators.py
#1249
Conversation
* Setting `feature_preprocessors` throws a ValueError; the kwarg is `feature_preprocessor`
* Updated code samples to use `.. code:: python`
* Fixes failing test by setting `random_state` for `make_x` from `sklearn.datasets`
* Uses full path in `setup.py` instead of a relative one
* Added contribution guide
* Fixes not checking whether `y_test` is sparse
* Update example on extending data preprocessing with `no_preprocessing`
* Update a link for the `scenario` argument to the new SMAC documentation
* Fixes old reference from HPOlibConfigSpace to ConfigSpace
* Need to check if calling fit twice will work. Edit: Done
    'mlp',
]
self.include['classifier'] = include_estimators
self.automl_._include['classifier'] = include_estimators
This is due to `automl_` no longer being built on `fit`; we have to set this manually, as we do for the estimator.
self.metric = log_loss

self.automl_._metric = self.metric
Similarly here.
# Set the variables in the SklearnEstimator.automl
self.automl_._resampling_strategy = resampling_strategy
self.automl_._resampling_strategy_arguments = resampling_strategy_kwargs
self.automl_._get_smac_object_callback = smac_callback
And here.
Note: Built from PR Fix sparse y test #1248; should be considered after that.
Note 2: Rebasing added some other PRs into this -_-
This PR fixes some typing issues that were present. Notably, the `self.automl_` attribute is now created during `__init__()` rather than on the first call to `fit`, removing the `_build_automl()` function. This was primarily to get rid of the typing warning `self.automl_.predict: None has no attribute predict()`, as mypy can not establish that `self.automl_` had been set in any method. This PR also does some other small typing cleanups.
* `Union[SUPPORTED_TARGET_TYPES, spmatrix]` -> `SUPPORTED_TARGET_TYPES`
* `self.per_run_time_limit` to be set at construction
* `AutoSklearnEstimator` as an `ABC`
* `AutoSklearnEstimator` now relies on subclasses having two `@abstractmethod`s, `_supported_target_types` and `_automl_cls`, so that validation only has to occur once and not in each subclass
* Removed `predict_proba` from `AutoSklearnEstimator`; this should only be available in the subclass `AutoSklearnClassifier`
* Removed `AutoSklearnRegressor.fit` so it just uses the base class fit method. It was doing nothing but forwarding arguments.
* `AutoSklearn2Classifier` now explicitly sets attributes on the already constructed `self.automl_`, where previously it relied on it being constructed during `fit`
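A minimal sketch of the ABC arrangement described above. Only the two abstract names (`_supported_target_types`, `_automl_cls`) come from this PR; the class bodies and the `AutoML` stand-in are assumptions for illustration.

```python
from abc import ABC, abstractmethod


class AutoML:
    """Minimal stand-in for the real AutoML backend class."""


class AutoSklearnEstimator(ABC):
    """Base class; subclasses must provide the two abstract properties."""

    @property
    @abstractmethod
    def _supported_target_types(self) -> tuple:
        """Target types this estimator accepts."""

    @property
    @abstractmethod
    def _automl_cls(self) -> type:
        """Which AutoML class to construct in __init__."""

    def __init__(self) -> None:
        # Shared construction/validation happens once here,
        # not repeated in each subclass.
        self.automl_ = self._automl_cls()


class AutoSklearnClassifier(AutoSklearnEstimator):
    @property
    def _supported_target_types(self) -> tuple:
        return ("binary", "multiclass", "multilabel-indicator")

    @property
    def _automl_cls(self) -> type:
        return AutoML
```

Because the base class is abstract, instantiating `AutoSklearnEstimator` directly raises a `TypeError`, while subclasses that implement both properties work normally.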
Extra Context
In general, as estimators are built in a two-stage process, during `__init__` and then the remaining variables in `fit`, we end up with many attributes set to `None` during the `__init__` phase. This causes problems with mypy's typing, as it can not guarantee that an attribute will not be `None` when it is used later. There are two solutions to this I can think of:

1. Move any attributes we can create to `__init__` from `fit`. In general this is probably better practice and safer.
2. Move attributes to a `@property`.
@property
This second solution works fine when we can use the combo of a public `@property attribute_wanted` backed by `self._attribute_wanted`.
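A short sketch of that combo, with made-up names (`n_jobs` is just an example attribute): the private `_attribute` is assigned exactly once in `__init__`, so mypy always knows its concrete type, and the property exposes a read-only public view.

```python
class Estimator:
    def __init__(self) -> None:
        # The real storage; set once here, never None
        self._n_jobs: int = 1

    @property
    def n_jobs(self) -> int:
        # Public read-only view; mypy knows this is always an int
        return self._n_jobs
```

Callers use `est.n_jobs` and never touch `_n_jobs` directly.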
It has an issue when we want `@property _attribute_wanted` with `self.__attribute_wanted`. While we could have the reference be `__attribute_wanted`, with the double leading underscore, this invokes Python's name mangling: a subclass will not be able to access this attribute directly and will be required to use the `@property _attribute_wanted`. This may or may not be a problem.
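A runnable illustration of the mangling caveat, using made-up names: inside `Sub`, the name `__attribute_wanted` is rewritten to `_Sub__attribute_wanted`, which does not exist, so the subclass must go through the property.

```python
class Base:
    def __init__(self) -> None:
        self.__attribute_wanted = 42  # stored as _Base__attribute_wanted

    @property
    def _attribute_wanted(self) -> int:
        # Inside Base's body this mangles to _Base__attribute_wanted, so it works
        return self.__attribute_wanted


class Sub(Base):
    def direct(self) -> int:
        # Mangles to _Sub__attribute_wanted here -> AttributeError at runtime
        return self.__attribute_wanted

    def via_property(self) -> int:
        return self._attribute_wanted  # fine: goes through Base's property
```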