Python exercise [03_1_encoding_scaling]: XGBoost in Python is about to support categorical data, while R does not yet. #35

@sandylaker

However, it says "the feature is experimental and has limited features. Only the Python package is fully supported". So it is not supported in R, and AFAIK mlr3 does not support it yet either. I would therefore propose keeping the exercise as it is for now, but opening an issue to be resolved once this categorical support is no longer experimental. Also, it looks like it is basically doing one-hot encoding internally: "categorical data the split is defined depending on whether partitioning or onehot encoding is used. For partition-based splits, the splits are specified as value $\in$ categories, where categories is the set of categories in one feature. If onehot encoding is used instead, then the split is defined as value == category".
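To make the two split definitions in the quote concrete, here is a minimal pure-Python sketch. It does not call XGBoost. The function names `onehot_split` and `partition_split` and the toy `colors` feature are hypothetical, chosen only for illustration.

```python
# Illustration only: these helpers are hypothetical and are not part of XGBoost.

def onehot_split(value, category):
    """One-hot style split: the sample goes one way iff value == category."""
    return value == category


def partition_split(value, categories):
    """Partition-based split: the sample goes one way iff value is in the set."""
    return value in categories


# Toy categorical feature (made-up data).
colors = ["red", "green", "blue", "green", "red"]

# One-hot split on the single category "green".
onehot_mask = [onehot_split(c, "green") for c in colors]

# Partition split on the category set {"red", "blue"}.
partition_mask = [partition_split(c, {"red", "blue"}) for c in colors]

# A one-hot split is the special case of a partition split with a singleton set.
singleton_mask = [partition_split(c, {"green"}) for c in colors]

print(onehot_mask)
print(partition_mask)
```

Under these definitions, a partition-based split can separate several categories from the rest in a single node, whereas a one-hot split isolates exactly one category per node.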

Originally posted by @giuseppec in #26 (comment)
