Add support for OptimizerParamGroup #495

@lostmsu

Description

OptimizerParamGroup lets different sets of model parameters use different optimizer settings, such as learning rate and weight decay.

Example: AdamW(vector<OptimizerParamGroup> param_groups, AdamWOptions defaults)
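
To illustrate the behaviour being requested, here is a minimal, self-contained C++ sketch of the param-group idea. Every type in it (`Options`, `Param`, `ParamGroup`, `GroupedSGD`, `run_example`) is a hypothetical stand-in written for this illustration, not the real library API. It also assumes a plain gradient step with weight decay in place of the Adam moment estimates, to keep the example short. The point is only the lookup rule: a group that carries its own options uses them, and any other group falls back to the optimizer-wide defaults.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for per-optimizer settings (not a real library type).
struct Options {
    double lr;
    double weight_decay;
    explicit Options(double lr_ = 1e-3, double wd_ = 0.0)
        : lr(lr_), weight_decay(wd_) {}
};

// A parameter tensor reduced to a flat vector of values plus its gradient.
struct Param {
    std::vector<double> value;
    std::vector<double> grad;
};

// A group of parameters that may or may not carry its own options.
struct ParamGroup {
    std::vector<Param*> params;
    bool has_options;
    Options options;

    explicit ParamGroup(std::vector<Param*> p)
        : params(std::move(p)), has_options(false), options() {}
    ParamGroup(std::vector<Param*> p, Options o)
        : params(std::move(p)), has_options(true), options(o) {}
};

// Simplified optimizer: gradient step with weight decay, no Adam moments.
class GroupedSGD {
public:
    GroupedSGD(std::vector<ParamGroup> groups, Options defaults)
        : groups_(std::move(groups)), defaults_(defaults) {}

    void step() {
        for (ParamGroup& g : groups_) {
            // Per-group options win; otherwise use the shared defaults.
            const Options& o = g.has_options ? g.options : defaults_;
            for (Param* p : g.params) {
                assert(p->value.size() == p->grad.size());
                for (std::size_t i = 0; i < p->value.size(); ++i) {
                    p->value[i] -= o.lr * (p->grad[i] + o.weight_decay * p->value[i]);
                }
            }
        }
    }

private:
    std::vector<ParamGroup> groups_;
    Options defaults_;
};

// Two one-element parameters with identical values and gradients,
// placed in groups with different learning rates.
std::vector<double> run_example() {
    Param weights;
    weights.value.push_back(1.0);
    weights.grad.push_back(0.5);
    Param bias;
    bias.value.push_back(1.0);
    bias.grad.push_back(0.5);

    std::vector<Param*> weight_params(1, &weights);
    std::vector<Param*> bias_params(1, &bias);

    std::vector<ParamGroup> groups;
    groups.emplace_back(weight_params, Options(0.1, 0.0));  // own lr = 0.1
    groups.emplace_back(bias_params);                       // uses defaults

    GroupedSGD opt(groups, Options(0.01, 0.0));             // default lr = 0.01
    opt.step();

    std::vector<double> out;
    out.push_back(weights.value[0]);
    out.push_back(bias.value[0]);
    return out;
}
```

After one step, the two parameters diverge even though they started with the same value and gradient: the first moves by its group's learning rate (to roughly 0.95) and the second by the default (to roughly 0.995). That per-group override is what the requested constructor signature would make possible.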
