What are the expected inference steps after I apply torchao in training? #1132

@goldhuang

Hello, I have integrated torchao into my training, but it is not fully clear to me what the inference path should look like.

Should I run inference with the converted FP8 linear layers? Is delayed scaling supposed to work at inference time?
Or should I run inference with the original linear layers?

Thanks a lot in advance if you can help to clarify!
