What are the expected inference steps after applying torchao in training?

Hello, I have integrated torchao into my training, but it's not entirely clear to me what inference should look like.

Should I use the converted FP8 linear layers for inference? Is delayed scaling supposed to work at inference time?
Or should I use the original linear layers for inference?

Thanks a lot in advance for any clarification!