Closed
Labels
topic: new feature, topic: performance
Description
Dear team,
I would like to inquire about the possibility of W4A4 quantization support in torchao.
Torchao has proven to be an excellent quantization inference tool, particularly with its comprehensive support for W8A8. For 4-bit operations, however, I have only found a W4A8 implementation, which currently uses INT8 GEMM operators under the hood. Given that many modern GPUs now support INT4 GEMM operators with promising results, are there any plans to implement W4A4 in torchao?
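To make the request concrete, here is a minimal pure-Python sketch of what W4A4 would compute numerically: both the weights and the activations are quantized to signed 4-bit codes, and the matrix product is accumulated in integers before being rescaled. This is only an emulation under stated assumptions, not torchao's API. The names `quantize_int4` and `w4a4_linear` are hypothetical, and symmetric per-tensor scaling is assumed for simplicity. A real kernel would pack two 4-bit values per byte and dispatch to a hardware INT4 GEMM.

```python
# Illustrative emulation of W4A4 arithmetic. Not torchao code:
# all function names here are hypothetical.

def quantize_int4(values):
    """Symmetric per-tensor quantization to the signed 4-bit range [-8, 7].

    Returns the integer codes and the scale such that value ~= code * scale.
    """
    max_abs = max(abs(v) for v in values) or 1.0  # avoid a zero scale
    scale = max_abs / 7.0
    codes = [max(-8, min(7, round(v / scale))) for v in values]
    return codes, scale


def w4a4_linear(x, w_rows):
    """Compute y[j] = sum_k x[k] * w_rows[j][k] using 4-bit codes.

    Activations (x) and weights (w_rows) are both quantized to int4;
    the dot products accumulate in plain integers and are rescaled once.
    """
    n = len(x)
    xq, x_scale = quantize_int4(x)
    flat_w = [v for row in w_rows for v in row]
    wq_flat, w_scale = quantize_int4(flat_w)

    out = []
    for j in range(len(w_rows)):
        wq = wq_flat[j * n:(j + 1) * n]
        acc = sum(a * b for a, b in zip(xq, wq))  # integer accumulation
        out.append(acc * x_scale * w_scale)       # dequantize the result
    return out


if __name__ == "__main__":
    x = [0.5, -1.0, 0.25]
    w = [[1.0, 0.5, -0.5],
         [0.0, -1.0, 1.0]]
    print("W4A4 emulated:", w4a4_linear(x, w))
    print("float reference:", [sum(a * b for a, b in zip(x, row)) for row in w])
```

In W4A8, by contrast, only the weights would pass through a 4-bit quantizer while the activations stay in 8 bits, which is why an INT8 GEMM can serve as the backend there.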
Thank you for your attention to this matter.
Best regards