[torchao/ExecuTorch] move embedding quantization to torchao #9514

@metascroy

Description

🚀 The feature, motivation and pitch

Move the EmbeddingQuantizer from the ExecuTorch (ET) Llama code into torchao and rewrite it using torchao quantization primitives. Then recombine the embedding quantize/dequantize (Q/DQ) ops into packed weights during to_executorch.
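
For illustration, a minimal sketch of what group-wise embedding weight quantization could look like when written on top of torchao's affine quant primitives (choose_qparams_affine / quantize_affine / dequantize_affine from torchao.quantization.quant_primitives). The group size, dtype, and helper names below are assumptions for the example, not the actual EmbeddingQuantizer API:

```python
# Minimal sketch only: group-wise int8 quantization of an embedding table
# using torchao's affine quant primitives. The helper names and group size
# are assumptions for illustration, not the actual EmbeddingQuantizer API.
import torch
from torchao.quantization.quant_primitives import (
    MappingType,
    choose_qparams_affine,
    quantize_affine,
    dequantize_affine,
)

GROUP_SIZE = 32  # assumed group size; one group spans 32 columns of a row


def quantize_embedding_weight(weight: torch.Tensor):
    """Quantize a [num_embeddings, embedding_dim] weight to int8 per group."""
    block_size = (1, GROUP_SIZE)
    scale, zero_point = choose_qparams_affine(
        weight,
        MappingType.ASYMMETRIC,
        block_size,
        torch.int8,
        quant_min=-128,
        quant_max=127,
    )
    q_weight = quantize_affine(
        weight, block_size, scale, zero_point, torch.int8,
        quant_min=-128, quant_max=127,
    )
    return q_weight, scale, zero_point


def dequantize_embedding_weight(q_weight, scale, zero_point):
    """Reconstruct a float weight from the int8 weight and per-group qparams."""
    block_size = (1, GROUP_SIZE)
    return dequantize_affine(
        q_weight, block_size, scale, zero_point, torch.int8,
        quant_min=-128, quant_max=127, output_dtype=torch.float32,
    )


# Usage: quantize a toy embedding table and check the reconstruction error.
emb = torch.nn.Embedding(1000, 128)
q, s, zp = quantize_embedding_weight(emb.weight.detach())
w_hat = dequantize_embedding_weight(q, s, zp)
print((emb.weight.detach() - w_hat).abs().max())
```

The Q/DQ pattern such primitives leave in the exported graph is what to_executorch would then recombine into packed weights.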

Alternatives

No response

Additional context

No response

RFC (Optional)

No response

Metadata

Labels

triaged: This issue has been looked at by a team member, and triaged and prioritized into an appropriate module
