Move Uintx out of prototype for future extension (#635)
Summary:
Thanks to @vayuda for adding the initial version of the Uintx tensor subclass.
With some helpers, we can now integrate it with the `torch.uint1` to `torch.uint7` dtypes.
This gives people the benefit of bitpacking (model size savings) first, and
we can then gradually optimize the performance.
ExecuTorch is also planning to integrate its low-bit kernels with us, so a more native experience with
these lower-bit types will be required or useful there as well.
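To make the model size saving from bitpacking concrete, here is a minimal sketch in plain Python. It is not the torchao `UintxTensor` implementation and it deliberately avoids any torch API; the helper names `pack_bits` and `unpack_bits` and the least-significant-bit-first layout are assumptions made purely for illustration.

```python
def pack_bits(values, bit_width):
    """Pack unsigned ints (each < 2**bit_width) into bytes, LSB first.

    Hypothetical helper for illustration; the layout is an assumption,
    not the one used by torchao.
    """
    if not 1 <= bit_width <= 7:
        raise ValueError("bit_width must be between 1 and 7")
    acc = 0      # bit accumulator
    nbits = 0    # number of valid bits currently held in acc
    out = bytearray()
    for v in values:
        if not 0 <= v < (1 << bit_width):
            raise ValueError(f"value {v} does not fit in {bit_width} bits")
        acc |= v << nbits
        nbits += bit_width
        # Flush every complete byte.
        while nbits >= 8:
            out.append(acc & 0xFF)
            acc >>= 8
            nbits -= 8
    if nbits:
        # Remaining bits go into a final, zero-padded byte.
        out.append(acc & 0xFF)
    return bytes(out)


def unpack_bits(packed, bit_width, count):
    """Inverse of pack_bits: recover `count` values of `bit_width` bits."""
    mask = (1 << bit_width) - 1
    acc = 0
    nbits = 0
    source = iter(packed)
    out = []
    for _ in range(count):
        # Pull in bytes until a full value is available.
        while nbits < bit_width:
            acc |= next(source) << nbits
            nbits += 8
        out.append(acc & mask)
        acc >>= bit_width
        nbits -= bit_width
    return out


values = [5, 0, 7, 3, 1, 6, 2, 4]      # eight 3-bit values
packed = pack_bits(values, bit_width=3)
restored = unpack_bits(packed, bit_width=3, count=len(values))
```

Under these assumptions, eight 3-bit values occupy 3 bytes instead of the 8 bytes they would take as `uint8`, which is the kind of saving the summary refers to.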
Test Plan:
python test/dtypes/test_uintx.py
Reviewers:
Subscribers:
Tasks:
Tags:
README.md (3 additions, 1 deletion)
@@ -165,7 +165,9 @@ python setup.py install
 * [DoRA](torchao/prototype/dora) a newer replacement for QLoRA with more promising convergence characteristics
 * [Fused int4/fp16 Quant Matmul](torchao/prototype/hqq) which is particularly useful for compute bound kernels showing 4x speedups over tinygemm for larger batch sizes such as 512
 * [gau-nernst](https://github.com/gau-nernst) fp6 kernels that are 4x faster than fp16 [torchao/prototype/quant_llm](torchao/prototype/quant_llm)
-* [vayuda](https://github.com/vayuda) with generic bitpacking kernels that were code generated using pure PyTorch [prototype/common](torchao/prototype/common)
+* [vayuda](https://github.com/vayuda)
+  * generic bitpacking kernels that were code generated using pure PyTorch [prototype/common](torchao/prototype/common)
+  * `UintxTensor` that is added to [torch/dtypes](https://github.com/pytorch/ao/tree/main/torchao/dtypes/uintx) as a building block for lower bit dtypes (`uint1` to `uint7`)
 * [andreaskopf](https://github.com/andreaskoepf) and [melvinebenezer](https://github.com/melvinebenezer) with [1 bit LLMs](torchao/prototype/dtypes) Bitnet 1.58 bitpacked into uint2 and fully code-generated with torch.compile