Bitnet 1.58 Groundwork
After some talks with Saroufim and the CUDA MODE team working on BitNet, we've outlined a strategy for bringing the BitNet 1.58 method into torch. This issue lays the groundwork for 2-bit trinary tensor quantization and the BitNet linear layer work for BitNet 1.58.
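To make "2-bit trinary tensor quantization" concrete, here is a rough sketch of the idea, not the code from the staging repo. It assumes absmean scaling as in the BitNet b1.58 paper and a simple four-values-per-byte layout. The names `quantize_ternary`, `pack_2bit`, and `unpack_2bit` are hypothetical and only for illustration.

```python
import torch


def quantize_ternary(w: torch.Tensor, eps: float = 1e-5):
    """Absmean quantization to {-1, 0, +1} (assumed scheme, per-tensor scale)."""
    scale = w.abs().mean().clamp(min=eps)
    q = (w / scale).round().clamp(-1, 1).to(torch.int8)
    return q, scale


def pack_2bit(q: torch.Tensor) -> torch.Tensor:
    """Pack four trinary values into one uint8 (assumed layout, lowest bits first)."""
    assert q.shape[-1] % 4 == 0, "last dim must be divisible by 4"
    u = (q + 1).to(torch.uint8)              # map {-1, 0, 1} -> {0, 1, 2}
    u = u.reshape(*q.shape[:-1], -1, 4)      # group values in fours
    return u[..., 0] | (u[..., 1] << 2) | (u[..., 2] << 4) | (u[..., 3] << 6)


def unpack_2bit(p: torch.Tensor) -> torch.Tensor:
    """Inverse of pack_2bit: recover int8 values in {-1, 0, +1}."""
    shifts = torch.tensor([0, 2, 4, 6], dtype=torch.uint8, device=p.device)
    u = (p.unsqueeze(-1) >> shifts) & 3      # extract each 2-bit field
    return u.reshape(*p.shape[:-1], -1).to(torch.int8) - 1


if __name__ == "__main__":
    w = torch.randn(8, 16)
    q, scale = quantize_ternary(w)
    packed = pack_2bit(q)
    print(packed.shape, packed.dtype)        # 4x fewer elements than q
    print(torch.equal(unpack_2bit(packed), q))
```

Each trinary value only needs about 1.58 bits, so a 2-bit field leaves one code unused; the sketch accepts that waste in exchange for cheap shift-and-mask unpacking.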
I've set up a staging repo (Staging) with a number of items:
- A minimal, to-the-point library
- A training notebook for creating a full model, up to the point where we quantize and pack
- A cleaned-up minimal training example that runs as a script
- An example of the compiled kernel
This covers the initial groundwork for getting working trinary networks into torch.
- Example Quantization Method
- POC layer quantization
- Runnable example model with quantized layers (in progress: dtype and runnable model)
- AO dtype
- AO layer type (?) for bitnet linear
- Runnable example model with full dtype + bitnet linear layer, shippable
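As a rough picture of what the BitNet linear layer item could look like, here is a minimal sketch. It is not the planned AO layer type. It assumes weight-only absmean quantization with a straight-through estimator during training, and leaves out the activation quantization and normalization that the BitNet b1.58 paper also uses. `BitLinearSketch` is a hypothetical name.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class BitLinearSketch(nn.Linear):
    """Linear layer whose forward pass uses trinary weights (illustrative only)."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.weight
        # Per-tensor absmean scale (assumed scheme).
        scale = w.abs().mean().clamp(min=1e-5)
        # Quantize to {-1, 0, +1}, then rescale back to the weight range.
        w_q = (w / scale).round().clamp(-1, 1) * scale
        # Straight-through estimator: forward uses w_q, backward sees identity.
        w_ste = w + (w_q - w).detach()
        return F.linear(x, w_ste, self.bias)


if __name__ == "__main__":
    layer = BitLinearSketch(16, 4, bias=False)
    x = torch.randn(2, 16)
    y = layer(x)
    y.sum().backward()                       # gradients reach the latent fp weights
    print(y.shape, layer.weight.grad.shape)
```

The full-precision weights stay as the trainable parameters here; a shippable version would presumably store the packed 2-bit weights for inference instead of requantizing on every call.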