You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Are there fine-tuning and inference scripts available for int4 quantization in bloom-7b? Is it possible to limit the GPU memory usage to within 10GB? #10
I noticed that int8 quantization is available, but is there an option for int4 quantization?
What is the memory overhead for int4 and int8 when using LoRA or PTuning fine-tuning?
Additionally, are there inference scripts available for int4 quantization? How much GPU memory is required for int4 and int8 inference, respectively?