Google Colab RAM crash after 4-bit quantization

#68
by BlankHead - opened

What is the maximum level of quantization that can be applied, and what is the minimum RAM required to load the quantized model? It is not working with 4-bit quantization; the Colab session runs out of RAM and crashes.
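As a rough sanity check on the RAM question, the weights of a quantized model take roughly `num_params × bits / 8` bytes, plus some overhead for activations, the tokenizer, and framework buffers. The sketch below estimates this; the 7B parameter count and the 1.2× overhead factor are assumptions for illustration, not figures from this thread.

```python
def quantized_model_size_gb(num_params: float,
                            bits_per_param: float,
                            overhead: float = 1.2) -> float:
    """Rough RAM estimate (in GB) for loading a quantized model.

    num_params: total parameter count (e.g. 7e9 for a 7B model).
    bits_per_param: quantization width (4 for 4-bit, 8 for 8-bit).
    overhead: assumed multiplier for runtime buffers (hypothetical).
    """
    weight_bytes = num_params * bits_per_param / 8
    return weight_bytes * overhead / 1e9

# Example: a hypothetical 7B-parameter model in 4-bit
print(quantized_model_size_gb(7e9, 4))   # weights alone would be ~3.5 GB
```

So even in 4-bit, a 7B-class model needs several GB of free RAM just for weights, which can exceed what a free Colab instance has available after the runtime's own usage.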
