Can this model be quantized using bitsandbytes or some other method?

#30
by RonanMcGovern - opened

I see that GGUF quantization is possible, which means the model can be run with llama.cpp. Is there a way to quantize it for use with transformers or sentence-transformers as well, for example with bitsandbytes? Thanks
