Something may be wrong with this GGUF, specifically IQ4_XS?

by Nekotekina - opened

Hello. The model behaves strangely for me (some name distortions); it took me a few hours to figure out. I tried the IQ4_XS from bartowski, https://huggingface.co/bartowski/gemma-2-27b-it-GGUF/blob/main/gemma-2-27b-it-IQ4_XS.gguf, and it doesn't seem to have such issues. It was made with a different llama.cpp version, maybe that's related?
Sorry that I don't provide any examples; it's a bit complex to reproduce, as it's not just a simple chat.

Hey @Nekotekina, thank you for pointing this out.

This repo has indeed been quantized with an older version of llama.cpp (b3266), as opposed to the one from bartowski, which was done at b3389.

There is a Gemma 2-related PR from around that time: https://github.com/ggerganov/llama.cpp/pull/8197 (see also https://huggingface.co/bartowski/gemma-2-27b-it-GGUF/discussions/5)
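If you want to check whether a given GGUF already carries the Gemma 2 soft-capping metadata that this PR introduced, here is a rough sketch using the `gguf` Python package (`pip install gguf`). The exact key names (`gemma2.attn_logit_softcapping`, etc.) are my assumption about what the updated converter writes, so treat it as a quick diagnostic rather than a definitive test:

```python
from gguf import GGUFReader  # pip install gguf

# Path to the quantized file you want to inspect (placeholder).
reader = GGUFReader("gemma-2-27b-it-IQ4_XS.gguf")

# Print every metadata key stored in the file.
for name in reader.fields:
    print(name)

# Keys I'd expect a post-fix conversion to include (assumed names).
expected = [
    "gemma2.attn_logit_softcapping",
    "gemma2.final_logit_softcapping",
    "gemma2.attention.sliding_window",
]
missing = [k for k in expected if k not in reader.fields]
print("missing keys:", missing or "none")
```

If those keys are absent, the file was most likely converted before the fix landed.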

Looks like I will have to re-quantize this model.
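For reference, the re-quant usually boils down to the two llama.cpp steps sketched below. Paths and output names are placeholders, and the `convert_hf_to_gguf.py` / `llama-quantize` names are what I'd expect from a build around b3389, so double-check them against your checkout:

```python
import subprocess

# Placeholder paths: a llama.cpp checkout at or after b3389 and a local
# snapshot of the original google/gemma-2-27b-it weights.
LLAMA_CPP = "/path/to/llama.cpp"
MODEL_DIR = "/path/to/gemma-2-27b-it"

# 1) Convert the HF weights to an unquantized f16 GGUF.
subprocess.run(
    ["python", f"{LLAMA_CPP}/convert_hf_to_gguf.py", MODEL_DIR,
     "--outfile", "gemma-2-27b-it-f16.gguf", "--outtype", "f16"],
    check=True,
)

# 2) Quantize to IQ4_XS with the llama-quantize binary built from the same revision.
subprocess.run(
    [f"{LLAMA_CPP}/llama-quantize",
     "gemma-2-27b-it-f16.gguf", "gemma-2-27b-it-IQ4_XS.gguf", "IQ4_XS"],
    check=True,
)
```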
