Something may be wrong with this GGUF, specifically IQ4_XS?

by Nekotekina - opened

Hello. The model behaves strangely for me (some name distortions); it took me a few hours to figure out. I tried the IQ4_XS from bartowski, https://huggingface.co/bartowski/gemma-2-27b-it-GGUF/blob/main/gemma-2-27b-it-IQ4_XS.gguf, and it doesn't seem to have such issues. It was made with a different llama.cpp version, maybe that's related?
Sorry that I don't provide any examples; it's a bit complex to reproduce, as it's not just a simple chat.

Hey @Nekotekina, thank you for pointing this out.

This repo has indeed been quantized with an older version of llama.cpp (b3266), as opposed to the one from bartowski, which was done at b3389.

There is a Gemma 2-related PR from around that time: https://github.com/ggerganov/llama.cpp/pull/8197 (see also https://huggingface.co/bartowski/gemma-2-27b-it-GGUF/discussions/5)
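If you want to check whether a given GGUF already carries the Gemma 2 soft-capping metadata that this PR introduced, here is a rough sketch using the `gguf` Python package (`pip install gguf`). The exact key names (`gemma2.attn_logit_softcapping`, etc.) are my assumption about what the updated converter writes, so treat it as a quick diagnostic rather than a definitive test:

```python
from gguf import GGUFReader  # pip install gguf

# Path to the quantized file you want to inspect (placeholder).
reader = GGUFReader("gemma-2-27b-it-IQ4_XS.gguf")

# Print every metadata key stored in the file.
for name in reader.fields:
    print(name)

# Keys I'd expect a post-fix conversion to include (assumed names).
expected = [
    "gemma2.attn_logit_softcapping",
    "gemma2.final_logit_softcapping",
    "gemma2.attention.sliding_window",
]
missing = [k for k in expected if k not in reader.fields]
print("missing keys:", missing or "none")
```

If those keys are absent, the file was most likely converted before the fix landed.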

Looks like I will have to re-quantize this model.
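For reference, the re-quant usually boils down to the two llama.cpp steps sketched below. Paths and output names are placeholders, and the `convert_hf_to_gguf.py` / `llama-quantize` names are what I'd expect from a build around b3389, so double-check them against your checkout:

```python
import subprocess

# Placeholder paths: a llama.cpp checkout at or after b3389 and a local
# snapshot of the original google/gemma-2-27b-it weights.
LLAMA_CPP = "/path/to/llama.cpp"
MODEL_DIR = "/path/to/gemma-2-27b-it"

# 1) Convert the HF weights to an unquantized f16 GGUF.
subprocess.run(
    ["python", f"{LLAMA_CPP}/convert_hf_to_gguf.py", MODEL_DIR,
     "--outfile", "gemma-2-27b-it-f16.gguf", "--outtype", "f16"],
    check=True,
)

# 2) Quantize to IQ4_XS with the llama-quantize binary built from the same revision.
subprocess.run(
    [f"{LLAMA_CPP}/llama-quantize",
     "gemma-2-27b-it-f16.gguf", "gemma-2-27b-it-IQ4_XS.gguf", "IQ4_XS"],
    check=True,
)
```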
