Using the model in ctransformers

#1 by PatrickSchwabl

Hey bartowski,

I am trying to use your quantized version in ctransformers but I can't get it to load correctly.

from ctransformers import AutoModelForCausalLM

try:
    llm = AutoModelForCausalLM.from_pretrained(
        "bartowski/Llama-3.1-SauerkrautLM-8b-Instruct-GGUF",
        model_file="Llama-3.1-SauerkrautLM-8b-Instruct-Q6_K_L.gguf",
        model_type="llama",
        gpu_layers=0,
        context_length=2048,
    )
    print("Model loaded successfully")
except Exception as e:
    print(f"Error loading model: {e}")

This keeps throwing: "RuntimeError: Failed to create LLM 'llama2' from '/home/jovyan/.cache/huggingface/hub/models--bartowski--Llama-3.1-SauerkrautLM-8b-Instruct-GGUF/blobs/7a5b0d4528966fd00dd378b8edf40e96dd65e839078feac7d4f4ab383fbe551b'."

So the model_type seems to be the issue. It throws the same error for "llama2", "llama3", and None. Google and Copilot could not help me.

What do I have to pass as model_type?

Thanks!

it's possible ctransformers was never updated with Llama 3 support, as there haven't been any commits to it since September of last year :(

maybe try llama-cpp-python?
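
something like this should work (just a rough sketch, untested, assuming a recent llama-cpp-python with huggingface_hub installed so it can pull the GGUF straight from the Hub):

from llama_cpp import Llama

# Rough sketch: download the GGUF from the Hub and load it on CPU.
llm = Llama.from_pretrained(
    repo_id="bartowski/Llama-3.1-SauerkrautLM-8b-Instruct-GGUF",
    filename="Llama-3.1-SauerkrautLM-8b-Instruct-Q6_K_L.gguf",
    n_ctx=2048,      # same context length as before
    n_gpu_layers=0,  # CPU only, like gpu_layers=0 in ctransformers
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hello, who are you?"}]
)
print(out["choices"][0]["message"]["content"])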

All right, thanks for the quick reply. I'll try!
