Wrong number of tensors when loading the model.

#3
by CharlaDev - opened

Hi!
I've downloaded the bf16 version of the model and am now trying to load it with LlamaCpp, but I get this error:

```
llama_model_load: error loading model: done_getting_tensors: wrong number of tensors; expected 292, got 291
llama_load_model_from_file: failed to load model
```

I should mention that this isn't a bug in my code, since I've successfully loaded other models with the same code. It looks to me like a bug in the model config(?).

The Q8 version also fails to load with exactly the same error.

Which commit of llama.cpp are you on?

Yeah, you'll need to update to a more recent llama.cpp commit. That extra tensor is the rope frequencies tensor, which was added fairly recently.
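For context: the "expected 292, got 291" comes from comparing the tensor count declared in the GGUF file's header against the tensors the loader actually recognizes; an older llama.cpp build doesn't know about the newer rope-frequencies tensor, so it comes up one short. As a rough illustration (a sketch, not llama.cpp's actual loader code), the declared count can be read straight from the GGUF header:

```python
import struct
import tempfile
import os

GGUF_MAGIC = b"GGUF"

def read_gguf_header(path):
    """Return (version, tensor_count) from a GGUF file header.

    GGUF v2/v3 header layout (little-endian):
      magic            : 4 bytes, "GGUF"
      version          : uint32
      tensor_count     : uint64
      metadata_kv_count: uint64
    """
    with open(path, "rb") as f:
        if f.read(4) != GGUF_MAGIC:
            raise ValueError("not a GGUF file")
        version, tensor_count, _kv_count = struct.unpack("<IQQ", f.read(20))
    return version, tensor_count

# Demo: craft a minimal fake header declaring 292 tensors, as in the error above.
path = os.path.join(tempfile.gettempdir(), "fake.gguf")
with open(path, "wb") as f:
    f.write(GGUF_MAGIC + struct.pack("<IQQ", 3, 292, 0))

print(read_gguf_header(path))  # (3, 292)
```

So the file itself is fine; the mismatch only means the reader is too old to account for every tensor the header promises.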

Solution

Indeed, upgrading LlamaCpp (and the LangChain wrapper) fixed the issue:

```
CMAKE_ARGS="-DGGML_CUDA=on" pip install --upgrade --force-reinstall --no-cache-dir llama-cpp-python
pip install --upgrade langchain-core langchain-community langchain
```
CharlaDev changed discussion status to closed
