lm head weights missing?

#3
by mdoeir - opened

Hi, the lm_head's weights seem to be missing from the repo?

from safetensors import safe_open

op = safe_open("model.safetensors", framework="pt")
print([k for k in op.keys() if "head" in k])  # -> gives an empty list

Just to clarify, I'm not using the transformers lib from your branch, but just looking into the implementation here.

Another question: in llava-ov's official checkpoint the vocab_size should be 151646, but it's 152000 here?
https://huggingface.co/llava-hf/llava-onevision-qwen2-0.5b-ov-hf/blob/66d62be83612d49460229cc26117ec2e81f802ef/config.json#L173
I understand that this should not affect the inference result as long as we use the correct tokenizer, but I'm just curious why that is.
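For context, this is roughly how I checked the embedding size on disk (the key name is my guess at the llava-hf layout, so adjust it if it differs):

from safetensors import safe_open

with safe_open("model.safetensors", framework="pt") as f:
    # Key name assumed from the llava-hf naming scheme.
    embed = f.get_tensor("language_model.model.embed_tokens.weight")

print(embed.shape)  # first dim is the padded vocab size (152000), not 151646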

Llava Hugging Face org

Hey! Yes, the lm_head weight is not saved because the weights of the lm head and the embedding layer are tied; in other words, they share the same parameters.
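If it helps, here is a quick way to see the tying (a minimal sketch, assuming a transformers version that includes LlavaOnevision):

from transformers import LlavaOnevisionForConditionalGeneration

model = LlavaOnevisionForConditionalGeneration.from_pretrained(
    "llava-hf/llava-onevision-qwen2-0.5b-ov-hf"
)
# With tie_word_embeddings=True the lm_head is rebuilt at load time from the
# embedding matrix, so both point at the same parameter and nothing extra
# needs to be stored in the checkpoint.
print(model.get_output_embeddings().weight is model.get_input_embeddings().weight)  # True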

The embedding shape changed because we added a special image token to mark where the image embeddings go, and padded the rest for efficient computation. It will not affect the result as long as the tokenization is correct :)
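For reference, the usual recipe looks roughly like this (the base checkpoint and the padding multiple below are illustrative assumptions, not necessarily what the conversion script used):

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-0.5B-Instruct")
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-0.5B-Instruct")

# Add the special image placeholder token, then grow the embedding matrix and
# pad it up to a convenient multiple so the matmul shapes stay hardware-friendly.
tokenizer.add_tokens("<image>", special_tokens=True)
model.resize_token_embeddings(len(tokenizer), pad_to_multiple_of=64)

print(model.get_input_embeddings().weight.shape[0])  # padded vocab size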

I see. Many thanks!

mdoeir changed discussion status to closed
