load model error

#1 opened by hshc

Whether I use model = AutoModel.from_pretrained("Efficient-Large-Model/Llama-3-LongVILA-8B-1024Frames") to download the model, or download it first and load it locally, the following error occurs. It has already been reported in the project's GitHub issues (https://github.com/NVlabs/VILA/issues/135) but remains unresolved so far. Could you please help? Thank you.


File "/eval_vision_niah.py", line 250, in inference
tokenizer = AutoTokenizer.from_pretrained(
File "/.local/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 819, in from_pretrained
config = AutoConfig.from_pretrained(
File "/.local/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 947, in from_pretrained
raise ValueError(
ValueError: The checkpoint you are trying to load has model type llava_llama but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.
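For context, this error usually means stock Transformers has never heard of the checkpoint's custom model_type, here llava_llama, which is defined by the VILA codebase rather than by any Transformers release. The common workaround for such checkpoints is to install the authors' repo so their code registers the architecture before loading. A minimal sketch, assuming the NVlabs/VILA llava package registers llava_llama with AutoConfig as a side effect of import (the exact module path and registration mechanism are assumptions, not verified here):

# First install the authors' code:
#   git clone https://github.com/NVlabs/VILA
#   cd VILA && pip install -e .

import llava.model  # assumption: importing the VILA package registers llava_llama with Transformers
from transformers import AutoConfig

model_path = "Efficient-Large-Model/Llama-3-LongVILA-8B-1024Frames"
config = AutoConfig.from_pretrained(model_path)  # should now resolve model_type "llava_llama"

If the repo follows the LLaVA-style layout, its own loading helper (rather than plain AutoModel) may be the intended entry point; check the VILA README for the exact call.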

I'm getting the same error. I tried pip uninstall transformers followed by pip install transformers --no-cache, but it still fails.
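Reinstalling or upgrading Transformers cannot fix this by itself: llava_llama is not part of any Transformers release, so the architecture has to come from the authors' code. If the Hugging Face repo happened to ship its own modeling code with an auto_map (not verified here), passing trust_remote_code=True would be another option, sketched below:

from transformers import AutoModel

model = AutoModel.from_pretrained(
    "Efficient-Large-Model/Llama-3-LongVILA-8B-1024Frames",
    trust_remote_code=True,  # only helps if the repo ships custom modeling code (assumption)
)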
