How to run the model OpenGVLab/InternVL2-40B-AWQ with vllm docker image?

#2
by andryevinnik - opened

I used the following command:
docker run --log-opt max-size=10m --log-opt max-file=1 --rm -it \
  --gpus '"device=0"' -p 8080:8000 \
  --mount type=bind,source=/ssd_2/huggingface,target=/root/.cache/huggingface \
  vllm/vllm-openai:v0.5.4 \
  --model OpenGVLab/InternVL2-40B-AWQ --max-model-len 8192 --trust-remote-code

It fails with:
KeyError: 'model.layers.51.mlp.gate_up_proj.qweight'

I also tried adding --dtype half, but got the same error.

Adding -q awq instead gives:
ValueError: Cannot find the config file for awq

Please advise on how to run this model.

OpenGVLab org

Our AWQ model was generated with LMDeploy, and vLLM only recently added support for InternVL. Please update vLLM and try again.
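Since the quantized weights were produced by LMDeploy, one option is to serve the model with LMDeploy itself rather than vLLM. A minimal sketch, assuming LMDeploy is installed and that the TurboMind backend supports this AWQ checkpoint on your GPU (port and cache path are illustrative):

```shell
# Install LMDeploy (version is an assumption; pick one compatible with your setup)
pip install lmdeploy

# Serve the AWQ checkpoint with an OpenAI-compatible API server.
# --model-format awq tells TurboMind the weights are AWQ-quantized.
lmdeploy serve api_server OpenGVLab/InternVL2-40B-AWQ \
  --backend turbomind \
  --model-format awq \
  --server-port 23333
```

Alternatively, if you want to stay on vLLM, pulling a vLLM docker image newer than v0.5.4 (the version in the original command) and rerunning the same `docker run` invocation is worth trying, since InternVL support landed only recently.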

czczup changed discussion status to closed
