How to run the model OpenGVLab/InternVL2-40B-AWQ with vllm docker image?

#2
by andryevinnik - opened

I used the following command:
docker run --log-opt max-size=10m --log-opt max-file=1 --rm -it \
  --gpus '"device=0"' -p 8080:8000 \
  --mount type=bind,source=/ssd_2/huggingface,target=/root/.cache/huggingface \
  vllm/vllm-openai:v0.5.4 \
  --model OpenGVLab/InternVL2-40B-AWQ --max-model-len 8192 --trust-remote-code

It fails with:
KeyError: 'model.layers.51.mlp.gate_up_proj.qweight'

I also tried adding --dtype half, but got the same error.

Adding -q awq instead gives:
ValueError: Cannot find the config file for awq

Please advise on how to run this model.

OpenGVLab org

Our AWQ model was generated with LMDeploy, and vLLM only recently added support for InternVL. Please update vLLM and try again.
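Since the quantized weights were produced by LMDeploy, one option is to serve the model with LMDeploy itself rather than vLLM. A minimal sketch, assuming LMDeploy is installed and that the TurboMind backend supports this AWQ checkpoint on your GPU (port and cache path are illustrative):

```shell
# Install LMDeploy (version is an assumption; pick one compatible with your setup)
pip install lmdeploy

# Serve the AWQ checkpoint with an OpenAI-compatible API server.
# --model-format awq tells TurboMind the weights are AWQ-quantized.
lmdeploy serve api_server OpenGVLab/InternVL2-40B-AWQ \
  --backend turbomind \
  --model-format awq \
  --server-port 23333
```

Alternatively, if you want to stay on vLLM, pulling a vLLM docker image newer than v0.5.4 (the version in the original command) and rerunning the same `docker run` invocation is worth trying, since InternVL support landed only recently.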

czczup changed discussion status to closed
