---
library_name: transformers
tags:
  - aqlm
  - llama
  - facebook
  - meta
  - llama-3
  - conversational
  - text-generation-inference
base_model: meta-llama/Meta-Llama-3.1-70B-Instruct
---

Official AQLM quantization of `meta-llama/Meta-Llama-3.1-70B-Instruct`, fine-tuned with PV-Tuning.

This quantization uses 1 codebook of 16 bits and a group size of 8 (the `1x16g8` scheme).
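As a rough illustration of what these settings imply, the sketch below computes the index cost per weight: each group of 8 weights is represented by one 16-bit code per codebook. The function name `code_bits_per_weight` is hypothetical and used only for this example. The calculation covers code indices only; it does not include the codebooks themselves, any scales, or layers left unquantized, so the real on-disk footprint is larger.

```python
def code_bits_per_weight(num_codebooks: int, bits_per_code: int, group_size: int) -> float:
    """Bits spent on code indices per weight (hypothetical helper for illustration).

    Each group of `group_size` weights stores one `bits_per_code`-bit index
    per codebook. Codebook storage and scales are not counted here.
    """
    return num_codebooks * bits_per_code / group_size


# Settings stated in this model card: 1 codebook, 16 bits, group size 8.
bits = code_bits_per_weight(num_codebooks=1, bits_per_code=16, group_size=8)
print(f"{bits} bits per weight (indices only)")
```

This is why the `1x16g8` configuration is usually described as a roughly 2-bit quantization.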

Results:

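| Model | Quantization | MMLU (5-shot) | ArcC | ArcE | Hellaswag | PiQA | Winogrande | Model size, GB |
|-------|--------------|---------------|------|------|-----------|------|------------|----------------|
|       | fp16         | 0.8213        | 0.6246 | 0.8683 | 0.6516  | 0.8313 | 0.7908   | 141            |
|       | 1x16g8       | 0.7814        | 0.5478 | 0.8270 | 0.6284  | 0.8036 | 0.7814   | 21.9           |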
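
To make the trade-off in the results easier to read, the following sketch derives the size reduction and the per-benchmark score difference from the numbers reported above. It is plain arithmetic on the table values; the variable names are chosen for this example and are not part of any library.

```python
# Scores and sizes copied from the results table in this model card.
fp16 = {
    "MMLU (5-shot)": 0.8213, "ArcC": 0.6246, "ArcE": 0.8683,
    "Hellaswag": 0.6516, "PiQA": 0.8313, "Winogrande": 0.7908,
}
aqlm_1x16g8 = {
    "MMLU (5-shot)": 0.7814, "ArcC": 0.5478, "ArcE": 0.8270,
    "Hellaswag": 0.6284, "PiQA": 0.8036, "Winogrande": 0.7814,
}
size_gb = {"fp16": 141.0, "1x16g8": 21.9}

# How many times smaller the quantized checkpoint is.
compression_ratio = size_gb["fp16"] / size_gb["1x16g8"]

# Absolute score drop per benchmark (fp16 minus quantized).
drops = {name: fp16[name] - aqlm_1x16g8[name] for name in fp16}

print(f"Size reduction: {compression_ratio:.2f}x")
for name, drop in drops.items():
    print(f"{name}: -{drop:.4f}")
```

On these figures the checkpoint shrinks by a factor of about 6.4, with the largest absolute drop on ArcC and the smallest on Winogrande.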