dfurman committed 3b7a158 (parent: a1190de): Update README.md
  This instruction model was built via parameter-efficient QLoRA finetuning of [llama-2-70b](https://huggingface.co/meta-llama/Llama-2-70b-hf) on the first 25k rows of [ehartford/dolphin](https://huggingface.co/datasets/ehartford/dolphin) (an open-source implementation of [Microsoft's Orca](https://www.microsoft.com/en-us/research/publication/orca-progressive-learning-from-complex-explanation-traces-of-gpt-4/)). Finetuning was executed on a single H100 (80 GB PCIe) for roughly 17 hours on the [Lambda Labs](https://cloud.lambdalabs.com/instances) platform.
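As a rough sketch of what a QLoRA setup looks like with `transformers` and `peft` (the authoritative hyperparameters are in the finetuning script linked below; the values here are illustrative assumptions, not necessarily the ones used for this model):

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization of the frozen base model (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Small trainable low-rank adapters on top of the quantized base (the "LoRA" part).
# r, lora_alpha, and target_modules are illustrative placeholders.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
```

Only the adapter weights are updated during training, which is what makes finetuning a 70B model on a single 80 GB GPU feasible.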
### Benchmark metrics

| Metric              | Value |
|---------------------|-------|
| MMLU (5-shot)       | 69.18 |
| ARC (25-shot)       | 69.62 |
| HellaSwag (10-shot) | 86.82 |
| TruthfulQA (0-shot) | 57.43 |
| Avg.                | 70.76 |

We use the state-of-the-art [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) to run the benchmark tests above, using the same version as Hugging Face's [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).
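The reported average is simply the unweighted mean of the four benchmark scores, which is easy to check:

```python
# Verify the reported "Avg." row from the four individual benchmark scores.
scores = {
    "MMLU (5-shot)": 69.18,
    "ARC (25-shot)": 69.62,
    "HellaSwag (10-shot)": 86.82,
    "TruthfulQA (0-shot)": 57.43,
}

avg = round(sum(scores.values()) / len(scores), 2)
print(avg)  # 70.76
```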

### Helpful Links

* Model license: Llama 2 Community License Agreement
* Basic usage: [notebook](assets/basic_inference_llama_2_70b_dolphin.ipynb)
* Finetuning code: [script](https://github.com/daniel-furman/sft-demos/blob/main/src/sft/one_gpu/llama-2/dolphin/sft-llama-2-70b-dolphin-peft.py)