dfurman committed 3b7a158 (parent: a1190de): Update README.md
  This instruction model was built via parameter-efficient QLoRA finetuning of [llama-2-70b](https://huggingface.co/meta-llama/Llama-2-70b-hf) on the first 25k rows of [ehartford/dolphin](https://huggingface.co/datasets/ehartford/dolphin) (an open-source implementation of [Microsoft's Orca](https://www.microsoft.com/en-us/research/publication/orca-progressive-learning-from-complex-explanation-traces-of-gpt-4/)). Finetuning was executed on a single H100 (80 GB PCIe) for roughly 17 hours on the [Lambda Labs](https://cloud.lambdalabs.com/instances) platform.
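As a rough sketch of what a QLoRA setup looks like with `transformers` and `peft` (the authoritative hyperparameters are in the finetuning script linked below; the values here are illustrative assumptions, not necessarily the ones used for this model):

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization of the frozen base model (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Small trainable low-rank adapters on top of the quantized base (the "LoRA" part).
# r, lora_alpha, and target_modules are illustrative placeholders.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
```

Only the adapter weights are updated during training, which is what makes finetuning a 70B model on a single 80 GB GPU feasible.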
### Benchmark metrics

| Metric              | Value |
|---------------------|-------|
| MMLU (5-shot)       | 69.18 |
| ARC (25-shot)       | 69.62 |
| HellaSwag (10-shot) | 86.82 |
| TruthfulQA (0-shot) | 57.43 |
| Avg.                | 70.76 |

We use the state-of-the-art [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) to run the benchmark tests above, using the same version as Hugging Face's [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).
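The reported average is simply the unweighted mean of the four benchmark scores, which is easy to check:

```python
# Verify the reported "Avg." row from the four individual benchmark scores.
scores = {
    "MMLU (5-shot)": 69.18,
    "ARC (25-shot)": 69.62,
    "HellaSwag (10-shot)": 86.82,
    "TruthfulQA (0-shot)": 57.43,
}

avg = round(sum(scores.values()) / len(scores), 2)
print(avg)  # 70.76
```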

### Helpful Links

* Model license: Llama 2 Community License Agreement
* Basic usage: [notebook](assets/basic_inference_llama_2_70b_dolphin.ipynb)
* Finetuning code: [script](https://github.com/daniel-furman/sft-demos/blob/main/src/sft/one_gpu/llama-2/dolphin/sft-llama-2-70b-dolphin-peft.py)