Text Generation
PEFT
Safetensors
llama-2
Eval Results
dfurman committed
Commit c8c857f
1 Parent(s): e88b0d5

Update README.md

Files changed (1)
  1. README.md +8 -9
README.md CHANGED
@@ -14,7 +14,7 @@ base_model: meta-llama/Llama-2-70b-hf
 
 This instruction model was built via parameter-efficient QLoRA finetuning of [llama-2-70b](https://huggingface.co/meta-llama/Llama-2-70b-hf) on the first 25k rows of [ehartford/dolphin](https://huggingface.co/datasets/ehartford/dolphin) (an open-source implementation of [Microsoft's Orca](https://www.microsoft.com/en-us/research/publication/orca-progressive-learning-from-complex-explanation-traces-of-gpt-4/)). Finetuning was executed on a single H100 (80 GB PCIe) for roughly 17 hours on the [Lambda Labs](https://cloud.lambdalabs.com/instances) platform.
 
-### Benchmark metrics
+## Benchmark metrics
 
 | Metric | Value |
 |-----------------------|-------|
@@ -26,7 +26,7 @@ This instruction model was built via parameter-efficient QLoRA finetuning of [ll
 
 We use state-of-the-art [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) to run the benchmark tests above, using the same version as Hugging Face's [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).
 
-### Helpful links
+## Helpful links
 
 * Model license: Llama 2 Community License Agreement
 * Basic usage: [notebook](assets/basic_inference_llama_2_dolphin.ipynb)
@@ -40,7 +40,7 @@ We use state-of-the-art [Language Model Evaluation Harness](https://github.com/E
 
 The above loss curve was generated from the run's private wandb.ai log.
 
-### Example prompts and responses
+## Example prompts and responses
 
 Example 1:
 
@@ -136,7 +136,7 @@ The llama-2-70b models have been modified from a standard transformer in the fol
 | sequence length | 4096 |
 | grouped-query attention | ✔️ |
 
-## PreTraining data
+## Pre-training data
 
 For more details on the pretraining process, see [Llama-2-70b-hf](https://huggingface.co/meta-llama/Llama-2-70b-hf).
 
@@ -150,9 +150,9 @@ This model can produce factually incorrect output, and should not be relied on t
 This model was trained on various public datasets.
 While great efforts have been taken to clean the pretraining data, it is possible that this model could generate lewd, biased or otherwise offensive outputs.
 
-## How to use
+## Basic usage
 
-Basic usage: [notebook](assets/basic_inference_llama_2_dolphin.ipynb)
+* [notebook](assets/basic_inference_llama_2_dolphin.ipynb)
 
 ```python
 !pip install -q -U huggingface_hub peft transformers torch accelerate
@@ -221,8 +221,7 @@ with torch.autocast("cuda", dtype=torch.bfloat16):
 print(tokenizer.decode(output["sequences"][0], skip_special_tokens=True))
 ```
 
-
-### Runtime tests
+## Runtime tests
 
 | runtime / 50 tokens (sec) | GPU | attn | torch dtype | VRAM (GB) |
 |:-----------------------------:|:----------------------:|:---------------------:|:-------------:|:-----------------------:|
@@ -253,7 +252,7 @@ The license on this model does not constitute legal advice. We are not responsib
 
 ---
 
-### Framework versions
+## Framework versions
 
 
 - PEFT 0.5.0.dev0
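
The `## Basic usage` section in the diff points to a notebook, and the surviving context lines of its Python snippet show only the `pip install` line and the final `tokenizer.decode` call. For readers without the notebook, here is a minimal sketch of that flow under stated assumptions: the adapter repo id below is a placeholder (not taken from this commit), and the prompt is illustrative rather than the model's actual instruction template.

```python
# Minimal usage sketch (assumptions noted in comments), not the notebook from this commit:
# load the llama-2-70b base model, attach the QLoRA adapter via PEFT, and generate.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model_id = "meta-llama/Llama-2-70b-hf"
adapter_id = "<user>/<llama-2-70b-dolphin-adapter>"  # placeholder: substitute the actual PEFT repo id

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.bfloat16,  # see the README's runtime table for dtype / VRAM trade-offs
    device_map="auto",           # shard the 70B weights across available GPUs
)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

# Illustrative prompt only; the model's actual instruction template may differ.
prompt = "Write a numbered list of things to do in New York City.\n"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")

with torch.autocast("cuda", dtype=torch.bfloat16):
    output = model.generate(
        **inputs,
        max_new_tokens=50,
        return_dict_in_generate=True,
    )

print(tokenizer.decode(output["sequences"][0], skip_special_tokens=True))
```

On a single smaller GPU, the base model could instead be loaded in 4-bit by passing `quantization_config=BitsAndBytesConfig(load_in_4bit=True)` from `transformers`, which mirrors the QLoRA-style quantized setup the README describes for finetuning.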