metadata
license: apache-2.0
datasets:
  - totally-not-an-llm/EverythingLM-data-V2-sharegpt
language:
  - en
library_name: transformers

Trained for 3 epochs on the totally-not-an-llm/EverythingLM-data-V2-sharegpt dataset.

### HUMAN:
{prompt}

### RESPONSE:
<leave a newline for the model to answer>
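For anyone scripting against the template above, here is a minimal sketch of building the prompt string in Python. The helper name `build_prompt` is made up for illustration, and the blank line between the two blocks is an assumption copied from how the template is laid out in this card.

```python
def build_prompt(user_message: str) -> str:
    """Wrap a user message in the HUMAN/RESPONSE template shown in this card.

    Hypothetical helper, not part of any library.
    """
    # The string ends with a newline after "### RESPONSE:" so the model
    # starts its answer on a fresh line, as the template asks.
    return f"### HUMAN:\n{user_message}\n\n### RESPONSE:\n"


if __name__ == "__main__":
    print(build_prompt("Write a haiku about the sea."))
```

The returned string can be passed to whatever tokenizer and generation call you already use.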

Note: I changed a few of the finetuning parameters this time around. I have no idea if it's any good, but feel free to give it a try!

Built with Axolotl

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric              | Value |
|---------------------|-------|
| Avg.                | 36.29 |
| ARC (25-shot)       | 42.83 |
| HellaSwag (10-shot) | 73.28 |
| MMLU (5-shot)       | 26.87 |
| TruthfulQA (0-shot) | 37.26 |
| Winogrande (5-shot) | 66.61 |
| GSM8K (5-shot)      | 1.59  |
| DROP (3-shot)       | 5.61  |
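
The Avg. row appears to be the plain unweighted mean of the seven benchmark scores. That is an assumption based on the numbers listed here, not something this card states. A quick sketch to check it:

```python
# Scores copied from the results listed in this card.
scores = {
    "ARC (25-shot)": 42.83,
    "HellaSwag (10-shot)": 73.28,
    "MMLU (5-shot)": 26.87,
    "TruthfulQA (0-shot)": 37.26,
    "Winogrande (5-shot)": 66.61,
    "GSM8K (5-shot)": 1.59,
    "DROP (3-shot)": 5.61,
}

# Assumption: the average is an unweighted arithmetic mean over all seven tasks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # prints 36.29
```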