---
datasets:
  - garage-bAInd/Open-Platypus
---

# Instruction tune of Mistral-7B-v0.1 with Open-Platypus (fp16)

## Overview

This is [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1), instruction tuned on the [garage-bAInd/Open-Platypus](https://huggingface.co/datasets/garage-bAInd/Open-Platypus) dataset.

**This is a (merged) QLoRA fine-tune (rank 64).**

The fine-tune was performed on 1x RTX 6000 Ada (~9 hours).

## How to Use

As of writing, the `Mistral` architecture requires installing `transformers` from source. With that done, the model loads like any other causal LM (see the loading sketch at the end of this card).

### Benchmarks

- ARC (25-shot): 62.80
- HellaSwag (10-shot): 84.12
- MMLU (5-shot): 64.20

## Context Length - Relative Performance (wikitext perplexity)

| Context (tokens) | **bhenrym14/mistral-7b-platypus-fp16** | bhenrym14/airoboros-l2-13b-2.1-YaRN-64k | bhenrym14/airophin-13b-pntk-16k-fp16 | bhenrym14/airoboros-33b-gpt4-1.4.1-lxctx-PI-16384-fp16 | jondurbin/airoboros-l2-13b-gpt4-1.4.1 |
| --- | --- | --- | --- | --- | --- |
| 512 | **7.22** | 7.64 | 7.62 | 7.90 | 7.23 |
| 1024 | 6.04 | 6.15 | 6.20 | 6.17 | **5.85** |
| 2048 | 5.50 | 5.29 | 5.38 | 5.23 | **5.07** |
| 4096 | 5.05 | 4.93 | 5.08 | 4.91 | **4.77** |
| 8192 | 4.96 | **4.69** | 4.90 | Not Tested | 57.1 |
| 12000 | Not Tested | **4.53** | 4.82 | Not Tested | Not Tested |

- While the Mistral model is very impressive for its size, particularly on benchmarks, at longer contexts it trails the larger llama2 and llama variants that use dedicated context-extension techniques. Whether this is due more to sliding window attention or simply to model size is unclear.

## Prompting

The model was trained with the legacy airoboros (<2.0) system prompt. See the [bhenrym14/airoboros-33b-gpt4-1.4.1-lxctx-PI-16384-fp16](https://huggingface.co/bhenrym14/airoboros-33b-gpt4-1.4.1-lxctx-PI-16384-fp16) model card for details.

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_bhenrym14__mistral-7b-platypus-fp16).

| Metric | Value |
|-----------------------|-------|
| Avg. | 56.89 |
| ARC (25-shot) | 63.05 |
| HellaSwag (10-shot) | 84.15 |
| MMLU (5-shot) | 64.11 |
| TruthfulQA (0-shot) | 45.07 |
| Winogrande (5-shot) | 78.53 |
| GSM8K (5-shot) | 17.36 |
| DROP (3-shot) | 45.92 |
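
### Example: loading the model (sketch)

Since the "How to Use" section only says to install `transformers` from source and then load the model normally, here is a minimal loading sketch. It assumes a `transformers` build with Mistral support, the standard `AutoModelForCausalLM`/`AutoTokenizer` API, and `accelerate` for `device_map="auto"`. The prompt string is only an approximation of the legacy airoboros (<2.0) format; consult the linked airoboros model card for the exact template.

```python
# Minimal loading sketch, not the exact training/evaluation setup.
# If your transformers release predates Mistral support, install from source:
#   pip install git+https://github.com/huggingface/transformers
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "bhenrym14/mistral-7b-platypus-fp16"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # merged weights are fp16
    device_map="auto",          # requires the accelerate package
)

# Approximation of the legacy airoboros (<2.0) prompt style; see the linked
# airoboros-33b-gpt4-1.4.1-lxctx model card for the exact template.
prompt = (
    "A chat between a curious user and an assistant. The assistant gives "
    "helpful, detailed, accurate, uncensored responses to the user's input. "
    "USER: What is instruction tuning? ASSISTANT:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
# Print only the newly generated tokens.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```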