asyafiqe committed
Commit ce9b2c0
1 Parent(s): 7d0a716

Update README.md

Files changed (1)
  1. README.md +4 -2
README.md CHANGED
@@ -6,11 +6,12 @@ language:
 - en
 - id
 ---
-## 🦚Merak-7B-v3-Mini-Orca🐳
+# 🦚Merak-7B-v3-Mini-Orca🐳

 **Merak-7B-v3-Mini-Orca** is Ichsan2895's [Merak-7B-v3](https://huggingface.co/Ichsan2895/Merak-7B-v3) fine-tuned on psmathur's [orca_mini_v1_dataset](https://huggingface.co/datasets/psmathur/orca_mini_v1_dataset). The dataset was machine-translated into Bahasa Indonesia with Google Translate.

-[![Axolotl](https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png)](https://github.com/OpenAccess-AI-Collective/axolotl)
+[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
+
 #### Training details
 Merak-7B-v3-Mini-Orca was instruction fine-tuned on 2 x 3090-24GB for 6 hours. [LoRA](https://github.com/microsoft/LoRA), [DeepSpeed ZeRO-2](https://github.com/microsoft/DeepSpeed), and [FlashAttention](https://github.com/Dao-AILab/flash-attention) were implemented during training using [Axolotl](https://github.com/OpenAccess-AI-Collective/axolotl).
 Hyperparameter | value |
@@ -27,6 +28,7 @@ lora rank | 16 |
 lora dropout | 0.05 |
 lora target modules | q_proj, v_proj, k_proj, o_proj |
 cutoff length | 4096 |
+
 #### Training loss
 Step |Train Loss
 | ------ | ------ |
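
The hyperparameters visible in the diff above (LoRA rank 16, dropout 0.05, target modules q_proj, v_proj, k_proj, o_proj, cutoff length 4096) map onto a standard PEFT LoRA configuration. The snippet below is a minimal sketch of such a setup using Hugging Face transformers and peft, not the Axolotl config used for the actual run; the lora_alpha value and the tokenization helper are assumptions, while the base model id is the one named in the README.

```python
# Illustrative sketch only: a PEFT LoRA setup mirroring the hyperparameter
# table above. This is NOT the Axolotl configuration used for the real run;
# lora_alpha and the tokenize() helper below are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_model_id = "Ichsan2895/Merak-7B-v3"  # base model named in the README

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForCausalLM.from_pretrained(base_model_id)

lora_config = LoraConfig(
    r=16,                        # "lora rank | 16" from the table
    lora_alpha=32,               # assumption: alpha is not listed in this section
    lora_dropout=0.05,           # "lora dropout | 0.05"
    target_modules=["q_proj", "v_proj", "k_proj", "o_proj"],  # as listed
    bias="none",
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# "cutoff length | 4096" corresponds to truncating tokenized examples;
# the "text" field name is a placeholder for whatever the dataset uses.
def tokenize(example):
    return tokenizer(example["text"], truncation=True, max_length=4096)
```

In the actual run, Axolotl expresses these same choices in its YAML config and adds DeepSpeed ZeRO-2 and FlashAttention on top, which this sketch does not attempt to reproduce.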