---
license: other
license_name: yi-license
license_link: LICENSE
datasets:
- adamo1139/rawrr_v1
tags:
- dpo
- qlora
- unsloth
---
Another QLoRA DPO training of Yi-34B-200K.

This time with sequence length 500, lora_r 16 and lora_alpha 32. I was able to squeeze that in using Unsloth; the script I used is in this repo. It definitely has a much stronger effect than my previous run, which used lora_r 4, lora_alpha 8 and sequence length 200, but I am not sure whether I overcooked it. I will try to train this on AEZAKMI v2 next.
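The hyperparameter change between the two runs can be sketched as a minimal comparison. Field names below follow common PEFT/Unsloth conventions and are illustrative only; the actual training script shipped in this repo is authoritative.

```python
# Hypothetical config comparison of the two QLoRA DPO runs described above.
previous_run = {"lora_r": 4, "lora_alpha": 8, "max_seq_length": 200}
current_run = {"lora_r": 16, "lora_alpha": 32, "max_seq_length": 500}

def lora_scaling(cfg):
    # PEFT scales the LoRA update by lora_alpha / lora_r.
    return cfg["lora_alpha"] / cfg["lora_r"]

# Both runs keep the same alpha/r scaling of 2.0, so the stronger effect
# comes from the 4x higher rank and the 2.5x longer sequence length.
print(lora_scaling(previous_run), lora_scaling(current_run))
```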

Credits to mlabonne (I used pieces of his Mistral fine-tuning script for dataset preparation) and to Daniel Han and Michael Han of the Unsloth AI team.

[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" alt="made with Unsloth" width="400" height="64"/>](https://github.com/unslothai/unsloth)