adamo1139 committed on
Commit 8248694
1 Parent(s): 91a3383

Update README.md

Files changed (1)
  1. README.md +15 -0
README.md CHANGED
@@ -2,4 +2,19 @@
  license: other
  license_name: yi-license
  license_link: LICENSE
+ datasets:
+ - adamo1139/rawrr_v1
+ tags:
+ - dpo
+ - qlora
+ - unsloth
  ---
+ Another QLoRA DPO training of Yi-34B-200K.
+ This time with sequence length 500, lora_r 16 and lora_alpha 32.
+ I was able to squeeze that in using Unsloth; the script I used is in this repo.
+ It definitely has a much stronger effect than my previous run (lora_r 4, lora_alpha 8, sequence length 200), but I am not sure whether I overcooked it.
+ I will try to train this on AEZAKMI v2 now.
+
+ Credits to mlabonne (I used pieces of his Mistral fine-tuning script for dataset preparation) and to Daniel Han and Michael Han (the Unsloth AI team).
+
+ [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" alt="made with Unsloth" width="400" height="64"/>](https://github.com/unslothai/unsloth)
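
The card names the key knobs (sequence length 500, lora_r 16, lora_alpha 32, the adamo1139/rawrr_v1 dataset) but the actual training script lives in the repo. Below is a minimal sketch of what such an Unsloth QLoRA DPO run could look like; the base checkpoint name, DPO beta, target modules, batch size, learning rate and other TrainingArguments are assumptions for illustration, not values taken from the card.

```python
# Hypothetical sketch of a QLoRA DPO run with Unsloth + TRL.
# Only seq length 500, lora_r 16 and lora_alpha 32 come from the card;
# everything else below is an assumed placeholder.
from unsloth import FastLanguageModel
from datasets import load_dataset
from transformers import TrainingArguments
from trl import DPOTrainer

max_seq_length = 500  # sequence length mentioned in the card

# Load the base model in 4-bit for QLoRA (checkpoint name is an assumption)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="01-ai/Yi-34B-200K",
    max_seq_length=max_seq_length,
    load_in_4bit=True,
)

# Attach LoRA adapters with the rank/alpha described in the card
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=32,
    lora_dropout=0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# DPO expects prompt/chosen/rejected columns; rawrr_v1 is the dataset named in the card
dataset = load_dataset("adamo1139/rawrr_v1", split="train")

trainer = DPOTrainer(
    model=model,
    ref_model=None,                 # with LoRA adapters TRL can build the reference implicitly
    beta=0.1,                       # assumed DPO temperature, not stated in the card
    train_dataset=dataset,
    tokenizer=tokenizer,
    max_length=max_seq_length,
    max_prompt_length=max_seq_length // 2,
    args=TrainingArguments(         # assumed optimizer/schedule settings
        per_device_train_batch_size=1,
        gradient_accumulation_steps=16,
        learning_rate=5e-5,
        num_train_epochs=1,
        bf16=True,
        output_dir="yi-34b-200k-dpo-out",
    ),
)
trainer.train()
```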