brucethemoose committed • Commit 57250a5 • Parent(s): 16da39e
Update README.md

README.md CHANGED
@@ -97,7 +97,7 @@ Dare Ties is also resulting in seemingly better, lower perplexity merges than a
 
 SUS Chat is not a 200K model, hence it was merged at a very low density to try and preserve Yi 200K's long context performance while still inheriting some of SUS's performance.
 
-Dolphin 200K was taken out of
+Dolphin 200K was taken out of this merge because it seems to be performing poorly for a 34B Dolphin model, like something went wrong during training?
 
 I chose not to include other finetunes because they aren't trained on the 200K base. If any other 200K finetunes pop up, let me know.
 ***
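For reference, a merge like the one described (DARE TIES with a low density for the non-200K model) can be sketched as a mergekit config. This is a minimal illustrative example, not the author's actual config: the model paths, weight, and density values below are assumptions, only the general shape (a `dare_ties` merge where SUS Chat gets a very low `density` against a Yi 200K base) reflects the README text.

```yaml
# Hypothetical mergekit config sketching the described DARE TIES merge.
# Paths and numeric values are placeholders, not taken from the commit.
models:
  - model: /path/to/Yi-34B-200K        # 200K base; carries the long-context behavior
  - model: /path/to/SUS-Chat-34B       # not a 200K model
    parameters:
      weight: 0.2
      density: 0.1                     # very low density, per the README's rationale
merge_method: dare_ties
base_model: /path/to/Yi-34B-200K
dtype: bfloat16
```

Keeping `density` low means only a small fraction of SUS Chat's delta weights are retained, which is how the README says Yi 200K's long-context performance is preserved while still inheriting some of SUS's gains.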