---
base_model: Sao10K/MN-BackyardAI-Party-12B-v1
language:
- en
license: cc-by-nc-4.0
tags:
- llama-cpp
- gguf-my-repo
---

# Triangle104/MN-BackyardAI-Party-12B-v1-Q4_K_M-GGUF
This model was converted to GGUF format from [`Sao10K/MN-BackyardAI-Party-12B-v1`](https://huggingface.co/Sao10K/MN-BackyardAI-Party-12B-v1) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
Refer to the [original model card](https://huggingface.co/Sao10K/MN-BackyardAI-Party-12B-v1) for more details on the model.

## Model Info
Trained with compute from Backyard.ai | Thanks to them and @dynafire for helping me out.

Trained on 2x A100 SXM 40GB as an 8-bit LoRA.

This is a group-chat roleplaying model, based on 12B-Lyra-v4a2, a variant of Lyra-v4 that is currently private.

It is trained on an entirely human-written dataset, drawn from forum / internet group roleplaying styles. The only LLM augmentation was applied to the character sheets, to fit them to the system prompt and to fit multiple character sheets within context.

This model is still capable of 1-on-1 roleplay, though I recommend using ChatML for that instead.

---

### Formatting

Training for the multi-character roleplaying format uses a variant of ChatML in which the ChatML tokens are replaced with [INST] blocks, formatted as shown below. Use this format to draw on more of that training.

```
[INST]system
System Prompt Here[/INST]
[INST]user
User's Yapping[/INST]
[INST]model
Model Reply[/INST]
```

Relevant!
- Turns do not need to respect the user -> model -> user order. Training is done with disjointed turns, including repeated turns, to simulate real group roleplay / chat scenarios with multiple users.
- Additional work may be required to fit this to your front-end.
- Ideally, character cards are all included in the turns; training is done with this in mind. Relevant information is further down this page.
- This is a Nemo model, so a lower temperature and a sprinkling of min_p help.
- This does require a lot of tinkering to fit within SillyTavern / other front-ends.

To get better performance in regular 1-on-1 roleplay or chat scenarios, use ChatML to draw on more of Lyra's performance.

```
<|im_start|>system
System Prompt Here.<|im_end|>
<|im_start|>user
User's Instructions<|im_end|>
<|im_start|>assistant
Model Response<|im_end|>
```

For best results, set both `<|im_end|>` and `[INST]` as stopping strings. Recommended temperature is below 1, with a min_p of at least 0.1.
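
As a concrete illustration, here is a minimal sketch of sending these settings to a local `llama-server` instance through its `/completion` endpoint. The host, port, and prompt text are assumptions; adjust them to your setup.

```python
import json
import urllib.request

# Assumption: llama-server is running locally on the default port 8080.
URL = "http://127.0.0.1:8080/completion"

payload = {
    # Prompt in the [INST] variant described above.
    "prompt": (
        "[INST]system\nSystem Prompt Here[/INST]\n"
        "[INST]user\nBallbuster Steve: Hello there.[/INST]\n"
        "[INST]model\n"
    ),
    "temperature": 0.8,                # below 1, as recommended
    "min_p": 0.1,                      # at least 0.1, as recommended
    "stop": ["[INST]", "<|im_end|>"],  # recommended stopping strings
    "n_predict": 256,
}

req = urllib.request.Request(
    URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["content"])
```
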
### Dataset Information

This dataset is made from a human RP forum source, trimmed down, augmented, and reformatted to fit.
- Each entry has a minimum of 6 turns to be included.
- The number of unique/main characters ranges from 2 to 7 per entry.
- Each conversation is kept as-is to preserve the quality and uniqueness of the human data.
- Only the added system prompt makes use of the given character sheets.

The following is the current character card / sheet format, augmented from the messy and non-uniform character sheets available. For best results, please reformat your current character data to the format shown below, or as close to it as possible.

- **Character Name**: 
- **Age**:
- **Race**:
- **Mageblood Type**: (if applicable)
- **Favored Magic Class**: (if applicable)
- **Previous Magic Training**: (if applicable)
- **Occupation/Profession**: (if applicable)
- **Appearance**: (if applicable)
- **Biography**: (if applicable)
- **Good Attributes**: (if applicable)
- **Bad Attributes**: (if applicable)
- **Equipment**: (if applicable)
- **Other Information**: (if applicable)

Here is an example based on the above format:

**Character Name**: Keri Wolf  
**Age**: 21  
**Race**: Vampire  
**Mageblood Type**: Hydromancy  
**Favored Magic Class**: Aqua  
**Previous Magic Training**: Novice  
**Occupation/Profession**: None specified  

**Appearance**:  
- Height: 5'9"  
- A wooden wolf necklace around her neck, contrasting with her pale skin  
- Three swords strapped to her waist  
- A tattoo of a thorn vine, her family crest, on her right arm  
- Normal eye color is red but changes based on her mood or the topic of conversation  
- Carries a hunk of wood and a carving knife for personal activities  

**Biography**:  
Keri Wolf grew up in a family of adopted siblings in Djarkel. She had a normal childhood, with her best friend Satori, and was taught basic self-defense by her father. Her brothers were considered troublemakers but remained close to her. On her 21st birthday, her family was slaughtered by a vampire nest, and she was bitten. This led to her developing vampiric traits and seeking answers at the college.

**Good Attributes**:  
- Easy-going  
- Observant  
- Helps those in trouble  
- Soft-hearted  
- Kind  
- Cool-headed  
- Good at getting out of difficult situations  
- Avoids violence  
- Gets along well with different people  
- Loves animals  

**Bad Attributes**:  
- Sunlight sensitivity  
- Hatred towards vampires outside the college  
- Keeps feelings in check, leading to dangerous outbursts  
- Cruel manner of speaking  
- Thirst for revenge  

**Equipment**:  
- Wooden wolf necklace  
- Three swords (one engraved with a rose, one engraved with her father's name, and one for decoration)  
- Carving knife  
- Hunk of wood  
- Stealth Ring  
- Knight's Shield  

**Other Information**:  
- Secret word: rebirth
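
As an aside, a small helper like this hypothetical sketch can render character data into the sheet format above, skipping optional fields that are absent. The field names mirror the template; nothing here is prescribed by the original card.

```python
# Hypothetical helper: render a character dict into the sheet format above.
# "(if applicable)" fields are simply skipped when absent.
FIELDS = [
    "Character Name", "Age", "Race", "Mageblood Type",
    "Favored Magic Class", "Previous Magic Training",
    "Occupation/Profession", "Appearance", "Biography",
    "Good Attributes", "Bad Attributes", "Equipment", "Other Information",
]

def render_sheet(char: dict) -> str:
    lines = []
    for field in FIELDS:
        value = char.get(field)
        if value is None:
            continue
        if isinstance(value, list):
            # Multi-line fields (attributes, equipment) become sub-bullets.
            lines.append(f"**{field}**:")
            lines.extend(f"- {item}" for item in value)
        else:
            lines.append(f"**{field}**: {value}")
    return "\n".join(lines)

print(render_sheet({
    "Character Name": "Keri Wolf",
    "Age": 21,
    "Race": "Vampire",
    "Equipment": ["Wooden wolf necklace", "Carving knife"],
}))
```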

The following system prompt is augmented from the available character sheets, or from details in the original dataset. Placeholder names are given as shown.

```
You are involved in a multi-character internet-style roleplaying session with a human user, who is playing as Ballbuster Steve. Do not generate dialogue for the user's character, Ballbuster Steve. Focus on the other characters.

[Human User]
Ballbuster Steve # {user}
Character Bio: [Steve's bio]

[Involved Characters]
Altair "Arty" Enzo # {char1}
Character Bio: [Arty's bio]
---
Sukuna Gojo # {char2}
Character Bio: [Sukuna's bio]
---

The roleplay begins now.
```
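
Below is a minimal, hypothetical sketch of assembling this system prompt from structured data. The function name and input shape are my own; the layout and the literal `{user}` / `{charN}` tags follow the template above.

```python
# Hypothetical sketch: assemble the multi-character system prompt
# from a user entry and a list of character entries.
def build_system_prompt(user: dict, chars: list[dict]) -> str:
    parts = [
        "You are involved in a multi-character internet-style roleplaying "
        f"session with a human user, who is playing as {user['name']}. "
        f"Do not generate dialogue for the user's character, {user['name']}. "
        "Focus on the other characters.",
        "",
        "[Human User]",
        f"{user['name']} # {{user}}",
        f"Character Bio: {user['bio']}",
        "",
        "[Involved Characters]",
    ]
    for i, char in enumerate(chars, start=1):
        parts.append(f"{char['name']} # {{char{i}}}")
        parts.append(f"Character Bio: {char['bio']}")
        parts.append("---")
    parts.append("")
    parts.append("The roleplay begins now.")
    return "\n".join(parts)

print(build_system_prompt(
    {"name": "Ballbuster Steve", "bio": "[Steve's bio]"},
    [{"name": 'Altair "Arty" Enzo', "bio": "[Arty's bio]"},
     {"name": "Sukuna Gojo", "bio": "[Sukuna's bio]"}],
))
```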

This is what some example turns look like; the extra newlines are only for readability.

```
[INST]user
Ballbuster Steve: Being the doorman at a nightclub, especially one as popular as LUSH... [/INST]

[INST]model
Altair "Arty" Enzo: While he was waiting for Jake to answer, Arty noticed from the corner of his eye... [/INST]

[INST]model
Sukuna Gojo: Nick was now out of his element; he just came off his portable radio app... [/INST]

[INST]user
Ballbuster Steve: Steve grabbed his black clutch from where it was stashed under the mixing desk... [/INST]
```

To make it easier, this is how I'd format responses for the backend:

```
<s>[INST]system
{system_prompt}[/INST]
[INST]user
{user}: {text}[/INST]
[INST]model
{char1}: {text}[/INST]
[INST]model
{char2}: {text}[/INST]
[INST]user
{user}: {text}[/INST]
[INST]model
{char1}: {text}[/INST]<|im_end|> # For Final Turn only. Alternatively, set <|im_end|> as a stopping string.
```
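
Putting the pieces together, this hypothetical sketch serializes a conversation into the backend format above. Leaving an open `[INST]model` turn at the end for the model to complete is my assumption about how generation is driven, not something the original card specifies.

```python
# Hypothetical sketch: serialize a conversation into the [INST]-variant
# backend format shown above. Turns are (role, speaker, text) tuples and,
# per the notes above, do not need to alternate user/model.
def build_prompt(system_prompt: str, turns: list[tuple[str, str, str]]) -> str:
    prompt = f"<s>[INST]system\n{system_prompt}[/INST]\n"
    for role, speaker, text in turns:  # role is "user" or "model"
        prompt += f"[INST]{role}\n{speaker}: {text}[/INST]\n"
    # Assumption: leave an open model turn for the backend to complete.
    prompt += "[INST]model\n"
    return prompt

print(build_prompt(
    "System Prompt Here",
    [
        ("user", "Ballbuster Steve", "Being the doorman at a nightclub..."),
        ("model", 'Altair "Arty" Enzo', "While he was waiting for Jake..."),
        ("model", "Sukuna Gojo", "Nick was now out of his element..."),
    ],
))
```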

### Current Issues

- Impersonation - This is a common side effect of pure human roleplaying data, unfortunately. Users do like writing the actions of others, though this is mostly limited to the end of a reply (a possible trimming workaround is sketched after this list).
- Varied Output Quality - A swipe should be enough? I only removed obviously bad entries; output quality varies thanks to the variety of human users involved.
- Character Detail Confusion in group chats - This rarely happens, but it usually occurs when there are too many main characters, when a bio is improperly formatted or separated, or when you are using an additional, complex system prompt.
- Random OOC / story-break moments may still exist despite my filtering of the data.
- Limited Dataset Size -> 4K varied samples ranging from 2-7 characters per entry. I'm looking to expand.
- Limited System Prompt? -> I'm trying to improve on this.
- Fantasy bias? -> Most of the entries are fantasy-based, after all.
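
Since impersonation tends to appear at the end of a reply, one possible workaround (not from the original card; the speaker-detection heuristic is an assumption) is to trim a generated reply at the first line where the model starts speaking for the user's character:

```python
# Hypothetical mitigation: cut a generated reply at the first line where
# the model begins writing for the user's character.
def trim_impersonation(reply: str, user_name: str) -> str:
    kept = []
    for line in reply.splitlines():
        if line.strip().startswith(f"{user_name}:"):
            break  # the model started speaking as the user; stop here
        kept.append(line)
    return "\n".join(kept).rstrip()

reply = 'Altair "Arty" Enzo: He waved.\nBallbuster Steve: Steve waved back.'
print(trim_impersonation(reply, "Ballbuster Steve"))
# -> Altair "Arty" Enzo: He waved.
```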

### Training Metrics

```
n_sample: 4000
n_gpu: 2
global batch size: 12
lora: bnb_8bit
no. epochs: 3
lr: 0.000004
lr_scheduler: cosine
deepspeed: zero2
```

---
## Use with llama.cpp
Install llama.cpp through brew (works on Mac and Linux):

```bash
brew install llama.cpp
```
Invoke the llama.cpp server or the CLI.

### CLI:
```bash
llama-cli --hf-repo Triangle104/MN-BackyardAI-Party-12B-v1-Q4_K_M-GGUF --hf-file mn-backyardai-party-12b-v1-q4_k_m.gguf -p "The meaning to life and the universe is"
```

### Server:
```bash
llama-server --hf-repo Triangle104/MN-BackyardAI-Party-12B-v1-Q4_K_M-GGUF --hf-file mn-backyardai-party-12b-v1-q4_k_m.gguf -c 2048
```

Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the llama.cpp repo.

Step 1: Clone llama.cpp from GitHub.
```bash
git clone https://github.com/ggerganov/llama.cpp
```

Step 2: Move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag along with other hardware-specific flags (e.g. `LLAMA_CUDA=1` for Nvidia GPUs on Linux).
```bash
cd llama.cpp && LLAMA_CURL=1 make
```

Step 3: Run inference through the main binary.
```bash
./llama-cli --hf-repo Triangle104/MN-BackyardAI-Party-12B-v1-Q4_K_M-GGUF --hf-file mn-backyardai-party-12b-v1-q4_k_m.gguf -p "The meaning to life and the universe is"
```
or 
```bash
./llama-server --hf-repo Triangle104/MN-BackyardAI-Party-12B-v1-Q4_K_M-GGUF --hf-file mn-backyardai-party-12b-v1-q4_k_m.gguf -c 2048
```