---
license: other
license_name: mistral-ai-research-licence
license_link: https://mistral.ai/licenses/MRL-0.1.md
base_model: []
library_name: transformers
tags:
- mergekit
- lumikabra-123B

---
# lumikabra-123B v0.4


<div style="width: auto; margin-left: auto; margin-right: auto; margin-bottom: 3cm">
<img src="https://huggingface.co/schnapper79/lumikabra-123B_v0.1/resolve/main/lumikabra.png" alt="Lumikabra" style="width: 100%; min-width: 400px; display: block; margin: auto;">
</div>

This is lumikabra. It's based on [Mistral-Large-Instruct-2407](https://huggingface.co/mistralai/Mistral-Large-Instruct-2407), merged with Magnum-v2-123B, Luminum-v0.1-123B and Tess-3-Mistral-Large-2-123B.

I shamelessly took this idea from [FluffyKaeloky](https://huggingface.co/FluffyKaeloky/Luminum-v0.1-123B). Like him, I always had trouble with each of the current large Mistral-based models.
Either they get repetitive, show too many GPTisms, or are too horny or too unhorny. RP and storytelling are always a matter of taste, and I found myself swiping too often for new answers or even fixing them when I missed a little spice or cleverness.

Luminum was a great improvement, mixing a lot of desired traits, but I still missed some spice, another sauce.
So I took Luminum, added Magnum again, and also added Tess for knowledge and structure.

This is the fourth iteration, with more of the Mistral base model in the mix. Like all lumikabra models so far, it tends to write pretty long and creative answers.

## Merge Details
### Merge Method

This model was merged using [mergekit](https://github.com/cg123/mergekit) with the della_linear merge method using mistralai_Mistral-Large-Instruct-2407 as a base.

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: /workspace/text-generation-webui/models/anthracite-org_magnum-v2-123b
    parameters:
      weight: 0.25
      density: 0.9
  - model: /workspace/text-generation-webui/models/FluffyKaeloky_Luminum-v0.1-123B
    parameters:
      weight: 0.25
      density: 0.9
  - model: /workspace/text-generation-webui/models/migtissera_Tess-3-Mistral-Large-2-123B
    parameters:
      weight: 0.3
      density: 0.9
merge_method: della_linear
base_model: /workspace/text-generation-webui/models/mistralai_Mistral-Large-Instruct-2407
parameters:
  epsilon: 0.05
  lambda: 1
  int8_mask: true
dtype: bfloat16
```
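
To give a rough feel for what `weight`, `density`, `epsilon` and `lambda` do in a `della_linear` merge, here is a toy sketch in plain Python on small lists of numbers. It is not mergekit's code. The function name `della_linear_toy`, the linear mapping from magnitude rank to keep probability, ranking over the whole list (rather than per tensor row), and the lack of weight normalization are all assumptions made for illustration.

```python
import random


def della_linear_toy(base, models, weights, density, epsilon, lam, seed=0):
    """Toy DELLA-style linear merge on flat lists of floats (illustration only)."""
    rng = random.Random(seed)
    n = len(base)
    merged_delta = [0.0] * n
    for params, weight in zip(models, weights):
        # Task vector: what this fine-tune changed relative to the base model.
        delta = [p - b for p, b in zip(params, base)]
        # Rank entries by magnitude (0 = smallest change, n - 1 = largest).
        order = sorted(range(n), key=lambda i: abs(delta[i]))
        rank = [0] * n
        for r, i in enumerate(order):
            rank[i] = r
        for i in range(n):
            # Assumed mapping: keep probability grows linearly with rank,
            # spanning [density - epsilon, density + epsilon], capped at 1.
            frac = rank[i] / (n - 1) if n > 1 else 0.5
            keep_p = min(1.0, density - epsilon + 2.0 * epsilon * frac)
            if rng.random() < keep_p:
                # Rescale survivors by 1 / keep_p so the expected delta is unchanged.
                merged_delta[i] += weight * delta[i] / keep_p
    # Scale the combined delta by lambda and add it back onto the base.
    return [b + lam * d for b, d in zip(base, merged_delta)]


if __name__ == "__main__":
    # Made-up toy values; only the merge parameters mirror the config above.
    base = [0.0, 1.0, -1.0, 0.5]
    magnum = [0.1, 1.2, -1.0, 0.4]
    luminum = [0.0, 0.9, -1.3, 0.5]
    tess = [0.2, 1.0, -0.8, 0.7]
    print(della_linear_toy(base, [magnum, luminum, tess],
                           [0.25, 0.25, 0.3], 0.9, 0.05, 1.0))
```

With `density: 0.9` and `epsilon: 0.05`, each changed parameter in this sketch survives with a probability between 0.85 and 0.95, with larger changes more likely to be kept. Setting `density` to 1 and `epsilon` to 0 turns the sketch into a plain weighted sum of the deltas on top of the base.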