metadata
base_model:
  - Sao10K/L3-8B-Niitama-v1
  - Nitral-AI/Hathor_Tahsin-L3-8B-v0.85
  - ArliAI/ArliAI-Llama-3-8B-Formax-v1.0
  - nothingiisreal/L3-8B-Celeste-V1.2
library_name: transformers
tags:
  - mergekit
  - merge


1/2 of the 13B models for Horizon Anteros. This merge was originally supposed to be only for that final model, but this guy is surprisingly competent. A tad jank, but very solid for what it is on its own. Still experimental.

Quants

OG Q8 GGUF by me.

GGUFs by mradermacher

Details & Recommended Settings

(Still testing; nothing here is finalized.)

Follows instructions fairly well for RP and eRP. Dramatic as fuck at times, depending on the scenario. Human dialogue and lots of it.

Fucking hates high temps. Please stick to the recommended settings or else it will break, and fast.

Rec. Settings:

Template: Model Default
Temperature: 1.24
Min P: 0.115
Repeat Penalty: 1.05
Repeat Penalty Tokens: 256
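
To make the Temperature and Min P values above a bit more concrete, here is a minimal pure-Python sketch of what those two samplers do to a token distribution. The function name `min_p_filter` and the toy logits are made up for illustration, and the sampler order (temperature first, then Min P) is an assumption: backends differ on this, so check how your frontend orders them.

```python
import math

def min_p_filter(logits, temperature=1.24, min_p=0.115):
    """Apply temperature, then drop tokens below min_p * (top token probability).

    Returns {token_index: renormalized_probability}.
    Assumption: temperature is applied before Min P; some backends do the reverse.
    """
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    # Softmax with the max subtracted for numerical stability.
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Min P keeps only tokens at least min_p times as likely as the best token.
    threshold = min_p * max(probs)
    kept = {i: p for i, p in enumerate(probs) if p >= threshold}
    norm = sum(kept.values())
    return {i: p / norm for i, p in kept.items()}

# Toy example with three candidate tokens (hypothetical logits).
kept = min_p_filter([2.0, 1.0, -3.0])
print(sorted(kept))  # [0, 1]: the unlikely third token is cut
```

A higher Min P cuts more of the long tail, which is presumably why it pairs with a temperature above 1 here.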

Models Merged & Merge Theory

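The following models were included in the merge:

Sao10K/L3-8B-Niitama-v1
Nitral-AI/Hathor_Tahsin-L3-8B-v0.85
ArliAI/ArliAI-Llama-3-8B-Formax-v1.0
nothingiisreal/L3-8B-Celeste-V1.2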

So you're not supposed to mix models with different trained context limits, but I did it anyway. Wanted the 'human' output of Celeste v1.2 while curbing the repetition and adding some backup from Niitama and Hathor Tahsin. Formax was included at the beginning (the early layers) for its instruction following.

Took a page out of @matchaaaaa's Chaifighter Latte and took out a slice of Celeste and Niitama in the center for smoothing out layer disparity. I realized while testing that, using that 'splice' method, you could theoretically make a pretty big model and then squish it down to streamline the layers. So, after much testing, I came up with the following merges.
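
The center splice in the config below blends the Celeste and Niitama slices with two mirrored weight lists. As a rough illustration (not mergekit's internals), this sketch pairs the two lists up and checks that they sum to 1 at every anchor point, so the blend fades from all Celeste to all Niitama. The helper name `crossfade_pairs` is hypothetical, and how mergekit interpolates the seven anchor values across the six layers in the slice is not reproduced here.

```python
# Weight lists copied from the dare_linear config for celeniit14-20.sl.
CELESTE_WEIGHTS = [1, 0.75, 0.625, 0.5, 0.375, 0.25, 0]
NIITAMA_WEIGHTS = [0, 0.25, 0.375, 0.5, 0.625, 0.75, 1]

def crossfade_pairs(a, b):
    """Pair up anchor weights and verify they are complementary (sum to 1)."""
    if len(a) != len(b):
        raise ValueError("weight lists must be the same length")
    pairs = list(zip(a, b))
    for wa, wb in pairs:
        if abs(wa + wb - 1.0) > 1e-9:
            raise ValueError(f"weights {wa} and {wb} do not sum to 1")
    return pairs

for celeste, niitama in crossfade_pairs(CELESTE_WEIGHTS, NIITAMA_WEIGHTS):
    print(f"Celeste {celeste:.3f} | Niitama {niitama:.3f}")
```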

Config

models:
slices:
- sources:
  - layer_range: [14, 20]
    model: nothingiisreal/L3-8B-Celeste-V1.2
parameters:
  int8_mask: true
merge_method: passthrough
dtype: bfloat16
name: celeste14-20.sl
---
models:
slices:
- sources:
  - layer_range: [14, 20]
    model: Sao10K/L3-8B-Niitama-v1
parameters:
  int8_mask: true
merge_method: passthrough
dtype: bfloat16
name: niitama14-20.sl
---
models: 
  - model: celeste14-20.sl
    parameters:
      weight: [1, 0.75, 0.625, 0.5, 0.375, 0.25, 0]
  - model: niitama14-20.sl
    parameters:
      weight: [0, 0.25, 0.375, 0.5, 0.625, 0.75, 1]
merge_method: dare_linear
base_model: celeste14-20.sl
dtype: bfloat16
name: celeniit14-20.sl
---
models:
slices:
- sources:
  - layer_range: [0, 4]
    model: Nitral-AI/Hathor_Tahsin-L3-8B-v0.85
- sources:
  - layer_range: [1, 5]
    model: ArliAI/ArliAI-Llama-3-8B-Formax-v1.0
- sources:
  - layer_range: [4, 8]
    model: Nitral-AI/Hathor_Tahsin-L3-8B-v0.85
- sources:
  - layer_range: [5, 9]
    model: ArliAI/ArliAI-Llama-3-8B-Formax-v1.0
- sources:
  - layer_range: [8, 10]
    model: Sao10K/L3-8B-Niitama-v1
- sources:
  - layer_range: [6, 14]
    model: nothingiisreal/L3-8B-Celeste-V1.2
- sources:
  - layer_range: [0, 6]
    model: celeniit14-20.sl
- sources:
  - layer_range: [20, 23]
    model: Sao10K/L3-8B-Niitama-v1
- sources:
  - layer_range: [22, 26]
    model: Nitral-AI/Hathor_Tahsin-L3-8B-v0.85
- sources:
  - layer_range: [22, 28]
    model: nothingiisreal/L3-8B-Celeste-V1.2
- sources:
  - layer_range: [25, 27]
    model: Nitral-AI/Hathor_Tahsin-L3-8B-v0.85
- sources:
  - layer_range: [28, 30]
    model: Sao10K/L3-8B-Niitama-v1
- sources:
  - layer_range: [25, 32]
    model: nothingiisreal/L3-8B-Celeste-V1.2
parameters:
  int8_mask: true
merge_method: passthrough
dtype: bfloat16
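
For a quick sanity check on the "13B" figure, the layer ranges in the final passthrough config can be tallied up. This is a back-of-the-envelope sketch: it assumes every source model uses the standard Llama 3 8B dimensions (hidden size 4096, MLP size 14336, 8 KV heads of 128 dims, vocab 128256, untied embeddings), and the helper names are made up for illustration.

```python
# (start, end) layer ranges from the final passthrough config, in order.
# The end index is exclusive, so (0, 4) contributes 4 layers.
FINAL_SLICES = [
    (0, 4), (1, 5), (4, 8), (5, 9), (8, 10), (6, 14), (0, 6),
    (20, 23), (22, 26), (22, 28), (25, 27), (28, 30), (25, 32),
]

# Assumed Llama 3 8B dimensions (not stated in this README).
HIDDEN = 4096
INTERMEDIATE = 14336
KV_DIM = 1024   # 8 KV heads x 128 dims per head
VOCAB = 128256

def total_layers(slices):
    """Count the decoder layers stacked by a list of (start, end) ranges."""
    return sum(end - start for start, end in slices)

def approx_params(n_layers):
    """Rough parameter count for a Llama-3-style model with n_layers layers."""
    attention = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * KV_DIM  # q, o + k, v projections
    mlp = 3 * HIDDEN * INTERMEDIATE                        # gate, up, down projections
    norms = 2 * HIDDEN                                     # two RMSNorms per layer
    embeddings = 2 * VOCAB * HIDDEN                        # input embedding + output head
    return n_layers * (attention + mlp + norms) + embeddings + HIDDEN  # + final norm

layers = total_layers(FINAL_SLICES)
print(layers, round(approx_params(layers) / 1e9, 2))  # 56 13.26
```

So the stack comes out to 56 layers versus 32 in each source model, which lands at roughly 13B parameters under those assumptions.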