bhenrym14 committed bce0dd5 (parent: 195500b)

Update README.md

Files changed (1):
  1. README.md +1 -1
README.md CHANGED
@@ -53,7 +53,7 @@ The results follow.
 
 - The pretraining successfuly ameliorates the rise in perplexity between 8192 and 16284. Not only that, it outperforms it everywhere.
 - For contexts shorter than the original 2048, the original model has lower perplexity. This is consistent with the literature. The gap shrinks with context length, with the original becoming incoherent beyond this point.
-- This comparison isn't perfect. I did use the 1.4.1 dataset, the quantization method is slightly different, and the finetuning method is different (QLoRA vs full). In short, there are other potentially influential variables responsible for these performance differences.
+- This comparison isn't perfect. I did use the 1.4.1 dataset and the finetuning method is different (QLoRA vs full). In short, there are other potentially influential variables responsible for these performance differences.
 
 ## Quantization
 
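The README text above compares perplexity across context lengths. For readers unfamiliar with how such numbers are typically produced, the following is a minimal sketch of a perplexity-vs-context-length evaluation using Hugging Face transformers. It is not the author's evaluation script: the model ID, the wikitext-2 evaluation text, and the non-overlapping chunking strategy are all assumptions made for illustration.

```python
# Hypothetical sketch: perplexity at several context lengths with a causal LM.
# MODEL_ID, dataset, and chunking are illustrative assumptions, not from the commit.
import math

import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "your-org/your-16k-llama"  # placeholder repo id
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.float16).to(DEVICE)
model.eval()

# Concatenate a long evaluation text into one token stream.
text = "\n\n".join(load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"])
tokens = tokenizer(text, return_tensors="pt").input_ids.to(DEVICE)


def perplexity(input_ids: torch.Tensor, ctx_len: int, n_chunks: int = 16) -> float:
    """Average perplexity over up to n_chunks non-overlapping windows of length ctx_len."""
    nlls = []
    for i in range(n_chunks):
        chunk = input_ids[:, i * ctx_len : (i + 1) * ctx_len]
        if chunk.shape[1] < ctx_len:
            break
        with torch.no_grad():
            # With labels == input_ids, .loss is the mean next-token NLL over the window.
            loss = model(chunk, labels=chunk).loss
        nlls.append(loss.float())
    return math.exp(torch.stack(nlls).mean().item())


for ctx in (512, 1024, 2048, 4096, 8192, 16384):
    print(f"context {ctx:>6}: ppl = {perplexity(tokens, ctx):.2f}")
```

A curve built from numbers like these is what the bullets refer to: a base 2048-context model will usually win at short windows, while an extended-context finetune should keep perplexity from blowing up as the window grows toward 16k.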