bhenrym14 committed bce0dd5 (parent: 195500b)

Update README.md

Files changed (1):
  1. README.md +1 -1
README.md CHANGED
@@ -53,7 +53,7 @@ The results follow.
 
 - The pretraining successfuly ameliorates the rise in perplexity between 8192 and 16284. Not only that, it outperforms it everywhere.
 - For contexts shorter than the original 2048, the original model has lower perplexity. This is consistent with the literature. The gap shrinks with context length, with the original becoming incoherent beyond this point.
-- This comparison isn't perfect. I did use the 1.4.1 dataset, the quantization method is slightly different, and the finetuning method is different (QLoRA vs full). In short, there are other potentially influential variables responsible for these performance differences.
+- This comparison isn't perfect. I did use the 1.4.1 dataset and the finetuning method is different (QLoRA vs full). In short, there are other potentially influential variables responsible for these performance differences.
 
 ## Quantization
 
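The README text above compares perplexity across context lengths. For readers unfamiliar with how such numbers are typically produced, the following is a minimal sketch of a perplexity-vs-context-length evaluation using Hugging Face transformers. It is not the author's evaluation script: the model ID, the wikitext-2 evaluation text, and the non-overlapping chunking strategy are all assumptions made for illustration.

```python
# Hypothetical sketch: perplexity at several context lengths with a causal LM.
# MODEL_ID, dataset, and chunking are illustrative assumptions, not from the commit.
import math

import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "your-org/your-16k-llama"  # placeholder repo id
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.float16).to(DEVICE)
model.eval()

# Concatenate a long evaluation text into one token stream.
text = "\n\n".join(load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"])
tokens = tokenizer(text, return_tensors="pt").input_ids.to(DEVICE)


def perplexity(input_ids: torch.Tensor, ctx_len: int, n_chunks: int = 16) -> float:
    """Average perplexity over up to n_chunks non-overlapping windows of length ctx_len."""
    nlls = []
    for i in range(n_chunks):
        chunk = input_ids[:, i * ctx_len : (i + 1) * ctx_len]
        if chunk.shape[1] < ctx_len:
            break
        with torch.no_grad():
            # With labels == input_ids, .loss is the mean next-token NLL over the window.
            loss = model(chunk, labels=chunk).loss
        nlls.append(loss.float())
    return math.exp(torch.stack(nlls).mean().item())


for ctx in (512, 1024, 2048, 4096, 8192, 16384):
    print(f"context {ctx:>6}: ppl = {perplexity(tokens, ctx):.2f}")
```

A curve built from numbers like these is what the bullets refer to: a base 2048-context model will usually win at short windows, while an extended-context finetune should keep perplexity from blowing up as the window grows toward 16k.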