brucethemoose committed
Commit 396c5dd
1 Parent(s): bdb2e3d

Update README.md

Files changed (1): README.md +1 -1
README.md CHANGED
@@ -42,7 +42,7 @@ Just a fiction oriented 4bpw exl2 quantization of https://huggingface.co/jondurb
 
 Quantized on 300K tokens of two Vicuna-format chats, a sci-fi story and a fiction story at long context. This should yield better storywriting performance than the default exl2 quantization.
 
-
+Just ask if anyone wants sizes other than 4bpw, for more/less context or smaller GPUs.
 ***
 ## Running
 Being a Yi model, try running a lower temperature with ~0.05 MinP, a little repetition penalty, maybe mirostat with a low tau, and no other samplers. Yi tends to run "hot" by default.
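As a concrete starting point, the sampler advice in the Running section might look like the preset below. The key names follow common exllamav2/text-generation-webui conventions and the specific values are illustrative assumptions, not tuned recommendations; exact parameter names vary by frontend.

```python
# Illustrative sampler preset for a Yi-based model, following the README's advice.
# Values are assumed starting points; adjust per frontend and taste.
yi_sampler_preset = {
    "temperature": 0.8,          # lower than the usual ~1.0, since Yi runs "hot"
    "min_p": 0.05,               # the suggested ~0.05 MinP cutoff
    "repetition_penalty": 1.05,  # "a little" repetition penalty
    "mirostat_mode": 2,          # optional: mirostat...
    "mirostat_tau": 3.0,         # ...with a low tau
    "top_p": 1.0,                # other samplers disabled
    "top_k": 0,
}
```

Disabling top-p/top-k while keeping MinP means the cutoff scales with the model's own confidence, which tends to suit "hot" models better than a fixed nucleus threshold.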