---
license: other
language:
- en
pipeline_tag: text-generation
inference: false
tags:
- transformers
- gguf
- imatrix
- Silicon-Maid-7B
---
Quantizations of https://huggingface.co/SanjiWatsuki/Silicon-Maid-7B

### Experiment

Quants **ending in "_X"** are experimental quants. They are the same as the normal quants, except that their token embedding weights are set to Q8_0. The exceptions are Q6_K and Q8_0, whose token embedding weights are set to F16. The change makes these experimental quants larger but should, in theory, improve performance.

List of experimental quants:

* Q2_K_X
* Q4_K_M_X
* Q5_K_M_X
* Q6_K_X
* Q8_0_X

---

### Inference Clients/UIs

* [llama.cpp](https://github.com/ggerganov/llama.cpp)
* [JanAI](https://github.com/janhq/jan)
* [KoboldCPP](https://github.com/LostRuins/koboldcpp)
* [text-generation-webui](https://github.com/oobabooga/text-generation-webui)
* [ollama](https://github.com/ollama/ollama)

---

# From original readme

Silicon-Maid-7B is another model targeted at being both strong at RP **and** being a smart cookie that can follow character cards very well. As of right now, Silicon-Maid-7B outscores both of my previous 7B RP models in my RP benchmark and I have been impressed by this model's creativity. It is suitable for RP/ERP and general use.

### Prompt Template (Alpaca)

I found the best SillyTavern results from using the Noromaid template, but please try other templates! Let me know if you find anything good.

SillyTavern config files: [Context](https://files.catbox.moe/ifmhai.json), [Instruct](https://files.catbox.moe/ttw1l9.json).

Additionally, here is my highly recommended [Text Completion preset](https://huggingface.co/SanjiWatsuki/Loyal-Macaroni-Maid-7B/blob/main/Characters/MinP.json). You can tweak this by raising temperature or lowering min p to boost creativity, or by raising min p to increase stability. You shouldn't need to touch anything else!

```
Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{prompt}

### Response:
```
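
If you are calling the model from your own script rather than through SillyTavern, the template above can be filled in with a few lines of Python. This is only a minimal sketch: the helper name `build_alpaca_prompt` is made up for illustration, and the exact whitespace (blank lines between sections, a trailing newline after `### Response:`) is an assumption based on the common Alpaca layout, so adjust it if your client expects something different.

```python
# Alpaca-style template as shown in this README.
# Assumption: sections are separated by blank lines and the prompt
# ends with a newline after "### Response:".
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{prompt}\n\n"
    "### Response:\n"
)


def build_alpaca_prompt(instruction: str) -> str:
    """Hypothetical helper: insert one instruction into the template."""
    # Strip surrounding whitespace so the section spacing stays consistent.
    return ALPACA_TEMPLATE.format(prompt=instruction.strip())


if __name__ == "__main__":
    print(build_alpaca_prompt("Introduce yourself in character as a ship's cook."))
```

The resulting string can be passed as the raw prompt to any of the clients listed above that accept plain text completion input.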