---
license: mit
language:
- en
pipeline_tag: text-generation
---

# dolly-v2-12b-q4 Model Card

[dolly-v2-12b](https://huggingface.co/databricks/dolly-v2-12b) converted to GGML format and quantized to 4-bit using [cformers](https://github.com/NolanoOrg/cformers).

## Running the model

[This fork](https://github.com/raymondhs/cformers) includes a modification that adds Dolly to the model list.

```python
from interface import AutoInference as AI

# Download and load the 4-bit quantized model
ai = AI("databricks/dolly-v2-12b")

# Dolly's instruction-following prompt format
prompt_template = """Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Response:
"""

instruction = "Explain to me the difference between nuclear fission and fusion."

# Generate up to 100 tokens for the formatted prompt
x = ai.generate(prompt_template.format(instruction=instruction), num_tokens_to_generate=100)
print(x['token_str'])
```
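For reference, this is the exact prompt string the snippet above sends to the model once `format` is applied. The sketch below reproduces only that string construction in pure Python, so it can be checked without downloading the model:

```python
# Rebuild the prompt exactly as in the inference snippet above,
# without loading the model itself.
prompt_template = """Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Response:
"""

instruction = "Explain to me the difference between nuclear fission and fusion."
prompt = prompt_template.format(instruction=instruction)

# The model's completion is expected to follow the "### Response:" marker.
print(prompt)
```

Keeping this exact layout (including the `### Instruction:` and `### Response:` markers) matters, because dolly-v2 was fine-tuned on instruction data in this format.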