---
license: cc-by-nc-4.0
---

# 42dot_LLM-SFT-1.3B_GGUF #

* Model creator: [42dot](https://huggingface.co/42dot)
* Original model: [42dot_LLM-SFT-1.3B](https://huggingface.co/42dot/42dot_LLM-SFT-1.3B)

## Description ##

This repository contains the GGUF conversion and the most relevant quantizations of 42dot's [42dot_LLM-SFT-1.3B](https://huggingface.co/42dot/42dot_LLM-SFT-1.3B) model - ready to be used with [llama.cpp](https://github.com/ggerganov/llama.cpp) and similar applications.

## Files ##

In order to allow for fine-tuning (the model has the required LLaMA architecture), the original GGUF conversion has been made available:

* [42dot_LLM-SFT-1.3B.gguf](https://huggingface.co/rozek/42dot_LLM-SFT-1.3B_GGUF/blob/main/42dot_LLM-SFT-1.3B.gguf)

From this file, the following quantizations were derived:

* [42dot_LLM-SFT-1.3B-Q4_K_M](https://huggingface.co/rozek/42dot_LLM-SFT-1.3B_GGUF/blob/main/42dot_LLM-SFT-1.3B_Q4_K_M.gguf)
* [42dot_LLM-SFT-1.3B-Q5_K_M](https://huggingface.co/rozek/42dot_LLM-SFT-1.3B_GGUF/blob/main/42dot_LLM-SFT-1.3B_Q5_K_M.gguf)
* [42dot_LLM-SFT-1.3B-Q6_K](https://huggingface.co/rozek/42dot_LLM-SFT-1.3B_GGUF/blob/main/42dot_LLM-SFT-1.3B_Q6_K.gguf)
* [42dot_LLM-SFT-1.3B-Q8_0](https://huggingface.co/rozek/42dot_LLM-SFT-1.3B_GGUF/blob/main/42dot_LLM-SFT-1.3B_Q8_0.gguf)

(tell me if you need more)

## Usage Details ##

Any technical details can be found on the [original model card](https://huggingface.co/42dot/42dot_LLM-SFT-1.3B). The most important ones for using this model are:

* the context length is 4096 tokens
* there does not seem to be a specific prompt structure - just provide the text you want to be completed

### Text Completion with LLaMA.cpp ###

For simple inferencing, use a command similar to

```
./main -m 42dot_LLM-SFT-1.3B_Q8_0.gguf --temp 0 --top-k 4 --prompt "who was Joseph Weizenbaum?"
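# note: recent llama.cpp builds ship this binary as "llama-cli" instead of
# "main" - adjust the name to match your build. For a chat-style session with
# this SFT model, interactive mode may be used instead of a one-shot prompt;
# "--ctx-size" should not exceed the model's 4096-token context limit:
./main -m 42dot_LLM-SFT-1.3B_Q8_0.gguf --ctx-size 4096 --temp 0 --top-k 4 -i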
```

### Text Tokenization with LLaMA.cpp ###

To get a list of tokens, use a command similar to

```
./tokenize -m 42dot_LLM-SFT-1.3B_Q8_0.gguf --prompt "who was Joseph Weizenbaum?"
```

### Embeddings Calculation with LLaMA.cpp ###

Text embeddings are calculated with a command similar to

```
./embedding -m 42dot_LLM-SFT-1.3B_Q8_0.gguf --prompt "who was Joseph Weizenbaum?"
```

## License ##

The original model "_is licensed under the Creative Commons Attribution-NonCommercial 4.0 (CC BY-NC 4.0)_" - for that reason, the same license was also chosen for the conversions found in this repository.

So, in order to be fair and give credit where it is due:

* the original model was created and published by [42dot](https://huggingface.co/42dot)
* besides quantization, no changes were applied to the model itself
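As a final note, individual files from this repository can be fetched without cloning the whole repo by using the `huggingface-cli` tool from the `huggingface_hub` Python package (this assumes the package has been installed, e.g. with `pip install huggingface_hub`; the file name shown is the Q8_0 quantization from the list above):

```shell
# download only the Q8_0 quantization into the current directory
huggingface-cli download rozek/42dot_LLM-SFT-1.3B_GGUF \
  42dot_LLM-SFT-1.3B_Q8_0.gguf \
  --local-dir .
```

Plain `curl` or the "download" button on each file's page in the Hugging Face web UI work just as well.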