
Description

GGUF-format model files for this project.

From @bogdan1: Llama-2-7b-base fine-tuned on the Chitanka dataset and on a dataset of scraped news comments, mostly from 2022–2023. Many thanks!

About GGUF

Introduction:

GGUF was introduced by the llama.cpp team on August 21st, 2023, as a replacement for GGML, which is no longer supported. GGUF is a successor file format to GGML, GGMF, and GGJT. It is designed to provide a comprehensive solution for model loading, ensuring unambiguous data representation while offering extensibility to accommodate future enhancements. GGUF eliminates the need for disruptive changes, introduces support for various non-llama models such as falcon, rwkv, and bloom, and simplifies configuration settings by automating prompt format adjustments.

Key Features:

  1. No More Breaking Changes: GGUF is engineered to prevent compatibility issues with older models, ensuring a seamless transition from previous file formats like GGML, GGMF, and GGJT.

  2. Support for Non-Llama Models: GGUF extends its compatibility to a wide range of models beyond llamas, including falcon, rwkv, bloom, and more.

  3. Streamlined Configuration: Say goodbye to complex settings like rope-freq-base, rope-freq-scale, gqa, and rms-norm-eps. GGUF simplifies the configuration process, making it more user-friendly.

  4. Automatic Prompt Format: GGUF introduces the ability to automatically set prompt formats, reducing the need for manual adjustments.

  5. Extensibility: GGUF is designed to accommodate future updates and enhancements, ensuring long-term compatibility and adaptability.

  6. Enhanced Tokenization: GGUF features improved tokenization code, including support for special tokens, which enhances overall performance, especially for models using new special tokens and custom prompt templates.
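The streamlined configuration and prompt-format features above rest on GGUF being self-describing: settings travel inside the file as key/value metadata rather than as command-line flags. As a rough sketch of how one string-valued metadata entry is laid out (field widths per the GGUF spec; the key name below is only illustrative):

```python
import struct

GGUF_TYPE_STRING = 8  # string entry in the GGUF metadata value-type enum

def encode_string_kv(key: str, value: str) -> bytes:
    """Lay out one string-valued metadata pair the way GGUF does:
    uint64 key length, key bytes, uint32 value type,
    uint64 value length, value bytes (all little-endian)."""
    k = key.encode("utf-8")
    v = value.encode("utf-8")
    return (struct.pack("<Q", len(k)) + k
            + struct.pack("<I", GGUF_TYPE_STRING)
            + struct.pack("<Q", len(v)) + v)

def decode_string_kv(buf: bytes) -> tuple:
    """Decode a single pair produced by encode_string_kv."""
    klen = struct.unpack_from("<Q", buf, 0)[0]
    key = buf[8:8 + klen].decode("utf-8")
    off = 8 + klen
    vtype = struct.unpack_from("<I", buf, off)[0]
    if vtype != GGUF_TYPE_STRING:
        raise ValueError("not a string-valued entry")
    off += 4
    vlen = struct.unpack_from("<Q", buf, off)[0]
    off += 8
    value = buf[off:off + vlen].decode("utf-8")
    return key, value

blob = encode_string_kv("general.architecture", "llama")
print(decode_string_kv(blob))  # → ('general.architecture', 'llama')
```

Because keys like the architecture name and tokenizer settings are read from the file itself, a loader does not need them supplied manually — which is what makes flags like rope-freq-base unnecessary.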

Supported Clients and Libraries:

GGUF is supported by a variety of clients and libraries, making it accessible and versatile for different use cases:

  1. llama.cpp
  2. text-generation-webui
  3. KoboldCpp
  4. LM Studio
  5. LoLLMS Web UI
  6. ctransformers
  7. llama-cpp-python
  8. candle
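All of the clients above begin by parsing the same fixed GGUF header, which is what makes the format portable across them. A minimal, self-contained sketch of that parse (layout per the GGUF spec: 4-byte magic, uint32 version, uint64 tensor count, uint64 metadata count; the header bytes below are fabricated for illustration):

```python
import struct

GGUF_MAGIC = b"GGUF"

def read_gguf_header(buf: bytes):
    """Parse the fixed-size GGUF header: 4-byte magic, uint32 version,
    uint64 tensor count, uint64 metadata key/value count (little-endian)."""
    if buf[:4] != GGUF_MAGIC:
        raise ValueError("not a GGUF file")
    version, tensor_count, kv_count = struct.unpack_from("<IQQ", buf, 4)
    return version, tensor_count, kv_count

# Fabricated header: version 3, 291 tensors, 24 metadata entries.
fake = GGUF_MAGIC + struct.pack("<IQQ", 3, 291, 24)
print(read_gguf_header(fake))  # → (3, 291, 24)
```

After the header, each client reads the metadata entries and tensor descriptors that follow; only the quantized tensor data itself differs between the 4-bit, 5-bit, 6-bit, and 8-bit files.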
Model size: 6.74B params
Architecture: llama

Available quantizations: 4-bit, 5-bit, 6-bit, 8-bit
