myrkur/paya

myrkur committed on May 27

Commit 0331320 • 1 Parent(s): 4ef081a

Update README.md

Files changed (1)

README.md +66 -1

@@ -9,4 +9,69 @@ pipeline_tag: text-generation
 # Paya (aya 23 8B Instruction Tuned on Farsi)
-<a href="https://ibb.co/fHmCngh"><img src="https://i.ibb.co/jD7LWNc/paya.png" alt="paya" border="0"></a>

 # Paya (aya 23 8B Instruction Tuned on Farsi)
+<a href="https://ibb.co/fHmCngh"><img src="https://i.ibb.co/jD7LWNc/paya.png" alt="paya" border="0"></a>
+Welcome to PAYA, a Persian text generation model built on Aya 23 8B, a multilingual language model. PAYA was trained with supervised fine-tuning, using the DoRA method for parameter-efficient training on Persian datasets, primarily the [persian-alpaca-deep-clean](https://huggingface.co/datasets/myrkur/persian-alpaca-deep-clean) dataset.
+## Features
+- **Persian Text Generation**: Generates coherent, contextually relevant Persian text from chat-style prompts.
+- **Efficient Fine-Tuning**: Uses the DoRA method for parameter-efficient fine-tuning on Persian datasets.
+- **Optimized Tokenization**: The model's tokenizer represents Persian words accurately, which improves the quality of the generated text.
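The README names DoRA without showing how it works. Assuming it refers to weight-decomposed low-rank adaptation, the idea is to split each pretrained weight matrix into a per-column magnitude and a direction, and to update the direction with a low-rank product. The sketch below illustrates that decomposition in plain Python on a tiny matrix; the function names and values are hypothetical and this is not the author's training code.

```python
import math

def column_norms(W):
    """L2 norm of each column of a matrix stored as a list of rows."""
    return [math.sqrt(sum(row[j] ** 2 for row in W)) for j in range(len(W[0]))]

def matmul(B, A):
    """Plain matrix product of B (d x r) and A (r x k)."""
    return [
        [sum(B[i][k] * A[k][j] for k in range(len(A))) for j in range(len(A[0]))]
        for i in range(len(B))
    ]

def dora_weight(W0, B, A, m):
    """Adapted weight: m * (W0 + B A) / ||W0 + B A||, column by column."""
    delta = matmul(B, A)
    V = [[W0[i][j] + delta[i][j] for j in range(len(W0[0]))] for i in range(len(W0))]
    norms = column_norms(V)
    return [[m[j] * V[i][j] / norms[j] for j in range(len(V[0]))] for i in range(len(V))]

W0 = [[3.0, 0.0], [4.0, 2.0]]   # frozen pretrained weight (toy values)
A = [[0.5, -0.5]]               # rank-1 factor, trainable
B_init = [[0.0], [0.0]]         # trainable, starts at zero as in LoRA
m = column_norms(W0)            # trainable magnitude, initialised from W0

W_init = dora_weight(W0, B_init, A, m)            # equals W0 before training
W_step = dora_weight(W0, [[1.0], [0.0]], A, m)    # direction changed by B A
print(W_init)
print(column_norms(W_step))
```

Before any training step the adapted weight matches the pretrained one, and after an update the column norms are still set by `m`, so magnitude and direction are learned separately.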
+## Usage
+You can quickly get started with PAYA using the following sample code:
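+```python
+import torch
+import transformers
+# Load the model in bfloat16; device_map="auto" places it on the available devices.
+model_id = "myrkur/paya"
+pipeline = transformers.pipeline(
+    "text-generation",
+    model=model_id,
+    model_kwargs={"torch_dtype": torch.bfloat16},
+    device_map="auto",
+)
+# A single-turn conversation. The prompt is Persian for "Is knowledge better, or wealth?"
+messages = [
+    {"role": "user", "content": "علم بهتر است یا ثروت؟"},
+]
+# Render the conversation with the model's chat template, ending at the start of the reply turn.
+prompt = pipeline.tokenizer.apply_chat_template(
+    messages,
+    tokenize=False,
+    add_generation_prompt=True,
+)
+# Stop generation at the tokenizer's end-of-sequence token.
+terminators = [
+    pipeline.tokenizer.eos_token_id,
+]
+outputs = pipeline(
+    prompt,
+    max_new_tokens=512,
+    eos_token_id=terminators,
+    do_sample=True,
+    temperature=0.4,
+    top_p=0.9,
+    repetition_penalty=1.1,
+)
+# The pipeline returns the prompt followed by the completion; print only the completion.
+print(outputs[0]["generated_text"][len(prompt):])
+```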
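The `temperature` and `top_p` values passed to the pipeline control how the next token is sampled. As a rough illustration of what those two settings conventionally do, the sketch below applies temperature scaling and nucleus (top-p) filtering to a toy list of logits. `top_p_filter` is a hypothetical helper written for this example; it is a simplification and not the transformers implementation.

```python
import math

def top_p_filter(logits, temperature=0.4, top_p=0.9):
    """Return {token_index: probability} for the tokens kept by top-p sampling."""
    # Temperature scaling: values below 1.0 sharpen the distribution.
    scaled = [x / temperature for x in logits]
    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep the most likely tokens until their cumulative mass reaches top_p.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in ranked:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    # Renormalise over the kept tokens so their probabilities sum to 1.
    return {i: probs[i] / mass for i in kept}

toy_logits = [2.0, 1.0, 0.5, -1.0]
sharp = top_p_filter(toy_logits, temperature=0.4, top_p=0.9)
flat = top_p_filter(toy_logits, temperature=1.0, top_p=0.9)
print(sorted(sharp), sorted(flat))
```

With these toy logits, the lower temperature concentrates enough probability on the top token that fewer candidates survive the filter than at `temperature=1.0`, which is why a low temperature tends to give more conservative output.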
+## Why PAYA?
+PAYA's main strength is its tokenization, which accurately captures the nuances of the Persian language. Its fine-tuned parameters and efficient training methodology further improve results on text generation tasks.
+## Contributions
+Contributions to PAYA are welcome! Whether you enhance the model's capabilities, improve it on specific tasks, or evaluate its performance, your work can help advance Persian natural language processing.
+## Contact
+For questions or further information, please contact:
+- Amir Masoud Ahmadi: [amirmasoud.ahkol@gmail.com](mailto:amirmasoud.ahkol@gmail.com)