DeathReaper0965 committed
Commit 27eb9d7 (1 parent: 786454c)

Update README.md

Files changed (1): README.md (+4, -3)
README.md CHANGED
@@ -43,10 +43,11 @@ inference:
---

# Flan-T5 (base-sized) Dialogue Summarization with reduced toxicity using RLAIF
- This model is a fine-tuned [Flan-T5 model](https://huggingface.co/google/flan-t5-base) on the [SAMSUM](https://huggingface.co/datasets/samsum) dataset.
- The Base Model(Flan-T5) is based on Pre-trained T5 (Raffel et al., 2020) and fine-tuned with instructions for better zero-shot and few-shot performance <br>
+ This model is a [Flan-T5 model](https://huggingface.co/google/flan-t5-base) fine-tuned on the [SAMSUM](https://huggingface.co/datasets/samsum) dataset. <br>
+ It is further fine-tuned using Reinforcement Learning from AI Feedback (RLAIF). <br>
+ Anthropic's Constitutional AI [paper](https://arxiv.org/abs/2212.08073) from 2022 provides some amazing insights into how RLAIF can be leveraged. Do check it out if interested! <br>

- Our Model is fine-tuned specifically on a single downstream task of Dialogue Summarization on the above mentioned dataset with a primary objective of reduced toxicity while generating summaries.
+ More specifically, I've fine-tuned this model on the single downstream task of Dialogue Summarization on the above-mentioned dataset, with the primary objective of reduced toxicity in the generated summaries.

## Model description
This model has the same architecture and parameters as its base model. Please refer to this [link](https://arxiv.org/abs/2210.11416) for more details about the model.
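
The card only references RLAIF at a high level. As a rough illustration of how an AI feedback signal (here, a toxicity classifier) can be turned into a scalar reward for RL fine-tuning, here is a minimal sketch. The classifier checkpoint (`unitary/toxic-bert`) and the reward shaping are assumptions chosen for illustration, not the recipe actually used for this checkpoint.

```python
# Illustrative only: converting AI feedback (a toxicity classifier) into a scalar
# reward that an RL trainer (e.g. PPO) could maximize. The classifier checkpoint
# and the reward shaping are assumptions, not this model's documented recipe.
from transformers import pipeline

# Any off-the-shelf toxicity classifier can act as the "AI feedback" model.
toxicity_clf = pipeline(
    "text-classification",
    model="unitary/toxic-bert",
    top_k=None,                  # return scores for all toxicity labels
    function_to_apply="sigmoid", # multi-label model: score each label independently
)

def toxicity_reward(summary: str) -> float:
    """Reward is high when the generated summary is judged non-toxic."""
    scores = toxicity_clf([summary])[0]            # list of {label, score} dicts
    worst = max(s["score"] for s in scores)        # worst-case toxicity head
    return 1.0 - worst                             # reward in [0, 1]

# In an RLAIF loop, each sampled summary would be scored like this and the rewards
# passed to a policy-gradient step on the Flan-T5 policy, typically alongside a
# KL penalty that keeps it close to the supervised fine-tuned model.
print(toxicity_reward("John and Mary agreed to meet for lunch at 12:30."))
```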
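
Not part of the original card: a minimal inference sketch for generating a dialogue summary with this checkpoint via transformers. The repository id is a placeholder (the card does not state it here), and the prompt prefix and generation settings are assumptions.

```python
# Minimal inference sketch with transformers. The repository id below is a
# placeholder; substitute the actual Hub id of this model.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "<this-repo-id>"  # placeholder for this model's Hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

dialogue = (
    "Amanda: I baked cookies. Do you want some?\n"
    "Jerry: Sure!\n"
    "Amanda: I'll bring you some tomorrow :-)"
)

# Flan-T5 is instruction-tuned, so a short task prefix is usually sufficient.
inputs = tokenizer("Summarize the following dialogue:\n" + dialogue, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```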