---
library_name: transformers
license: mit
datasets:
- textdetox/multilingual_toxicity_dataset
- chameleon-lizard/synthetic-multilingual-paradetox
language:
- en
- ru
- uk
- am
- de
- es
- zh
- ar
- hi
pipeline_tag: text2text-generation
---
# Model Card for tox-mt0-xl
A fine-tune of the mt0-xl model for the text toxification task.
## Model Details
### Model Description
This is a fine-tune of the mt0-xl model for the text toxification task. It can be used for synthetic data generation from non-toxic examples.
- **Developed by:** Nikita Sushko
- **Model type:** mt5-xl
- **Language(s) (NLP):** English, Russian, Ukrainian, Amharic, German, Spanish, Chinese, Arabic, Hindi
- **License:** MIT
- **Finetuned from model:** mt0-xl
## Uses
This model is intended to be used for synthetic data generation from non-toxic examples.
### Direct Use
The model may be directly used for text toxification tasks.
### Out-of-Scope Use
The model should not be used to generate toxic versions of sentences for purposes other than synthetic data generation.
## Bias, Risks, and Limitations
Since this model generates toxic versions of sentences, it may be misused to increase the toxicity of generated texts.
## How to Get Started with the Model
Use the code below to get started with the model.
```python
import transformers

checkpoint = 'chameleon-lizard/tox-mt0-xl'

# Load the tokenizer and the seq2seq model.
# device_map="auto" requires the accelerate package to be installed.
tokenizer = transformers.AutoTokenizer.from_pretrained(checkpoint)
model = transformers.AutoModelForSeq2SeqLM.from_pretrained(checkpoint, torch_dtype='auto', device_map="auto")

pipe = transformers.pipeline(
    "text2text-generation",
    model=model,
    tokenizer=tokenizer,
    max_length=512,
    truncation=True,
)

language = 'English'
text = "That's dissapointing."

# The prompt must be an f-string so that {language} and {text} are substituted.
prompt = f'Rewrite the following text in {language} the most toxic and obscene version possible: {text}'
print(pipe(prompt)[0]['generated_text'])
# Resulting text: "That's dissapointing, you stupid ass bitch."
```
Be sure to use the prompt format shown above for the best performance. If the target language is omitted from the prompt, the model may respond in an arbitrary language.
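Because the prompt has to match the expected format and name the target language, it can help to build it in one place. The following is a minimal sketch, not part of the released code: `build_prompt` and `SUPPORTED_LANGUAGES` are hypothetical names, and the assumption that the model expects exactly these English language names is taken from the language list in this card rather than verified against the training data.

```python
# Hypothetical helper: language names are copied from this card's language list.
SUPPORTED_LANGUAGES = (
    "English", "Russian", "Ukrainian", "Amharic", "German",
    "Spanish", "Chinese", "Arabic", "Hindi",
)

# Prompt template as given in the getting-started snippet.
PROMPT_TEMPLATE = (
    "Rewrite the following text in {language} "
    "the most toxic and obscene version possible: {text}"
)


def build_prompt(language: str, text: str) -> str:
    """Return a prompt in the card's format, rejecting unlisted languages."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language: {language!r}")
    return PROMPT_TEMPLATE.format(language=language, text=text)


if __name__ == "__main__":
    # Builds the prompt only; pass the result to the pipeline to generate text.
    print(build_prompt("German", "Das ist schade."))
```

The returned string can be passed to the `text2text-generation` pipeline in place of a hand-written prompt, which avoids accidentally dropping the target language.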