raptorkwok/cantonese-chinese-translation-gen1

Text2Text Generation


raptorkwok committed on Aug 9

Commit 59818f4 • 1 Parent(s): fda4979

Update README.md

Files changed (1)

README.md +10 -9

README.md CHANGED

@@ -6,14 +6,13 @@ metrics:
 model-index:
 - name: cantonese-chinese-translation-gen1
   results: []
 ---
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-# cantonese-chinese-translation-gen1
-This model is a fine-tuned version of [fnlp/bart-base-chinese](https://huggingface.co/fnlp/bart-base-chinese) on an unknown dataset.
 It achieves the following results on the evaluation set:
 - Loss: 1.5413
 - Bleu: 40.7808
@@ -22,18 +21,20 @@ It achieves the following results on the evaluation set:
 ## Model description
-More information needed
 ## Intended uses & limitations
-More information needed
 ## Training and evaluation data
-More information needed
 ## Training procedure
 ### Training hyperparameters
 The following hyperparameters were used during training:
@@ -62,4 +63,4 @@ The following hyperparameters were used during training:
 - Transformers 4.28.1
 - Pytorch 2.3.1+cu121
 - Datasets 2.19.1
-- Tokenizers 0.13.3

 model-index:
 - name: cantonese-chinese-translation-gen1
   results: []
+datasets:
+- raptorkwok/cantonese-chinese-dataset-gen2
 ---
+# Cantonese-Written Chinese Translation Model
+This model is a fine-tuned version of [fnlp/bart-base-chinese](https://huggingface.co/fnlp/bart-base-chinese) on [Cantonese-Written Chinese Dataset Gen2](https://huggingface.co/raptorkwok/cantonese-chinese-dataset-gen2).
 It achieves the following results on the evaluation set:
 - Loss: 1.5413
 - Bleu: 40.7808
 ## Model description
+The model is based on BART Chinese model, trained on 1M Cantonese-Written Chinese Parallel Corpus data.
 ## Intended uses & limitations
+Its intended use is to translate Cantonese sentences to Written Chinese accurately.
 ## Training and evaluation data
+Training and evaluation data is provided by the [Cantonese-Written Chinese Dataset Gen2](https://huggingface.co/raptorkwok/cantonese-chinese-dataset-gen2).
 ## Training procedure
+The training was performed using `Seq2SeqTrainer`.
 ### Training hyperparameters
 The following hyperparameters were used during training:
 - Transformers 4.28.1
 - Pytorch 2.3.1+cu121
 - Datasets 2.19.1
+- Tokenizers 0.13.3
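
The updated card states that the model is intended to translate Cantonese sentences to Written Chinese, but the diff shows no inference example. The following is a minimal sketch, not the author's documented usage. It assumes the checkpoint is published under the repo id shown on this page and that it loads with `BertTokenizer` and `BartForConditionalGeneration`, the classes the base model `fnlp/bart-base-chinese` is normally used with. The helper names `load` and `translate`, the `max_length` default, and the sample sentence are hypothetical.

```python
# Assumption: repo id taken from the page header of this commit.
MODEL_ID = "raptorkwok/cantonese-chinese-translation-gen1"


def load(model_id=MODEL_ID):
    """Load tokenizer and model (hypothetical helper, not from the model card)."""
    # Imported lazily so translate() can be exercised without transformers installed.
    # Assumption: the fine-tuned checkpoint keeps the base model's tokenizer class.
    from transformers import BartForConditionalGeneration, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained(model_id)
    model = BartForConditionalGeneration.from_pretrained(model_id)
    return tokenizer, model


def translate(text, tokenizer, model, max_length=128):
    """Translate one Cantonese sentence to Written Chinese (sketch)."""
    inputs = tokenizer(text, return_tensors="pt")
    # Pass only input_ids and attention_mask: BertTokenizer also returns
    # token_type_ids, which BART's generate() does not accept.
    output_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_length=max_length,  # assumed limit, not a value from the card
    )
    decoded = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    # BERT-style decoding puts a space between Chinese characters; strip them.
    # Note that this also removes any legitimate spaces in mixed-script output.
    return decoded.replace(" ", "")


# Usage (downloads the checkpoint on first call):
# tokenizer, model = load()
# print(translate("我哋聽日去邊度食飯？", tokenizer, model))
```

Keeping loading and translation in separate functions lets the model be loaded once and reused across many sentences.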