fdelucaf committed on
Commit
866d69b
1 Parent(s): 2c50863

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -74,7 +74,7 @@ The Galician-Catalan data collected from the web was a combination of the follow
  |Memories Projectes Lliures | 794.631 |
  | **Total** | **4.92.275** |
 
- The datasets were concatentated before filtering to avoid intra-dataset duplicates and the final size was 4.267.995.
+ The datasets were concatenated before filtering to avoid intra-dataset duplicates and the final size was 4.267.995.
  The 5.750.000 sentence pairs of synthetic parallel data were created from a random sampling of the [Projecte Aina ES-CA corpus](https://huggingface.co/projecte-aina/mt-aina-ca-es)
 
  ### Training procedure