Baybars committed on
Commit
4c4db00
1 Parent(s): 1c00c50

readme fixes

Files changed (1)
  1. about.md +16 -10
about.md CHANGED
@@ -1,7 +1,7 @@
1
  ## 📄 About
2
- Natural and efficient TTS in Catalan: using Matcha-TTS with the Catalan language.
3
 
4
- Here you'll be able to find all the information regarding our models Matxa 🍵 and alVoCat 🥑 , which have been trained with the use of deep learning. If you want specific information on how to train these model you can find it [here](https://huggingface.co/BSC-LT/matcha-tts-cat-multispeaker) and [here](https://huggingface.co/BSC-LT/vocos-mel-22khz-cat) respectively. The code we've used is also on Github [here](https://github.com/langtech-bsc/Matcha-TTS/tree/dev-cat).
5
 
6
  ## Table of Contents
7
  <details>
@@ -22,7 +22,7 @@ Here you'll be able to find all the information regarding our models Matxa 🍵
22
 
23
  The significance of open-source text-to-speech (TTS) technologies for minority languages cannot be overstated. These technologies democratize access to TTS solutions by providing a framework for communities to develop and adapt models according to their linguistic needs. This is why we have developed different open-source TTS solutions in Catalan, using an ensemble of technologies.
24
 
25
- Firstly, we created a [TTS model for central Catalan](https://huggingface.co/BSC-LT/matcha-tts-cat-multispeaker) by fine-tuning the Matcha-TTS English model. Matcha-TTS is a state-of-the-art model that employs deep learning, a form of AI, to train models that replicate human speech patterns, allowing it to generate lifelike synthetic voices from written text. After that, we fine-tuned this Catalan central model for three other Catalan dialects:
26
 
27
  * Balear
28
  * North-Occidental
@@ -221,15 +221,15 @@ This version is tailored for the Catalan language, as it was trained only on Cat
221
 
222
  ## Adaptation to Catalan
223
 
224
- The original Matcha-TTS model excels in English, but to bring its capabilities to Catalan, a multi-step process was undertaken. Firstly, we fine-tuned the model from English to Catalan central, which laid the groundwork for understanding the language's nuances. This first fine-tuning was done using two datasets:
225
 
226
  * [Our version of the openslr-slr69 dataset.](https://huggingface.co/datasets/projecte-aina/openslr-slr69-ca-trimmed-denoised)
227
-
228
- * A studio-recorded dataset of central catalan, which will soon be published.
229
-
230
  * [Our version of the Festcat dataset.](https://huggingface.co/datasets/projecte-aina/festcat_trimmed_denoised)
 
 
231
 
232
- This soon to be published dataset also included recordings of three different dialects:
233
 
234
  * Valencian
235
 
@@ -275,13 +275,19 @@ If this code contributes to your research, please cite the work:
275
  The Language Technologies Unit from Barcelona Supercomputing Center.
276
 
277
  ### Contact
278
- For further information, please send an email to <langtech@bsc.es>.
279
 
280
  ### Copyright
281
  Copyright(c) 2023 by Language Technologies Unit, Barcelona Supercomputing Center.
282
 
283
  ### License
284
- [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0)
 
 
 
285
 
286
  ### Funding
287
  This work has been promoted and financed by the Generalitat de Catalunya through the [Aina project](https://projecteaina.cat/).
 
 
 
 
1
  ## 📄 About
2
+ Natural and efficient TTS in Catalan: 🍵+🥑 .
3
 
4
+ Here you'll be able to find all the information regarding our models 🍵 Matxa and 🥑 alVoCat, which have been trained with the use of deep learning. If you want specific information on how to train these models you can find it [here](https://huggingface.co/BSC-LT/matcha-tts-cat-multiaccent) and [here](https://huggingface.co/BSC-LT/vocos-mel-22khz-cat) respectively. The code we've used is also on GitHub [here](https://github.com/langtech-bsc/Matcha-TTS/tree/dev-cat).
5
 
6
  ## Table of Contents
7
  <details>
 
22
 
23
  The significance of open-source text-to-speech (TTS) technologies for minority languages cannot be overstated. These technologies democratize access to TTS solutions by providing a framework for communities to develop and adapt models according to their linguistic needs. This is why we have developed different open-source TTS solutions in Catalan, using an ensemble of technologies.
24
 
25
+ Firstly, we created a [TTS model for central Catalan](https://huggingface.co/BSC-LT/matcha-tts-cat-multispeaker) by fine-tuning the Matcha-TTS English model. Matcha-TTS is a state-of-the-art model that employs deep learning, a form of AI, to train models that replicate human speech patterns, allowing it to generate lifelike synthetic voices from written text. After that, we fine-tuned this Catalan central model for four Catalan dialects, central plus three more:
26
 
27
  * Balear
28
  * North-Occidental
 
221
 
222
  ## Adaptation to Catalan
223
 
224
+ The original Matcha-TTS model excels in English, but to bring its capabilities to Catalan, a multi-step process was undertaken. Firstly, we fine-tuned the model from English to Catalan central (Matxa-base), which laid the groundwork for understanding the language's nuances. This first fine-tuning from English was done using two datasets:
225
 
226
  * [Our version of the openslr-slr69 dataset.](https://huggingface.co/datasets/projecte-aina/openslr-slr69-ca-trimmed-denoised)
227
+
 
 
228
  * [Our version of the Festcat dataset.](https://huggingface.co/datasets/projecte-aina/festcat_trimmed_denoised)
229
+
230
+ Then we further fine-tuned the single-accent Catalan Matxa-based model with the soon-to-be-published LaFrescat dataset, which has 8.5 hours of recordings for four dialectal variants:
231
 
232
+ * Central
233
 
234
  * Valencian
235
 
 
275
  The Language Technologies Unit from Barcelona Supercomputing Center.
276
 
277
  ### Contact
278
+ For further information, please email <langtech@bsc.es>.
279
 
280
  ### Copyright
281
  Copyright(c) 2023 by Language Technologies Unit, Barcelona Supercomputing Center.
282
 
283
  ### License
284
+ The demo page and the inference scripts are licensed under the [GNU General Public License v3.0](https://www.gnu.org/licenses/gpl-3.0.en.html).
285
+
286
+ The model weights are licensed under [Creative Commons Attribution-NonCommercial 4.0](https://www.creativecommons.org/licenses/by-nc/4.0/). These models are free to use for non-commercial and research purposes. Commercial use is only possible through licensing by
287
+ the voice artists. For further information, contact <langtech@bsc.es> and <lafrescaproduccions@gmail.com>, or see the [model page](https://huggingface.co/BSC-LT/matcha-tts-cat-multiaccent/).
288
 
289
  ### Funding
290
  This work has been promoted and financed by the Generalitat de Catalunya through the [Aina project](https://projecteaina.cat/).
291
+
292
+ Part of the model training was made possible by compute time provided by the Galician Supercomputing Center CESGA
293
+ ([Centro de Supercomputación de Galicia](https://www.cesga.es/)), and by the [Barcelona Supercomputing Center](https://www.bsc.es/) on MareNostrum 5.