---
license: apache-2.0
library_name: JoeyNMT
task: Machine-translation
tags:
- JoeyNMT
- Machine-translation
language:
- en
- de
- fr
- multilingual
datasets:
- may-ohta/iwslt14
metrics:
- bleu
---

# JoeyNMT: iwslt14 de-en-fr multilingual

This is a JoeyNMT model for multilingual MT with language tags, built for demonstration purposes. The model was trained on IWSLT14 de-en and en-fr parallel data using DDP (distributed data parallel).

Install [JoeyNMT](https://github.com/joeynmt/joeynmt) v2.3:

```
$ pip install git+https://github.com/joeynmt/joeynmt.git
```

## Translation

Torch hub interface:

```python
import torch

iwslt14 = torch.hub.load("joeynmt/joeynmt", "iwslt14_prompt")
translation = iwslt14.translate(
    src=["Hello world!"],  # src sentence
    src_prompt=[""],       # src language code
    trg_prompt=[""],       # trg language code
    beam_size=1,
)
print(translation)  # ["Hallo Welt!"]
```

(See the [jupyter notebook](https://github.com/joeynmt/joeynmt/blob/main/notebooks/torchhub.ipynb) for details)

## Training

```
$ python -m joeynmt train iwslt14_prompt/config.yaml --use-ddp --skip-test
```

(See `train.log` for details)

## Evaluation

```
$ git clone https://huggingface.co/may-ohta/iwslt14_prompt
$ python -m joeynmt test iwslt14_prompt/config.yaml --output-path iwslt14_prompt/hyp
```

direction | bleu
--------- | :----
en->de    | 28.88
de->en    | 35.28
en->fr    | 38.86
fr->en    | 40.35

- beam_size: 5
- beam_alpha: 1.0
- sacrebleu signature `nrefs:1|case:lc|eff:no|tok:13a|smooth:exp|version:2.4.0`

(See `test.log` for details)

## Data Format

We downloaded IWSLT14 de-en and en-fr from [https://wit3.fbk.eu/2014-01](https://wit3.fbk.eu/2014-01) and created `{train|dev|test}.tsv` files in the following format:

|src_prompt|src|trg_prompt|trg|
|:---------|:--|:---------|:--|
|``|Hello.|``|Hallo.|
|``|Vielen Dank!|``|Thank you!|

(See `test.ref.de-en.tsv`)
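
The four-column layout above can be read with Python's standard `csv` module. The following is a minimal sketch, not part of JoeyNMT: the function name `read_prompt_tsv` is hypothetical, it assumes the file is tab-separated and starts with a header row naming the four columns, and the language-tag strings in the sample (`<en>`, `<de>`) are placeholder values chosen for illustration only. Check `test.ref.de-en.tsv` for the actual tags and layout.

```python
import csv
import io

# Column names taken from the table in this section.
COLUMNS = ["src_prompt", "src", "trg_prompt", "trg"]


def read_prompt_tsv(handle):
    """Read a tab-separated file with the four columns above into a list of dicts.

    Assumption: the first row is a header with exactly the names in COLUMNS.
    """
    # QUOTE_NONE keeps quotation marks inside sentences as literal characters.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    if reader.fieldnames != COLUMNS:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    return list(reader)


# In-memory sample; "<en>" and "<de>" are hypothetical placeholder tags.
SAMPLE = (
    "src_prompt\tsrc\ttrg_prompt\ttrg\n"
    "<en>\tHello.\t<de>\tHallo.\n"
    "<de>\tVielen Dank!\t<en>\tThank you!\n"
)

rows = read_prompt_tsv(io.StringIO(SAMPLE))
for row in rows:
    print(row["src_prompt"], row["src"], "->", row["trg_prompt"], row["trg"])
```

To read a real file, pass an open handle instead of the in-memory sample, for example `open("test.ref.de-en.tsv", encoding="utf-8", newline="")`.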