Character Mapping and Ad-hoc Adaptation: Edinburgh’s IWSLT 2020 Open Domain Translation System

Pinzhen Chen, Nikolay Bogoychev, Ulrich Germann


Abstract
This paper describes the University of Edinburgh’s neural machine translation systems submitted to the IWSLT 2020 open domain Japanese↔Chinese translation task. On top of commonplace techniques like tokenisation and corpus cleaning, we explore character mapping and unsupervised decoding-time adaptation. Our techniques focus on leveraging the provided data, and we show the positive impact of each technique through the gradual improvement of BLEU.
Anthology ID:
2020.iwslt-1.14
Volume:
Proceedings of the 17th International Conference on Spoken Language Translation
Month:
July
Year:
2020
Address:
Online
Editors:
Marcello Federico, Alex Waibel, Kevin Knight, Satoshi Nakamura, Hermann Ney, Jan Niehues, Sebastian Stüker, Dekai Wu, Joseph Mariani, Francois Yvon
Venue:
IWSLT
SIG:
SIGSLT
Publisher:
Association for Computational Linguistics
Pages:
122–129
URL:
https://aclanthology.org/2020.iwslt-1.14
DOI:
10.18653/v1/2020.iwslt-1.14
Cite (ACL):
Pinzhen Chen, Nikolay Bogoychev, and Ulrich Germann. 2020. Character Mapping and Ad-hoc Adaptation: Edinburgh’s IWSLT 2020 Open Domain Translation System. In Proceedings of the 17th International Conference on Spoken Language Translation, pages 122–129, Online. Association for Computational Linguistics.
Cite (Informal):
Character Mapping and Ad-hoc Adaptation: Edinburgh’s IWSLT 2020 Open Domain Translation System (Chen et al., IWSLT 2020)
PDF:
https://aclanthology.org/2020.iwslt-1.14.pdf
Video:
http://slideslive.com/38929590
Code:
marian-nmt/marian