Modular Septilingual Neural Machine Translation

Name
Taido Purason
Abstract
Currently, the majority of state-of-the-art multilingual neural machine translation systems use a single universal model that fully shares parameters between all language pairs. The University of Tartu Neural Machine Translation system also uses the universal architecture and therefore suffers from the problems associated with it, such as limited capacity per language pair. Previous research has shown that a modularized approach with language-specific encoders and decoders successfully addresses many of the universal model's shortcomings. This thesis applies the modularized architecture to improve the University of Tartu translation system. The models are trained on a dataset that contains 7 languages and is orders of magnitude larger than the datasets used in previous work. The modularized model achieves significantly higher BLEU scores than both the University of Tartu model and the baseline universal model on all language pairs.
Graduation Thesis language
English
Graduation Thesis type
Bachelor - Computer Science
Supervisor(s)
Andre Tättar, Elizaveta Korotkova
Defence year
2021
 