Continuous Learning for Multilingual Neural Machine Translation

Name
Dmytro Kolesnykov
Abstract
As the volume of text data grows, so does the demand for automatic translation systems. Most large companies are developing their own translation engines to compete in this field. There is a particular need for universal multilingual models that can, ideally, translate between any pair of languages. This work aims to establish a capable multilingual translation system that continues learning from monolingual in-domain input data. The goal is to improve the performance of the multilingual NMT system and to transfer knowledge to unseen language pairs without any additional models or parallel data sources. We describe our adaptation of back-translation, a practical data augmentation approach, to continuous learning. Results are reported for English, Russian, and Estonian using only publicly available data.
Graduation Thesis language
English
Graduation Thesis type
Master - Computer Science
Supervisor(s)
Andre Tättar, Mark Fišel
Defence year
2021
 