Institute of Computer Science - Graduation Theses Registry


Continuous Learning for Multilingual Neural Machine Translation

Name

Dmytro Kolesnykov

Abstract

As the amount of text data grows, so does the demand for automatic translation systems. Most large companies are developing their own translation engines to compete in this field. In particular, there is a need for universal multilingual models that, ideally, can translate between any pair of languages. This work aims to establish a decent multilingual translation system that continues to learn from monolingual in-domain inputs. The goal is to improve the performance of the multilingual NMT system and to transfer knowledge to unseen language pairs without any additional models or parallel data sources. We describe our adaptation of back-translation, a practical approach to data augmentation, to continuous learning. Results are reported for English, Russian, and Estonian using only publicly available data.

Graduation Thesis language

English

Graduation Thesis type

Master - Computer Science

Supervisor(s)

Andre Tättar, Mark Fišel

Defence year

2021

