Towards Phrase-based Unsupervised Machine Translation: Phrase Representations

Maksym Del
Current unsupervised machine translation models, despite achieving promising results, still perform modestly compared to supervised approaches. This work aims to take an important step towards a new research direction: Phrase-based Unsupervised Machine Translation. Since current word-based models rely on word representations, phrase-based models require appropriate phrase representations. These representations should be learned without supervision, address the issues specific to multiword expressions, and form an embedding space that satisfies certain conditions for unsupervised translation to perform reasonably. We specify what makes phrase representations effective for unsupervised machine translation, define an unsupervised compositional modeling framework for phrases, and show how to use this framework to satisfy the proposed requirements, thus obtaining effective phrase representations. We make the code and trained models publicly available as an open source project.
Graduation Thesis language
Graduation Thesis type
Master - Computer Science
Mark Fishel
Defence year