Grammatical Error Correction via Multilingual Neural Machine Translation
Name
Agnes Luhtaru
Abstract
We introduce an approach to grammatical error correction that does not require annotated training data. We train a multilingual neural machine translation model using only parallel translation data. Openly available translation corpora are far more plentiful than grammatical error correction corpora, especially for low-resource languages like Estonian. We find that this system has high recall but low precision: it corrects many mistakes but also introduces many errors into text that was already correct. Adding artificial mistakes to the training data increases recall and substantially improves spelling error correction. Our model reliably corrects grammatical errors, such as subject-verb agreement and noun number, but struggles with lexical errors and tends to paraphrase unnecessarily.
Graduation Thesis language
Estonian
Graduation Thesis type
Bachelor - Computer Science
Supervisor(s)
Mark Fišel
Defence year
2020