Institute of Computer Science thesis topic register


Learning the vector space of morphological transformations
Organisation name: Natural Language Processing
Summary: The ability to generate morphologically inflected (for nouns) or conjugated (for verbs) word forms is important for many natural language processing systems, especially for morphologically complex languages such as Estonian.

The goal of this project is to learn the vector space of morphological transformations using the well-known TransE model (https://papers.nips.cc/paper/5071-translating-embeddings-for-modeling-multi-relational-data.pdf), which was developed for the relation prediction task. Relation prediction systems learn from fact triples (head entity, relation, tail entity) by embedding all entities and relations in a low-dimensional dense vector space. In this project the same method will be applied to morphological triples (base form, morphological transformation, inflected/conjugated form).
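
To make the idea concrete, here is a minimal sketch of the TransE scoring function and its margin ranking loss applied to morphological triples. It is an illustration only, not the C++ implementation mentioned in this topic. The 2-dimensional embeddings are set by hand, and the relation name `sg_inessive` and the function names are hypothetical.

```python
import math

def l2_distance(h, r, t):
    """TransE dissimilarity ||h + r - t||_2; lower means a more plausible triple."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))

def margin_loss(pos, neg, margin=1.0):
    """Margin ranking loss for one (correct, corrupted) pair of triples."""
    return max(0.0, margin + l2_distance(*pos) - l2_distance(*neg))

# Toy embeddings chosen by hand (assumption: real ones would be learned).
emb = {
    "maja": [0.0, 0.0],    # base form ("house")
    "majas": [1.0, 0.5],   # inessive singular ("in the house")
    "majad": [-1.0, 2.0],  # nominative plural ("houses")
}
rel = {"sg_inessive": [1.0, 0.5]}  # hypothetical transformation label

# Correct triple, and a corrupted triple with the wrong tail.
pos = (emb["maja"], rel["sg_inessive"], emb["majas"])
neg = (emb["maja"], rel["sg_inessive"], emb["majad"])

print(l2_distance(*pos))      # distance of the correct triple
print(l2_distance(*neg))      # distance of the corrupted triple
print(margin_loss(pos, neg))  # zero once the correct triple wins by the margin
```

Training would adjust the embeddings by gradient descent so that this loss shrinks over many such pairs; the sketch only evaluates it for fixed vectors.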

The project involves two steps:
1) Prepare the training and test data from morphologically annotated Multext-East corpora.
2) Conduct experiments with the TransE model (C++ code available).
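
The data preparation step could look roughly like the sketch below. It assumes the corpus annotations have already been exported to tab-separated lines of the form `word form<TAB>lemma<TAB>tag`; this is a simplifying assumption, not the actual Multext-East file format. The tags shown are placeholders rather than real Multext-East morphosyntactic codes, and all function names are hypothetical.

```python
import random

def lines_to_triples(lines):
    """Turn 'form<TAB>lemma<TAB>tag' lines into unique (lemma, tag, form) triples."""
    triples = set()
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        form, lemma, tag = parts
        triples.add((lemma.lower(), tag, form.lower()))
    return sorted(triples)

def split_triples(triples, test_fraction=0.2, seed=0):
    """Shuffle deterministically and split into (train, test)."""
    shuffled = list(triples)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hand-made sample in the assumed format (placeholder tags).
sample = [
    "majas\tmaja\tN+sg+ine",
    "majad\tmaja\tN+pl+nom",
    "Majas\tmaja\tN+sg+ine",  # duplicate once lowercased
    "",                        # blank line, ignored
    "loeb\tlugema\tV+3sg",
    "loen\tlugema\tV+1sg",
    "raamatud\traamat\tN+pl+nom",
]

triples = lines_to_triples(sample)
train, test = split_triples(triples)
print(len(triples), len(train), len(test))
```

A random split over triples leaves some lemmas shared between the training and test sets; whether that is desirable depends on how the experiments are designed.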
Thesis defence year: 2016-2017
Supervisor: Kairit Sirts
Communication language(s): Estonian, English
Requirements for candidates: Familiarity with C++, interest in working with natural language data
Level: Bachelor, Master
Keywords:
Application contact
Name: Kairit Sirts
Tel:
E-mail: kairit.sirts@ut.ee


ati.study@lists.ut.ee