Programme for Automatic Generation and Verification of Computer Dictionary Definitions Developed on the Example of Estonian Wordnet

Name
Kristo Markov
Abstract
The purpose of this thesis was to automatically generate definitions for the terms in Estonian Wordnet that lack them. The theoretical section presents the history of computer lexicons, describes wordnet lexicons, and reviews the methods by which they are created. It then outlines the creation, content and features of Estonian Wordnet and gives guidance on formalising definitions.

As a result of the work, a program was developed that produces definitions for terms that lack them in Estonian Wordnet, using four different methods. It was concluded that definitions can be generated automatically, but no method produced only correct definitions, so every generated definition had to be verified. A total of 11,075 definitions were generated for the 18,731 missing definitions. The largest share (5,469 definitions, or about 50% of those generated) came from the unique synset member method. The similarity method and the unique synset member method were the most accurate: 91% and 84% of their definitions, respectively, were judged to be fitting.
Graduation Thesis language
Estonian
Graduation Thesis type
Bachelor - Computer Science
Supervisor(s)
Heili Orav, Indrek Jentson
Defence year
2022
 