Implementing a tool for extracting semantic propositions from dependency trees

Organisation name
Natural Language Processing
Natural language is used to convey ideas, which are called semantic propositions. The content of a text or a speech transcription can be quantified by the number of propositions (ideas) expressed in it. The goal of this project is to implement a tool that extracts propositions from natural language text based on dependency parse trees. Dependency parse trees express the syntactic and semantic structure of sentences as directed graphs. The resulting tool is potentially useful for neurolinguists who study clinical speech (e.g. the speech of patients with aphasia or dementia).

The project involves the following steps:
1) Using the Stanford CoreNLP Java API, a collection of tools and libraries that implement various natural language processing methods, implement a tool that extracts parts of dependency trees matching a given search pattern. This step only involves using the functionality of the existing API; there is no need to write new functionality.

2) Develop a set of rules, expressed as dependency tree patterns, that correspond to semantic propositions. An exhaustive description of semantic propositions for English has been published, and the goal of this part of the project is to express these descriptions in terms of dependency tree patterns.
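
To make the two steps more concrete, here is a minimal sketch of how a rule might map a dependency pattern to a proposition. It deliberately does not call CoreNLP: a hand-written list of edges stands in for the parser output, and in the actual project the parse and the pattern matching would come from the existing CoreNLP API (for example its Semgrex pattern language). The class PropositionSketch, the Dep type and both rules are hypothetical illustrations, not the published proposition inventory. The relation labels (nsubj, obj, amod) assume Universal Dependencies naming; older Stanford Dependencies output uses dobj for direct objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: rules over typed dependency edges, no CoreNLP required.
final class PropositionSketch {

    /** One typed dependency edge, read as relation(governor, dependent). */
    static final class Dep {
        final String relation;
        final String governor;
        final String dependent;

        Dep(String relation, String governor, String dependent) {
            this.relation = relation;
            this.governor = governor;
            this.dependent = dependent;
        }
    }

    /** Returns all dependents attached to the governor by the given relation. */
    static List<String> dependentsOf(List<Dep> deps, String governor, String relation) {
        List<String> out = new ArrayList<>();
        for (Dep d : deps) {
            if (d.governor.equals(governor) && d.relation.equals(relation)) {
                out.add(d.dependent);
            }
        }
        return out;
    }

    /** Applies two illustrative rules and returns the propositions as strings. */
    static List<String> extractPropositions(List<Dep> deps) {
        List<String> props = new ArrayList<>();
        // Rule 1 (assumed): a word with both a subject and an object
        // yields a predicate proposition: verb(subject, object).
        for (Dep subj : deps) {
            if (!subj.relation.equals("nsubj")) {
                continue;
            }
            for (String obj : dependentsOf(deps, subj.governor, "obj")) {
                props.add(subj.governor + "(" + subj.dependent + ", " + obj + ")");
            }
        }
        // Rule 2 (assumed): an adjectival modifier yields a modifier
        // proposition: adjective(noun).
        for (Dep d : deps) {
            if (d.relation.equals("amod")) {
                props.add(d.dependent + "(" + d.governor + ")");
            }
        }
        return props;
    }

    public static void main(String[] args) {
        // Hand-written edges for "The old dog chased the cat".
        List<Dep> deps = Arrays.asList(
                new Dep("nsubj", "chased", "dog"),
                new Dep("obj", "chased", "cat"),
                new Dep("amod", "dog", "old"),
                new Dep("det", "dog", "The"),
                new Dep("det", "cat", "the"));
        for (String p : extractPropositions(deps)) {
            System.out.println(p);
        }
    }
}
```

With this input the sketch yields two propositions, chased(dog, cat) and old(dog), so the proposition count for the sentence would be 2. Determiner edges match no rule and therefore contribute nothing, which is the kind of decision the rule set in step 2 has to encode explicitly.
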
Thesis defence year
Kairit Sirts
Estonian, English
Requirements for applicants
Familiarity with the Java programming language; interest in working with natural language data
Bachelor's, Master's

Application contact

Kairit Sirts