Implementing a tool for extracting semantic propositions from dependency trees

Organization
Natural Language Processing
Abstract
Natural language is used to convey ideas, which are called semantic propositions. The content of a text or a speech transcript can be quantified by the number of propositions (ideas) expressed in it. The goal of this project is to implement a tool that extracts propositions from natural language text based on dependency parse trees. Dependency parse trees express the syntactic and semantic structure of sentences as directed graphs. The resulting tool is potentially useful to neurolinguists studying clinical speech (e.g. the speech of patients with aphasia or dementia).

The project involves the following steps:
1) Using the Stanford CoreNLP Java API (http://stanfordnlp.github.io/CoreNLP/), a collection of tools and libraries that implement various natural language processing methods, implement a tool that extracts the parts of a dependency tree that match a given search pattern. This step only involves using the functionality of the existing API; there is no need to write new functionality.

2) Develop a set of rules, expressed as dependency tree patterns, that correspond to semantic propositions. An exhaustive description of semantic propositions for English has been published (https://www.researchgate.net/publication/267362822_Analysis_of_Idea_Density_AID_A_Manual); the goal of this part of the project is to express these descriptions in terms of dependency tree patterns.
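
The two steps above can be illustrated with a minimal Java sketch. It is only a toy model under stated assumptions: the dependency edges are written by hand rather than produced by CoreNLP, a search pattern is reduced to a regular expression over the relation label, and the rule table (nsubj, amod, advmod) is a hypothetical illustration, not taken from the published manual. All class and method names (PropositionSketch, Edge, Proposition, findEdges, extract, exampleTree) are invented for this example. In the actual project, the pattern search of step 1 would presumably be delegated to functionality already in CoreNLP, such as its Semgrex pattern language for dependency graphs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PropositionSketch {

    /** One dependency edge: governor --relation--> dependent (hypothetical type). */
    static final class Edge {
        final String governor, relation, dependent;
        Edge(String governor, String relation, String dependent) {
            this.governor = governor;
            this.relation = relation;
            this.dependent = dependent;
        }
    }

    /** A proposition produced by a rule; the type names are illustrative only. */
    static final class Proposition {
        final String type, head, argument;
        Proposition(String type, String head, String argument) {
            this.type = type;
            this.head = head;
            this.argument = argument;
        }
        @Override
        public String toString() {
            return type + "(" + head + ", " + argument + ")";
        }
    }

    // Hypothetical rule table: dependency relation -> proposition type.
    static final Map<String, String> RULES = new LinkedHashMap<>();
    static {
        RULES.put("nsubj", "predication");
        RULES.put("amod", "modification");
        RULES.put("advmod", "modification");
    }

    /** Step 1: return every edge whose relation label matches the given regex. */
    static List<Edge> findEdges(List<Edge> tree, String relationRegex) {
        List<Edge> hits = new ArrayList<>();
        for (Edge e : tree) {
            if (e.relation.matches(relationRegex)) {
                hits.add(e);
            }
        }
        return hits;
    }

    /** Step 2: apply each rule in turn and collect the resulting propositions. */
    static List<Proposition> extract(List<Edge> tree) {
        List<Proposition> props = new ArrayList<>();
        for (Map.Entry<String, String> rule : RULES.entrySet()) {
            for (Edge e : findEdges(tree, rule.getKey())) {
                props.add(new Proposition(rule.getValue(), e.governor, e.dependent));
            }
        }
        return props;
    }

    /** Hand-written edges for the sentence "The old man walks slowly." */
    static List<Edge> exampleTree() {
        return Arrays.asList(
            new Edge("walks", "nsubj", "man"),
            new Edge("man", "det", "The"),
            new Edge("man", "amod", "old"),
            new Edge("walks", "advmod", "slowly"));
    }

    public static void main(String[] args) {
        // Print one line per extracted proposition.
        for (Proposition p : extract(exampleTree())) {
            System.out.println(p);
        }
    }
}
```

In this sketch the determiner edge matches no rule and therefore yields no proposition, so the example sentence produces three propositions. Keeping the rules in a table separate from the matching code means that new patterns can be added without touching the extraction logic.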
Graduation thesis defence year
2016-2017
Supervisor
Kairit Sirts
Spoken language(s)
Estonian, English
Requirements for candidates
Familiarity with the Java programming language; interest in working with natural language data
Level
Bachelor's, Master's
Contact for applications
Name
Kairit Sirts
E-mail
kairit.sirts@ut.ee