Query answering under uncertainty

Organisation name
Chair of Data Science
Summary
Uncertainty is an inherent aspect of data. In databases, uncertainty shows up as violations of integrity constraints. For instance, a primary key constraint on EmployeeID might be violated when data from different company registers are integrated, because two registers may hold conflicting records for the same ID. While these violations can be resolved by deleting tuples, this may lead to a loss of valuable information.

A database is called inconsistent if it violates at least one of its integrity constraints, and consistent otherwise. The paradigm of consistent query answering addresses the challenge of handling inconsistent databases without discarding data [1,2]. A repair R of an inconsistent database D is any consistent database that remains as close as possible to D. We explore the following question: in what percentage of the repairs of D does a Boolean query Q evaluate to true? This question is computationally hard in general, unless the underlying graph of D (and Q) can be assumed to have a tree-like structure, meaning that it has small treewidth [3]. The objective of this thesis is to algorithmically inspect the treewidth of artificial and/or real-life inconsistent databases. The thesis is tied to ongoing research and may lead to co-authorship of a research publication.
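
To make the question concrete, the following is a minimal brute-force sketch in Python. It assumes that the only integrity constraint is a primary key and that a repair is a subset repair, that is, a maximal consistent subset of D, obtained by keeping exactly one tuple for each key value. The Employee relation, its sample rows, and the function names `repairs` and `repair_fraction` are hypothetical and chosen only for illustration. The sketch enumerates every repair, so its running time grows exponentially with the number of conflicting key values; it illustrates the definition and is not a practical algorithm.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product


def repairs(tuples, key):
    """Yield every subset repair under a primary key constraint.

    Assumption: a repair keeps exactly one tuple per key value.
    """
    groups = defaultdict(list)
    for t in tuples:
        groups[key(t)].append(t)
    # One choice per group of tuples that share a key value.
    for choice in product(*groups.values()):
        yield list(choice)


def repair_fraction(tuples, key, query):
    """Return the exact fraction of repairs on which the Boolean query holds."""
    total = 0
    satisfying = 0
    for repair in repairs(tuples, key):
        total += 1
        if query(repair):
            satisfying += 1
    return Fraction(satisfying, total)


# Hypothetical relation Employee(EmployeeID, Name, Dept).
# IDs 1 and 2 each occur twice, so the primary key is violated.
employees = [
    (1, "Ann", "Sales"),
    (1, "Ann", "IT"),
    (2, "Bob", "IT"),
    (2, "Bob", "HR"),
    (3, "Cid", "HR"),
]


def someone_in_sales(repair):
    """Boolean query Q: does some employee work in Sales?"""
    return any(dept == "Sales" for _, _, dept in repair)


fraction = repair_fraction(employees, key=lambda t: t[0], query=someone_in_sales)
print(fraction)  # 1/2
```

Here the database has 2 × 2 × 1 = 4 repairs, and Q holds in the two repairs that keep the tuple (1, "Ann", "Sales"), so Q is true in 50% of the repairs.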

[1] Marcelo Arenas, Leopoldo E. Bertossi, and Jan Chomicki. Consistent query answers in inconsistent databases. In PODS, pages 68–79. ACM Press, 1999.
[2] Leopoldo E. Bertossi. Database Repairing and Consistent Query Answering. Synthesis Lectures on Data Management. Morgan & Claypool Publishers, 2011.
[3] Treewidth. Wikipedia. https://en.wikipedia.org/wiki/Treewidth
Year of thesis defence
2024-2025
Supervisor
Miika Hannula
Communication language(s)
English
Requirements for candidates
Level
Master's
Keywords
#query, #database

Application contact

 
Name
Miika Hannula
Phone
E-mail
hannula@ut.ee