Generative AI in data quality management (multiple topics, see description)

Organization
Software Engineering and Information Systems
Abstract
To preserve the value of data in subsequent use, the data must meet quality requirements. Verifying the quality of data, especially third-party data (data produced or collected by a source other than the data user), is a time- and effort-consuming task. Moreover, even relatively simple data quality checks require skills and knowledge that the data user may not have.
Given recent advances in Generative AI (GenAI), it is expected to contribute to data quality management (DQM). This thesis will explore where and how GenAI can assist data quality management (across the entire DQM lifecycle or in a selected phase).
The work can range from (1) reviewing the current capabilities of DQM tools (a systematic review of at least 50 tools and testing thereof, followed by the identification of scenarios and a "research agenda") to (2) analysing scenarios in which GenAI can be used and evaluating it from a multi-dimensional perspective, e.g., (2a) exploring the data quality requirements extraction capabilities of GenAI tools (with and without predefined data quality dimensions, according to a predefined DQ dimensions classification), or (2b) examining real-world scenarios (interviews/surveys with real users/companies).
Graduation Theses defence year
2024-2025
Supervisor
Anastasija Nikiforova
Spoken language(s)
English
Requirements for candidates
Level
Bachelor, Masters
Keywords
#SEIS, #dataquality, #dataqualitymanagement, #metadata, #machinelearning, #ai, #artificialintelligence, #generativeai, #genai, #llm, #largelanguagemodel

Application of contact

 
Name
Anastasija Nikiforova
Phone
E-mail
anastasija.nikiforova@ut.ee