Generative AI in data quality management (multiple topics, see description)
Organisation name
Software Engineering and Information Systems
Abstract
For data to retain their value in subsequent use, they must meet data quality requirements. Before data are used, especially third-party data (data produced or collected by a source other than the data user), their quality should therefore be verified, which is a time-consuming and effort-intensive task. Moreover, carrying out even relatively simple data quality checks requires skills and knowledge that the data user may not have.
Given recent advancements in Generative AI (GenAI), it is expected to contribute to data quality management (DQM). This thesis would explore where and how GenAI can assist DQM (across the entire DQM lifecycle or in a selected phase).
This can span from (1) reviewing the current capabilities of DQM tools (a systematic review of at least 50 tools and testing thereof, followed by the identification of scenarios and a "research agenda"), to (2) analysing scenarios in which GenAI can be used and evaluating them from a multi-dimensional perspective, e.g., (2a) exploring the data quality requirements extraction capabilities of GenAI tools (with and without predefined data quality dimensions, according to a predefined DQ dimensions classification), or (2b) examining real-world scenarios (interviews/surveys with real users/companies).
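As an illustration of the "relatively simple data quality checks" mentioned above, the following minimal Python sketch computes two commonly used data quality dimensions, completeness and uniqueness, for a small in-memory table. The sample records, the column names, and the helper functions `completeness` and `duplicated_values` are hypothetical and not part of the topic description; the choice of these two dimensions is an assumption made for illustration only.

```python
from collections import Counter

# Hypothetical sample records; None marks a missing value.
records = [
    {"id": 1, "email": "a@example.org", "country": "EE"},
    {"id": 2, "email": None, "country": "LV"},
    {"id": 2, "email": "c@example.org", "country": None},
    {"id": 4, "email": "d@example.org", "country": "EE"},
]


def completeness(rows, column):
    """Share of rows in which `column` holds a non-missing value."""
    if not rows:
        return 0.0
    present = sum(1 for row in rows if row.get(column) not in (None, ""))
    return present / len(rows)


def duplicated_values(rows, column):
    """Non-missing values of `column` that occur more than once."""
    counts = Counter(
        row.get(column) for row in rows if row.get(column) is not None
    )
    return sorted(value for value, n in counts.items() if n > 1)


report = {
    "completeness": {
        col: completeness(records, col) for col in ("id", "email", "country")
    },
    "duplicated_ids": duplicated_values(records, "id"),
}
print(report)
```

Even a check this small presupposes that the data user knows which dimensions matter and how to express them in code, which is the gap a GenAI assistant might help to close.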
Year of thesis defence
2024-2025
Supervisor
Anastasija Nikiforova
Language(s) of communication
English
Requirements for the applicant
Level
Bachelor's, Master's
Keywords
Application contact
Name
Anastasija Nikiforova
Tel
E-mail