Institute of Computer Science - Graduation Theses Registry


A Comperative Study of Database Schema on Query Performance in Data Warehousing

Name

Roland Pajuleht

Abstract

Decision-making in organizations is often hampered by difficult-to-maintain dashboards and the archaic data architecture behind them. The effort of maintaining this architecture over the years is mitigated by advances in the hardware and software it is built upon. The modern data stack is forgiving in terms of schema selection and data velocity. This does not mean, however, that the fundamental architectural concepts that databases are built upon should be forgotten.
This paper compares query performance and execution plans between two approaches to data modelling. Dimensional modelling, a standard procedure for building data warehouses, is compared with a less standardized model that tends to emerge when concrete data architecture procedures are not in place. Several analytical queries are run against a standard, normalized star schema and against a table with a more relaxed form, often called One Big Table. While queries against the wide table were more readable, performance issues quickly emerged. When operating in traditional data warehouses, data engineers must adhere to established architectural practices in order to maintain an efficient database.
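
To illustrate the kind of comparison described in the abstract, the following is a minimal sketch rather than the thesis's actual setup. It assumes SQLite (through Python's built-in `sqlite3` module) as a stand-in engine, and the table names `fact_sales`, `dim_product` and `obt_sales`, the sample rows, and the helper `query_plan` are all hypothetical. It runs the same aggregate against a small star schema and against a One Big Table, then collects each query's execution plan.

```python
import sqlite3

# Hypothetical toy data: one fact table with one dimension (star schema)
# and an equivalent wide table (One Big Table). Not taken from the thesis.
def build_demo_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dim_product (
            product_id INTEGER PRIMARY KEY,
            category   TEXT
        );
        CREATE TABLE fact_sales (
            sale_id    INTEGER PRIMARY KEY,
            product_id INTEGER REFERENCES dim_product(product_id),
            amount     REAL
        );
        CREATE TABLE obt_sales (
            sale_id  INTEGER PRIMARY KEY,
            category TEXT,
            amount   REAL
        );
        INSERT INTO dim_product VALUES (1, 'books'), (2, 'games');
        INSERT INTO fact_sales VALUES (1, 1, 10.0), (2, 1, 5.0), (3, 2, 20.0);
        INSERT INTO obt_sales VALUES
            (1, 'books', 10.0), (2, 'books', 5.0), (3, 'games', 20.0);
    """)
    return conn

# Star schema: the category lives in the dimension, so a join is required.
STAR_QUERY = """
    SELECT p.category, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.category
    ORDER BY p.category
"""

# One Big Table: the category is stored on every row, so no join is needed.
OBT_QUERY = """
    SELECT category, SUM(amount)
    FROM obt_sales
    GROUP BY category
    ORDER BY category
"""

def query_plan(conn: sqlite3.Connection, sql: str) -> list[str]:
    """Return the human-readable steps of SQLite's plan for `sql`."""
    # The last column of each EXPLAIN QUERY PLAN row is the detail text.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

conn = build_demo_db()
star_rows = conn.execute(STAR_QUERY).fetchall()
obt_rows = conn.execute(OBT_QUERY).fetchall()
star_plan = query_plan(conn, STAR_QUERY)
obt_plan = query_plan(conn, OBT_QUERY)
```

Both queries return the same totals per category, and the wide-table query is shorter to write. A toy dataset of this size says nothing about performance; it only shows how the two query shapes and their plans can be placed side by side.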

Graduation Thesis language

English

Graduation Thesis type

Master - Data Science

Supervisor(s)

Eduard Ševtšenko

Defence year

2024

