A Functional Prototype and General Architecture of Analytic Data Management for a Railway Company

Name
Mait Metelitsa
Abstract
The thesis examines data engineering and business intelligence in the context of a railway company. It addresses the challenges posed by the modernisation of EVR's analytic data management platform.
A key novel aspect of the thesis is its treatment of the transition from the AS-IS state to the TO-BE state of analytic data management in EVR. This transition involves a detailed analysis of decision dimensions and technology alternatives for the future architecture of EVR's data management system. The proposed TO-BE architecture advocates a flexible hybrid approach combining on-premises/Infrastructure as a Service (IaaS) and Software as a Service (SaaS) solutions, tailored to the type and complexity of data sources.

Furthermore, the thesis presents practical use cases based on the TO-BE architecture, showcasing the implementation of end-to-end analyses of purchase invoices and railway level crossings' log data. By integrating technologies such as ETL processes, Dagster for data orchestration, Postgres for data storage, Streamlit for data visualisation, and XML-fetching for data retrieval, the thesis demonstrates a tangible contribution towards further improving EVR's data engineering capabilities and preparing the company for the adoption of a data lakehouse SaaS platform.
Overall, the thesis contributes to the field of data management by providing a structured framework for decision-making in analytic data platform architectural design, emphasising the importance of technology choices and trade-offs in optimising analytic data management for a complex organisation like EVR.

Given the limited exploration of data engineering and business intelligence within the railway sector, this thesis fills a significant knowledge gap by providing insights and recommendations tailored to the unique requirements and challenges of managing data in a railway company undergoing analytics platform modernisation.
Graduation Thesis language
English
Graduation Thesis type
Master - Data Science
Supervisor(s)
Kristo Raun, Ahmed Awad
Defence year
2024
 