Scaling Out the Discovery of Business Process Simulation Models from Event Logs

Ihar Suvorau
Background. The automated discovery of business process simulation (BPS) models has received considerable attention in the process mining community over the past decade. The main open question in this field is how to make such discovery accurate, fast, and efficient enough to provide more value to end users.
Aim. This thesis aims to re-architect an existing tool for automated BPS model discovery, namely Simod, so that it manages varying workloads in a scalable and robust manner.
Methods. Scalability and robustness are achieved by building a distributed event-based system that integrates with the Kubernetes API. An efficiency metric is used to evaluate the scalability of the final solution. A robustness-under-load experiment shows that the re-architected system remains available under high demand.
Results. The validation experiments show that the system is scalable for small event logs and robust under high load. A limitation of the study is that the testing environment, based on kind clusters of 1, 2, 3, and 4 worker nodes, is not suitable for large-scale load testing experiments.
Conclusion. This thesis provides a framework for implementing scalable, robust, and resilient workflows on Kubernetes for BPS model discovery that can benefit the process mining community. Further work is needed to improve the Simod architecture by splitting it into smaller independent components to achieve higher scalability and better resource utilisation.
Graduation Thesis language
Graduation Thesis type
Master - Software Engineering
Marlon Dumas, David Chapela de la Campa
Defence year