Exploring SQL-based Near-realtime Conformance Checking Using Stream Processing Engines

Name
Agnes Annilo
Abstract
Conformance checking is used to validate that business processes are completed in accordance with an existing process model. Effective conformance checking using event logs has been a vital strategy in ensuring the operational integrity and compliance of a business. With the development of streaming platforms, which offer near real-time data processing, non-conformance can be identified quickly and accurately. Streaming engines such as Kafka and Spark are designed to process data in near real time; in addition, they are optimised for SQL. This thesis explores how SQL, with either Kafka and ksqlDB or Spark Structured Streaming, can be applied to conformance checking of near real-time data streams. The thesis presents a structured investigation into stateful processing of events using a combination of Spark Structured Streaming and Spark SQL, alongside Kafka and ksqlDB. The challenges of handling event streams sequentially and of implementing a stateful solution are discussed.
Graduation Thesis language
English
Graduation Thesis type
Master - Data Science
Supervisor(s)
Kristo Raun
Defence year
2024
 