Efficient Algorithms for Discovering Concept Drift in Business Processes.

Name
Jevgeni Martjušev
Abstract
Process mining is a relatively new research area, but it is already used in practice. Every company and organization runs business processes that are supported by information systems and that leave event logs as they execute. By analyzing those logs, one can build a process model that reflects how the process operates in reality. Existing algorithms assume that the analyzed process is in a steady state; however, the process may change because of seasonality, a new law, or an event such as a financial crisis. In such cases we have to deal with concept drift. Concept drifts can be sudden, where the change is abrupt, or gradual, where one concept fades gradually while the other takes over.
In this work we propose five novel approaches for detecting concept drift in process mining. All of them improve or extend the algorithm proposed by Bose et al. [1]. The step size improvement speeds up the algorithm by leaving out some intermediate steps. The automatic change point detection algorithm extracts the concept drift points without the need to analyze the plot manually. The adaptive windows algorithm (ADWIN) relaxes the original algorithm's dependency on a fixed population size, thus reducing the number of false positives and false negatives. The algorithm with non-continuous populations makes it possible to deal with gradual drifts. Finally, defining the population sizes in terms of time periods instead of the number of traces makes it possible to detect micro-level and macro-level drifts in logs with multi-order dynamics, where process changes can happen on multiple levels of granularity. The algorithms were implemented in the Concept Drift plug-in of the ProM framework.
To assess the quality of the algorithms, we propose a way to generate logs with different concept drift characteristics using CPN Tools, together with a quality evaluation framework similar to the one used in the field of information retrieval, which involves calculating true positives, false positives, false negatives, and derived metrics. The algorithms were successfully tested on both simulated and real-life data.
[1] Bose, R.P.J.C., van der Aalst, W.M.P., Žliobaitė, I., Pechenizkiy, M.: Handling Concept Drift in Process Mining. In: CAiSE. LNCS, vol. 6741, pp. 391–405. Springer, Berlin (2011)
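The information-retrieval style evaluation mentioned in the abstract can be illustrated with a minimal Python sketch. It is not taken from the thesis. The function name `evaluate_drift_detection`, the `tolerance` parameter, the nearest-match rule, and the choice of precision, recall, and F1 as the derived metrics are all assumptions made for illustration. A detected change point counts as a true positive if it lies within `tolerance` traces of a not-yet-matched actual change point.

```python
def evaluate_drift_detection(detected, actual, tolerance):
    """Score detected change points against known ones.

    All positions are trace indices. The matching rule and the
    tolerance are illustrative assumptions, not the thesis's method.
    """
    unmatched = sorted(actual)
    tp = 0
    fp = 0
    for d in sorted(detected):
        # Candidates: actual change points still unmatched and close enough.
        candidates = [a for a in unmatched if abs(a - d) <= tolerance]
        if candidates:
            # Match the nearest candidate so each actual point is used once.
            nearest = min(candidates, key=lambda a: abs(a - d))
            unmatched.remove(nearest)
            tp += 1
        else:
            fp += 1
    fn = len(unmatched)  # actual change points that were never detected

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}


# Hypothetical example: three known drifts, three detections, one of them spurious.
result = evaluate_drift_detection(
    detected=[1195, 2410, 3000],
    actual=[1200, 2400, 3600],
    tolerance=50,
)
print(result)
```

In this made-up example two detections fall within 50 traces of a real drift, one detection is spurious, and one real drift is missed, so precision and recall are both 2/3.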
Graduation Thesis language
English
Graduation Thesis type
Master - Information Technology
Supervisor(s)
R.P. Jagadeesh Chandra Bose, Fabrizio Maria Maggi
Defence year
2013
 