Fani Sani, M., van Zelst, S. J., & van der Aalst, W. M. P. (2018). Repairing outlier behaviour in event logs. In W. Abramowicz, & A. Paschke (Eds.), Business Information Systems – 21st International Conference, BIS 2018, Proceedings (pp. 115-131). (Lecture Notes in Business Information Processing; Vol. 320). Cham: Springer. https://doi.org/10.1007/978-3-319-93931-5_9
One of the main challenges in applying process mining to real event data is the presence of noise and rare behaviour. Applying process mining algorithms directly to raw event data typically results in complex, incomprehensible and, in some cases, even inaccurate analyses. As a result, correct and/or important behaviour may be concealed. In this paper, we propose an event data repair method that aims to detect and repair outlier behaviour within the given event data. The method is probabilistic and is based on the occurrence frequency of activities in specific contexts. Our approach allows for the removal of infrequent behaviour, which enables us to obtain a more global view of the process. The proposed method has been implemented in both the ProM and the RapidProM frameworks. Using these implementations, we conduct a collection of experiments showing that we are able to detect and modify most types of outlier behaviour in the event data. Our evaluation demonstrates that repairing event logs upfront helps to improve process discovery results.
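To make the idea of context-based frequency repair more concrete, the following is a minimal illustrative sketch and not the implementation available in ProM or RapidProM. It assumes that a context is simply the pair of directly neighbouring activities, that an activity is an outlier when its relative frequency within that context falls below a fixed threshold, and that the repair substitutes the most frequent activity for that context. The names `context_counts`, `repair_log`, `threshold`, and the boundary markers are all hypothetical.

```python
from collections import Counter, defaultdict

# Artificial trace boundary markers (an assumption of this sketch).
START, END = "<start>", "<end>"


def context_counts(log):
    """Count how often each activity occurs between a (left, right) neighbour pair."""
    counts = defaultdict(Counter)
    for trace in log:
        padded = [START, *trace, END]
        for i in range(1, len(padded) - 1):
            context = (padded[i - 1], padded[i + 1])
            counts[context][padded[i]] += 1
    return counts


def repair_log(log, threshold=0.2):
    """Replace activities that are rare in their context by the most frequent one.

    Contexts are always read from the original trace, so one substitution
    does not influence the decision for the neighbouring positions.
    """
    counts = context_counts(log)
    repaired = []
    for trace in log:
        padded = [START, *trace, END]
        new_trace = []
        for i in range(1, len(padded) - 1):
            in_context = counts[(padded[i - 1], padded[i + 1])]
            total = sum(in_context.values())  # at least 1: the trace itself
            activity = padded[i]
            if in_context[activity] / total < threshold:
                # Assumed repair rule: substitute the dominant activity.
                activity = in_context.most_common(1)[0][0]
            new_trace.append(activity)
        repaired.append(new_trace)
    return repaired
```

In a log with nine traces `a, b, c` and one trace `a, x, c`, the activity `x` accounts for one tenth of the occurrences between `a` and `c`, so with `threshold=0.2` this sketch rewrites the deviating trace to `a, b, c` and leaves the others unchanged. Substituting an activity rather than removing the whole trace keeps the remaining behaviour of that trace available for discovery.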