Zaman, R., Hassani, M., & van Dongen, B. F. (2023). Conformance checking of process event streams with constraints on data retention. Information Systems, 117, Article 102228. https://doi.org/10.1016/j.is.2023.102228
Abstract
Conformance checking (CC) techniques in process mining assess how well cases, represented by their event sequences, conform to a business process model. Online conformance checking (OCC) techniques perform this analysis for cases in event streams. In a stream, a case may by nature never be known to have concluded. OCC techniques therefore usually ignore memory limits and store every observed case, whether or not it appears to have concluded. Such indefinite storage of cases is inconsistent with the spirit of privacy regulations, such as GDPR, which advocate retaining minimal data for a definite period of time. To address these constraints, we propose two classes of novel approaches that partially or fully forget cases but can still properly estimate the conformance of their future events. All our proposed approaches bound the number of cases in memory and forget those in excess of the defined limit on the basis of prudent forgetting criteria. One class of approaches retains a meaningful summary of the forgotten events in order to resume the CC of their cases in the future, while the other class leverages classification for this purpose. Through experiments on real-life as well as synthetic event data in a streaming setting, we show the effectiveness of all our proposed approaches compared to a state-of-the-art OCC technique that lacks any forgetting mechanism. Our approaches substantially reduce the amount of data that must be retained while minimally impacting the accuracy of the conformance statistics.
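The idea of bounding the number of cases in memory and keeping a summary of what was forgotten can be illustrated with a small sketch. This is not the authors' implementation: the class name `BoundedCaseStore`, the toy model `ALLOWED`, the least-recently-updated forgetting criterion, and the summary format (last activity plus accumulated deviation count) are all assumptions made for illustration. Conformance is also reduced here to a simple directly-follows check rather than the technique evaluated in the paper.

```python
from collections import OrderedDict

# Hypothetical toy model: allowed directly-follows pairs; None marks a case start.
ALLOWED = {(None, "a"), ("a", "b"), ("b", "c")}


class BoundedCaseStore:
    """Keeps at most max_cases cases in memory (illustrative sketch only)."""

    def __init__(self, max_cases):
        self.max_cases = max_cases
        self.cases = OrderedDict()  # case_id -> full in-memory state
        self.summaries = {}         # case_id -> (last_activity, deviation_count)

    def observe(self, case_id, activity):
        """Process one event and return the case's deviation count so far."""
        if case_id in self.cases:
            state = self.cases.pop(case_id)
        elif case_id in self.summaries:
            # Resume a forgotten case from its summary; its events are gone.
            last, cost = self.summaries.pop(case_id)
            state = {"events": [], "last": last, "cost": cost}
        else:
            state = {"events": [], "last": None, "cost": 0}

        # Assumed conformance measure: count moves the toy model does not allow.
        if (state["last"], activity) not in ALLOWED:
            state["cost"] += 1
        state["events"].append(activity)
        state["last"] = activity
        self.cases[case_id] = state  # re-inserted as the most recently updated

        # Assumed forgetting criterion: evict the least recently updated case.
        while len(self.cases) > self.max_cases:
            old_id, old = self.cases.popitem(last=False)
            self.summaries[old_id] = (old["last"], old["cost"])
        return state["cost"]


store = BoundedCaseStore(max_cases=2)
stream = [("c1", "a"), ("c2", "a"), ("c3", "b"), ("c1", "b"), ("c2", "c")]
costs = [store.observe(case_id, activity) for case_id, activity in stream]
print(costs)
```

In this sketch, case `c1` is evicted when `c3` arrives, yet its next event `b` is still judged conforming because the summary remembers that the last activity was `a`. Only the stored events are lost, which is the trade-off the summary-based class of approaches aims for.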