Have you ever analyzed event data and wondered whether the steps you take affect the results you get? Or whether the results you obtain match what you expected to find?
These are common challenges faced by process mining analysts, who analyze large sets of event data to gain insights into how business processes are executed. Process mining involves many steps, such as filtering events or abstracting data, and each of these steps can be performed in different ways, depending on the data and the specific goals of the analysis. The choices analysts make during these steps can have a significant impact on the results, but understanding these effects is not always straightforward.
Problem
The complexity increases in exploratory settings, where analysts adjust their methods as they learn more about the data. As the analysis progresses, it becomes difficult to track how individual decisions—such as data cleaning or abstraction—affect the results, and whether those results align with the initial goals of the analysis.
For example, imagine you are a process analyst working with an event log. You decide to merge duplicate events of the same type that occur within a single case. While this decision simplifies the dataset, it could prevent you from identifying repeating patterns or detecting loops in the process model later. What seemed like a practical decision early on now limits your ability to capture certain behaviors in the data.
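The effect described in this example can be reproduced on a toy log. The following is a minimal sketch in plain Python, not a real process mining pipeline: the event log, the activity names, and the helper functions `traces`, `merge_duplicates`, and `repeated_activities` are all hypothetical and chosen only for illustration. It assumes that "merging duplicates" means keeping the first event of each activity type within a case.

```python
from collections import defaultdict

# Hypothetical toy event log: (case_id, activity), already ordered by time.
log = [
    ("c1", "Submit"), ("c1", "Review"), ("c1", "Rework"),
    ("c1", "Review"), ("c1", "Approve"),
    ("c2", "Submit"), ("c2", "Review"), ("c2", "Approve"),
]

def traces(events):
    """Group activities by case, preserving their order."""
    grouped = defaultdict(list)
    for case_id, activity in events:
        grouped[case_id].append(activity)
    return dict(grouped)

def merge_duplicates(events):
    """Assumed merge rule: keep only the first event of each activity type per case."""
    seen = set()
    kept = []
    for case_id, activity in events:
        if (case_id, activity) not in seen:
            seen.add((case_id, activity))
            kept.append((case_id, activity))
    return kept

def repeated_activities(events):
    """Return, per case, the activities that occur more than once (a simple loop indicator)."""
    result = {}
    for case_id, trace in traces(events).items():
        repeats = sorted({a for a in trace if trace.count(a) > 1})
        if repeats:
            result[case_id] = repeats
    return result

before = repeated_activities(log)
after = repeated_activities(merge_duplicates(log))
print(before)  # {'c1': ['Review']}
print(after)   # {}
```

In the original log, case `c1` shows a Review, Rework, Review repetition. After merging, the second Review is gone and no repetition remains to be found, even though the rework loop happened.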
This scenario highlights a key challenge: process analysts often lack support for understanding how the decisions they make at early stages of the analysis affect the final results.
This leads us to the following research question:
How can we help process analysts make their own analysis process more explainable?
Goal
The primary goal of this project is to develop methods, algorithms, and tools that assist analysts in tracing their analysis steps, linking them to the results, and understanding how each decision influences the final outcomes. By making the analysis process more explainable, analysts will be able to make more informed choices and better interpret their results.
Answering this research question involves designing methods or tools that:
- Assist analysts in tracking and organizing their analysis steps and linking them to the input data they used for their analysis.
- Make it easier to compare different results of their analysis and understand why they differ.
- Enable analysts to identify the key features of the event data that contributed to a certain result.
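To make the first two points more concrete, here is a minimal sketch of what step tracking could look like. It is only one possible design under stated assumptions, not the method this project will develop: the class `TrackedAnalysis`, the helpers `fingerprint`, `drop_activity`, `keep_cases`, and `first_divergence`, and the toy log are all hypothetical. Each step is recorded together with a hash of its input and output data, so two analysis runs can be compared step by step.

```python
import hashlib
import json

def fingerprint(events):
    """Short, stable hash of an event list, used to link a step to its data."""
    payload = json.dumps(events, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]

class TrackedAnalysis:
    """Hypothetical wrapper that records every step applied to an event log."""

    def __init__(self, events):
        self.events = list(events)
        self.steps = []

    def apply(self, name, func, **params):
        before = fingerprint(self.events)
        self.events = func(self.events, **params)
        self.steps.append({
            "step": name,
            "params": params,
            "input": before,
            "output": fingerprint(self.events),
            "n_events": len(self.events),
        })
        return self

def drop_activity(events, activity):
    """Filter step: remove all events of one activity type."""
    return [e for e in events if e[1] != activity]

def keep_cases(events, cases):
    """Filter step: keep only events of the given cases."""
    return [e for e in events if e[0] in cases]

def first_divergence(run_a, run_b):
    """Index of the first step where two runs differ, or None if they are identical."""
    for i, (a, b) in enumerate(zip(run_a.steps, run_b.steps)):
        if (a["step"], a["params"], a["output"]) != (b["step"], b["params"], b["output"]):
            return i
    if len(run_a.steps) != len(run_b.steps):
        return min(len(run_a.steps), len(run_b.steps))
    return None

# Hypothetical toy log: (case_id, activity).
log = [("c1", "Submit"), ("c1", "Review"), ("c2", "Submit"), ("c2", "Cancel")]

run_a = (TrackedAnalysis(log)
         .apply("keep_cases", keep_cases, cases=["c1", "c2"])
         .apply("drop_activity", drop_activity, activity="Cancel"))
run_b = (TrackedAnalysis(log)
         .apply("keep_cases", keep_cases, cases=["c1", "c2"])
         .apply("drop_activity", drop_activity, activity="Review"))

print(first_divergence(run_a, run_b))  # 1
```

Both runs share the same first step and the same input fingerprint, so the difference in their results can be traced to the second step, where different activities were filtered out. Identifying which features of the event data drive a result, the third point, is not covered by this sketch.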
Contact
Francesca Zerbato, f.zerbato@tue.nl
Interesting Reads
Zerbato, F., Burattin, A., Völzer, H., Becker, P. N., Boscaini, E., & Weber, B. (2023, June). Supporting provenance and data awareness in exploratory process mining. In International Conference on Advanced Information Systems Engineering (pp. 454-470). Cham: Springer Nature Switzerland. https://link.springer.com/chapter/10.1007/978-3-031-34560-9_27
Zerbato, F., Franceschetti, M., & Weber, B. (2024, August). A Framework to Support the Validation of Process Mining Inquiries. In International Conference on Business Process Management (pp. 249-266). Cham: Springer Nature Switzerland. https://link.springer.com/chapter/10.1007/978-3-031-70418-5_15
Klinkmüller, C., Seeliger, A., Müller, R., Pufahl, L., & Weber, I. (2021, August). A method for debugging process discovery pipelines to analyze the consistency of model properties. In International Conference on Business Process Management (pp. 65-84). Cham: Springer International Publishing. https://link.springer.com/chapter/10.1007/978-3-030-85469-0_7
Sellam, T., & Kersten, M. (2016). Ziggy: Characterizing query results for data explorers. Proceedings of the VLDB Endowment, 9(13), 1473-1476. https://dl.acm.org/doi/10.14778/3007263.3007287