Understanding and predicting the behavior of people and machines in a shared setting (task, project, factory, process, organization) is central to Data Science and Artificial Intelligence. Actions of people and machines can be recorded as discrete events in event sequences (logs), event databases (tables, graphs), and real-time event streams. Learning behavioral models from discrete event data of human behavior is challenging: only events that are causally related may be analyzed together. Further, the analysis results must be fully explainable and interpretable by humans, so that users can evaluate, understand, communicate, and improve the model and make correct decisions in concrete situations.
This advanced course on process mining teaches students the fundamental concepts and theoretical foundations of process mining along a complete process mining methodology, and exposes students to real-life data sets to understand challenges related to process discovery, conformance checking, and model extension. The course covers the following topics:
- Advanced data formats and fundamental filtering and pre-processing operations of discrete event data. This includes understanding and pre-processing unstructured event data, understanding and querying multi-dimensional event data (event graph databases), and principles of extracting features from large real-time event streams (windowing and ageing concepts, stream clustering).
- How to design model learning algorithms (process discovery algorithms) to discover explainable models with quality guarantees from discrete event data for unstructured behavior, structured behavior, and real-time event streams. This specifically addresses identifying causality in event data using behavioral pattern and constraint mining from unstructured sequences, and using graph structure analysis from aggregations of event data (with and without recursion) and from prefix-trees of event streams.
- How to critically evaluate behavioral models (both explainable models and black-box machine learning and prediction models) regarding their behavioral accuracy and their explainability (generalizability and understandability) using detailed diagnostic information down to the level of individual events. We specifically address techniques for unstructured behavior, structured behavior, and event streams, and how to use the diagnostic information to improve model quality and draw actionable conclusions.
All concepts will be discussed and illustrated on concrete cases and event datasets from a variety of domains, such as hospitals, high-tech systems, logistics systems, insurance companies, and governments.
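To give a first impression of what "aggregations of event data" can look like, here is a minimal Python sketch. It is not taken from the course material: the tiny event log is invented for illustration, and the chosen aggregation (counting which activity directly follows which within a case, often called a directly-follows graph) is only one common example that the description above does not name explicitly.

```python
from collections import Counter, defaultdict

# Hypothetical event log: (case id, activity, timestamp) records.
# The records are deliberately not sorted, as is common in raw event data.
events = [
    ("c1", "register", 1),
    ("c1", "check", 2),
    ("c1", "decide", 3),
    ("c2", "register", 1),
    ("c2", "decide", 4),
    ("c2", "check", 2),
]


def directly_follows(events):
    """Count how often activity a is directly followed by b within one case."""
    # Group events by case so that only related events are analyzed together.
    traces = defaultdict(list)
    for case, activity, timestamp in events:
        traces[case].append((timestamp, activity))

    counts = Counter()
    for steps in traces.values():
        # Order each case by timestamp, then count adjacent activity pairs.
        ordered = [activity for _, activity in sorted(steps)]
        counts.update(zip(ordered, ordered[1:]))
    return counts


dfg = directly_follows(events)
for (a, b), n in sorted(dfg.items()):
    print(f"{a} -> {b}: {n}")
```

Grouping by case before counting reflects the point made in the introduction that only related events may be analyzed together; the resulting edge counts are the kind of graph structure a discovery algorithm could take as input.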
The course is taught as a flipped classroom course with a group assignment:
- Each topic has a concrete, hands-on reading assignment on the social reading platform www.perusall.com where you can annotate text passages and discuss with fellow students and lecturers the parts you find difficult to understand.
- For each topic, we provide practical exercises: paper-based exercises on the level of the final exam, and tool-based exercises to try out and understand the techniques.
- During the online class (video meetings), we specifically address your questions about concepts and ideas you found difficult to understand from reading. We solve example assignments and explain the concepts interactively.
- In the group assignment, you analyze a real-life dataset using a structured data science analysis methodology and the techniques taught in the course to discover, evaluate, and improve explainable models and to draw actionable conclusions.
The course replaces 2IMI20 Advanced Process Mining. In comparison, it focuses more on advanced topics and ongoing research.
Students must have passed Introduction to Process Mining (2IMI35) or Foundations of Process Mining (2AMI10) to participate.
- Dirk Fahland - Dirk is Associate Professor (UHD) in the PA group. He completed his PhD summa cum laude at Humboldt-Universität zu Berlin and Eindhoven University of Technology in 2010. His research interests include distributed processes and systems built from distributed components for which he investigates modeling systems (using process modeling languages, Petri nets, or scenario-based techniques), ...
- Marwan Hassani - Dr. Marwan Hassani is assistant professor at the PA group with a focus on Real-Time Process Mining. His research interests include stream data mining, sequential pattern mining of multiple streams, efficient anytime clustering of big data streams and exploration of evolving graph data. He uses customer journey optimization and privacy-aware process mining as use cases for his ...
- Renata Medeiros de Carvalho - Position: UD. Room: MF 7.067. Tel (internal): 4144.