Assignments

The PA group conducts research, development, and experiments in the area of Process Analytics and Process Mining. You can join these endeavors by developing your own Process Analytics or Process Mining solution. We offer challenging assignments on industrially relevant topics, personal guidance in developing your own cutting-edge solution in concept and in realization, and evaluation and experiments on real-life data.

The following assignments are a selection of possible topics and are available at all times. If you have a question about a particular topic, please contact the lecturer offering the topic. If you have ideas for your own topic, please contact Dr. Dirk Fahland.

Interpreting workflow deviations for real-life case studies

Posted on October 2, 2023. Author: Eric Verbeek.

Process models are used to describe and reason about the execution of a process (e.g., package delivery) where a process instance (a package), also called a case, moves through the system. A case in a process is often subject to interaction with other cases and/or resources (e.g., a deliverer), which impacts the workflow. Event logs record which Read More …

Privacy Guarantees in Process Discovery

Posted on September 29, 2022. Author: Eric Verbeek.

Responsible Process Mining as a topic is introduced in [1]: “The prospect of data misuses negatively affecting our life has led to the concept of responsible data science. It advocates for responsibility to be built, by design, into data management, data analysis, and algorithmic decision-making techniques such that it is made difficult or even impossible Read More …

Closing the process mining circle with CPN IDE

Posted on September 3, 2021. Author: Eric Verbeek.

CPN Tools is a tool that is well-known in the Petri net community. CPN Tools provides a mature environment for constructing, simulating, and performing analysis of CPN (Coloured Petri Net) models. CPN Tools consists of an ML-based CPN simulator (the back-end), and a CPN editor (the front-end) that has been developed in the BETA programming Read More …

Online Spatial Prediction Model for Citizens’ Public Space Complaints in Eindhoven

Posted on October 13, 2020. Author: Marwan Hassani.

The smart-city approach not only emphasizes the implementation of new technologies in a city but also highlights the importance of using new technologies to enable citizens’ engagement in urban planning processes. In that regard, ICTs play a vital role in (i) supporting citizens to report their complaints related to the public spaces (i.e. Read More …

Develop a Behavioral Event Data Query Language

Posted on October 12, 2020. Author: Eric Verbeek.

Query languages are essential for exploring and working with data and for directly answering questions from data. SQL is the prime example for answering questions on relational data. Behavioral data is recorded in the form of events with timestamps. Various techniques such as Process Mining use the data in the form of event logs to aggregate and Read More …

Log-based vs. Model-based Concept Drift Detection

Posted on December 26, 2019 (updated February 4, 2020). Author: Marwan Hassani.

StrProMCDD is a recently published work that detects concept drifts in event streams (see the figure below). StrProMCDD uses several model-based distance measures to detect these deviations using an adaptive window concept. In this assignment, we would like to compare the performance of this model-based approach with log-based stream clustering approaches that try to detect drifts in Read More …

Process Discovery using Generative Adversarial Neural Networks

Posted on November 4, 2019 (updated February 4, 2020). Author: Eric Verbeek.

Process Discovery is an unsupervised learning problem with the task of discovering a graph-based model from sequences (or graphs) of event data that describes the data best. Generative Adversarial Neural Networks (GANNs) are a type of neural network used to learn structures in an unsupervised fashion. The objective of this project is to explore the Read More …

Process Mining on Event Graph Databases (multiple projects)

Posted on November 4, 2019 (updated February 4, 2020). Author: Eric Verbeek.

Process mining assumes event data to be stored in an event log, which is technically either a relational table (attributes as columns) or a stream of events (attribute-value pairs). Recently, we developed a new technique to store event data in a graph database such as Neo4j. This allows us to do process mining over various Read More …

Process Mining with Textual Data

Posted on November 4, 2019 (updated February 4, 2020). Author: Eric Verbeek.

In many application domains, a process execution is captured using natural language. Think of medical records, customer complaints, legal records… The same holds for process models: they can be captured as text. Medical guidelines, user manuals, and legal regulations are typical examples of such cases. Such data forms a new challenge for the process mining Read More …

Real-Time Process Mining for Customer Journey Data

Posted on December 12, 2019 (updated February 4, 2020). Author: Eric Verbeek.

Available process discovery approaches have been tested in the customer journey context under offline settings. However, recent online process discovery approaches, such as https://ieeexplore.ieee.org/document/7376771, bring a lot of added value for real-time customer journey optimization. The objective of this assignment is to use two different customer journey datasets to test the effectiveness of such approaches for Read More …

Finding Patterns in Evolving Graphs

Posted on December 12, 2019 (updated February 4, 2020). Author: Eric Verbeek.

The analysis of the temporal evolution of dynamic graphs like social networks is a key challenge for understanding complex processes hidden in graph structured data. Graph evolution rules capture such processes on the level of small subgraphs by describing frequently occurring structural changes within a network. Existing rule discovery methods make restrictive assumptions on the Read More …

Using Sequential Pattern Mining to Detect Drifts in Streaming Data

Posted on December 12, 2019 (updated December 13, 2019). Author: Eric Verbeek.

BFSPMiner, an effective and efficient batch-free algorithm for mining sequential patterns over data streams, was published very recently (https://link.springer.com/article/10.1007/s41060-017-0084-8). An implementation of the algorithm is available here: https://github.com/Xsea/BFSPMiner. As BFSPMiner has proven to be effective (see Figures 10-14 of the paper) in different domains (see Table 1 in the paper), we would like to Read More …

Efficient unsupervised event context detection

Posted on November 4, 2019 (updated February 4, 2020). Author: Eric Verbeek.

for event log clustering, outlier detection, and pre-processing. We recently developed a technique to detect the context of events from an event log in an efficient way through sub-graph matching. This allows us to identify events and parts of event logs which are similar to or different from each other, making it possible to cluster traces, detect outliers, and Read More …

Smart event log pre-processing

Posted on November 4, 2019 (updated February 4, 2020). Author: Eric Verbeek.

The quality of process mining results highly depends on the quality of the input data where noise, infrequent behaviors, log incompleteness or many different variants undercut the assumptions of process discovery algorithms, and lead to low-quality results. ProM provides numerous event log pre-processing and filtering options, but they require expert knowledge to understand when which Read More …

Log Data Anonymization

Posted on November 4, 2019 (updated February 4, 2020). Author: Eric Verbeek.

In the context of process mining, we are often confronted with companies willing to share their data if we can sufficiently anonymize it. However, to date, there are no well-defined plugins to do such anonymization. Therefore, we are looking for a Master's student who is willing to help us with this. Part of the project Read More …

Adding heuristics to the Block Layout

Posted on November 4, 2019 (updated February 4, 2020). Author: Eric Verbeek.

The Block Layout can be used to create a layout for a process graph. For this, it uses well-known Petri-net-based reduction rules to reduce the entire net into a single place. For nicely structured process graphs, this layout works quite well, but for more complex structured graphs, the resulting layout needs to be improved. Either Read More …

N-out-of-M patterns in alignments

Posted on November 4, 2019 (updated February 4, 2020). Author: Eric Verbeek.

Aligning structured process models to event logs is a far from trivial task. In complex modelling languages, inclusive OR-split/join patterns play an important role and they are known to be notoriously difficult to align to event logs due to their large state-spaces. The known Petri net translations of OR-joins rely either on token coloring or Read More …

Generating non block-structured models and corresponding logs

Posted on November 4, 2019 (updated February 4, 2020). Author: Eric Verbeek.

For experimenting with process discovery and Petri nets, scientists often rely on experiments with artificial models and logs. More often than not, these models are block structured as it is easy to generate such models by simply building a random process tree and translating that into a Petri net. However, Petri nets allow for more Read More …

Petri net reduction rules for replay

Posted on November 4, 2019 (updated February 4, 2020). Author: Eric Verbeek.

Replaying event logs on Petri nets, either through token-replay or using alignments, is a complex task. Especially when models become larger and have more labels, the size of the models becomes a problem. In Petri net theory, many reduction rules exist for reducing Petri nets while retaining, for example, soundness of the model. Can we Read More …