Advanced Process Mining Techniques in Practice (several Master projects with ProcessGold)

ProcessGold is a software supplier bringing together Process Mining and Business Intelligence, driven by highly skilled ICT entrepreneurs and backed by a wealth of experience. ProcessGold recently released a new Process Mining platform, the ProcessGold Enterprise Platform, that combines data extraction, process mining techniques, and visual analytics in order to produce dynamic visual reports which are easy to monitor and analyze for process stakeholders. These reports form the basis for deeper, fact-driven analysis and continuous process improvement projects.

In this context, ProcessGold regularly offers graduation internships to investigate new techniques and methodologies in Process Mining and their application in a business context. A few example topics are given below; the specific graduation project and its scope will be further developed in mutual agreement.

Possible Graduation Project Topics

  • Enhancing the Inductive Miner. The inductive miner is a process mining algorithm that can inductively discover structures, such as choice or parallelism, from event logs. We would like to investigate how we can integrate the inductive miner into ProcessGold, and where, if possible, we can improve on the algorithm to suit our practical needs.
  • Conformance checking using BPMN. Business Process Model and Notation (BPMN) can be used to model desired or expected process behavior. We would like to investigate how we can import BPMN models into ProcessGold and how to integrate conformance checking into our platform using BPMN models.
  • Prediction/simulation of throughput times. We would like to investigate how we can predict the throughput times in a process based on its mined model using predictors or simulation.
  • Process mining with user access rights. In some organizations, not every analyst is allowed to see all details of the organization’s process, and some analysts may not be allowed to see certain parts of the process at all. We would like to investigate how we can take these access rights into account while still providing the user with a process that can lead to meaningful insights.
  • Multi-instanced processes. Many processes, such as the manufacturing of a car, have hierarchical cases that split up into multiple sub-cases that merge again later. Typically, these sub-cases are independent of each other, which shows up as parallelism in traditional process mining. We would like to investigate how to generate insights into these kinds of processes. We will look at a pathology use case where the cases consist of a hierarchy of sub-processes.
  • Social analysis. Social interaction is an important factor within processes where people are working together. A common approach to gaining insights is to construct a social network. We would like to investigate other approaches to analyze the social interactions that happen within a process.
  • Process flows on maps. Some processes can be expressed as a flow, where goods, such as packages, cargo, or money, or physical objects, such as cars or vessels, flow between predefined geographic locations. We would like to investigate how we can interactively visualize these flows on a map to enable our users to explore, understand, and find anomalies in them.
  • Interactive grouping of processes. Process data often consists of multiple sub-processes or groups of cases that exhibit similar behavior. Displaying and analyzing all these cases as a single process model may be difficult and confusing. Therefore, we would like to investigate how we can let the user interactively separate the cases of these processes into meaningful groups that can be explored separately.
  • Distributed calculation of expressions. The ProcessGold platform owes much of its flexibility to an internal expression language. This expression language is used to compute a wide variety of expressions on a very large number of records. Currently, these expressions are computed on a single thread and may form a bottleneck. We would like to investigate how we can distribute the calculation of these expressions over multiple threads to increase overall performance.
  • Performance. As the scale of the data of our customers grows, the sheer number of calculations that need to be performed on the data grows as well. To increase overall performance, we would like to investigate strategies such as flattening the data model, re-sorting data, and decreasing indirections.
  • Evaluation of moving calculations to a SQL or MapReduce backend. As the scale of the data of our customers grows, the sheer number of calculations that need to be performed on the data grows as well. As a strategy to increase performance, we would like to investigate the possibilities and impact of offloading these calculations to a SQL or MapReduce backend.
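
As a rough illustration of the last topic, the sketch below offloads one simple calculation, the throughput time per case, to a SQL backend instead of computing it record by record in application code. It is not based on the ProcessGold platform or its expression language: the `events` table, its columns (`case_id`, `activity`, `ts`), the function name `throughput_times`, and the use of an in-memory SQLite database are all assumptions made for the example.

```python
import sqlite3


def throughput_times(events):
    """Return {case_id: seconds between first and last event}.

    `events` is an iterable of (case_id, activity, timestamp_in_seconds)
    tuples. The table layout is hypothetical, chosen only for this sketch.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE events (case_id TEXT, activity TEXT, ts INTEGER)"
        )
        conn.executemany("INSERT INTO events VALUES (?, ?, ?)", events)
        # The aggregation runs inside the database engine, not in Python.
        rows = conn.execute(
            "SELECT case_id, MAX(ts) - MIN(ts) "
            "FROM events GROUP BY case_id ORDER BY case_id"
        ).fetchall()
    finally:
        conn.close()
    return dict(rows)


if __name__ == "__main__":
    log = [
        ("c1", "register", 0),
        ("c1", "approve", 3600),
        ("c2", "register", 100),
        ("c2", "reject", 400),
    ]
    print(throughput_times(log))
```

In a real evaluation, the interesting questions are which expressions can be translated into such queries at all and what the data transfer to and from the backend costs; this sketch addresses neither.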

In all projects, the intern should be able to work out the problem definition (in collaboration with ProcessGold and the supervisor), come up with a conceptual solution, and, where applicable, realize the solution in a proof-of-concept (in collaboration with ProcessGold).

Contact

For more information, please contact Dr. Dirk Fahland
