A model-based framework to automatically generate semi-real data for evaluating data analysis techniques

Li, G., de Carvalho, R. M., & van der Aalst, W. M. P. (2019). A model-based framework to automatically generate semi-real data for evaluating data analysis techniques. In J. Filipe, A. Brodsky, M. Smialek, & S. Hammoudi (Eds.), ICEIS 2019 – Proceedings of the 21st International Conference on Enterprise Information Systems (pp. 213-220). SCITEPRESS-Science and Technology Publications, Lda. DOI: 10.5220/0007713702130220

Abstract

As data analysis techniques progress, the focus shifts from simple tabular data to more complex data at the level of business objects. As a result, evaluating such techniques is far from trivial. Moreover, because of confidentiality constraints, most researchers have difficulty obtaining real data to evaluate their techniques. One alternative is to use synthetic data instead of real data, but this leads to unconvincing results. In this paper, we propose a framework that automatically operates information systems (supporting operational processes) to generate semi-real data (i.e., “operations related data” exclusive of images, sound, video, etc.). These data have the same structure as the real data and are more realistic than traditional simulated data. A plugin is implemented to realize the framework for automatic data generation.
