Andersen, E., Chiarandini, M., Hassani, M., Janicke, S., Tampakis, P., & Zimek, A. (2022). Evaluation of Probability Distribution Distance Metrics in Traffic Flow Outlier Detection. In Proceedings – 2022 23rd IEEE International Conference on Mobile Data Management, MDM 2022 (pp. 64-69). Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/MDM55031.2022.00030
Recent approaches have demonstrated the effectiveness of local outlier factor-based outlier detection when applied to traffic flow probability distributions. However, these approaches used distance metrics based on the Bhattacharyya coefficient to calculate the similarity of probability distributions, and the limited expressiveness of that coefficient restricted their accuracy. The crucial deficiency of the Bhattacharyya distance metric is its inability to compare distributions with non-overlapping sample spaces over the domain of natural numbers: the coefficient is zero for every such pair, no matter how far apart the distributions lie. Traffic flow intensity varies greatly, which results in numerous non-overlapping sample spaces and renders metrics based on the Bhattacharyya coefficient inappropriate. In this work, we address this issue by exploring alternative distance metrics and showing their applicability on a massive real-life traffic flow data set from 26 vital intersections in The Hague. The results on these data, collected from 272 sensors over more than two years, show various advantages of the Earth Mover’s distance in both effectiveness and efficiency.
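The deficiency described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the dictionary representation of a distribution, and the three toy traffic-count distributions are all assumptions made for illustration. The Bhattacharyya distance is taken here in its common form, the negative logarithm of the Bhattacharyya coefficient, and the Earth Mover's distance is computed for the one-dimensional case as the accumulated gap between the two cumulative distributions.

```python
import math


def bhattacharyya_distance(p, q):
    """Negative log of the Bhattacharyya coefficient.

    p and q are dicts mapping natural-number outcomes (e.g. vehicle
    counts) to probabilities. Outcomes missing from a dict have
    probability zero, so only shared outcomes contribute.
    """
    coefficient = sum(math.sqrt(p[k] * q[k]) for k in p.keys() & q.keys())
    return math.inf if coefficient == 0 else -math.log(coefficient)


def earth_movers_distance(p, q):
    """1-D Earth Mover's distance over a numeric support.

    Accumulates |CDF_p - CDF_q| across each gap between consecutive
    support points, weighted by the width of the gap.
    """
    support = sorted(p.keys() | q.keys())
    cdf_gap = 0.0
    total = 0.0
    for left, right in zip(support, support[1:]):
        cdf_gap += p.get(left, 0.0) - q.get(left, 0.0)
        total += abs(cdf_gap) * (right - left)
    return total


# Hypothetical vehicle-count distributions with disjoint supports.
low = {2: 0.5, 3: 0.5}
medium = {10: 0.5, 11: 0.5}
high = {50: 0.5, 51: 0.5}

# The Bhattacharyya distance cannot tell "near" from "far".
bd_near = bhattacharyya_distance(low, medium)
bd_far = bhattacharyya_distance(low, high)

# The Earth Mover's distance grows with the shift between distributions.
emd_near = earth_movers_distance(low, medium)
emd_far = earth_movers_distance(low, high)

print(bd_near, bd_far)    # both infinite
print(emd_near, emd_far)  # 8.0 48.0
```

In this toy setting the Bhattacharyya distance is infinite for both pairs, so a neighbourhood-based detector such as the local outlier factor would receive no usable ordering among them, whereas the Earth Mover's distance ranks the medium-intensity distribution as closer to the low-intensity one than the high-intensity distribution is.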