A Distance Measure for Privacy-preserving Process Mining based on Feature Learning
Fabian Rösel, Stephan A. Fahrenkrog-Petersen, Han van der Aa, Matthias Weidlich

TL;DR
This paper introduces a distance measure for privacy-preserving process mining that uses learned event embeddings to measure trace similarity, yielding anonymized logs with higher utility for analysis.
Contribution
It proposes a new distance measure based on event embeddings to improve anonymization quality by considering activity semantics.
Findings
Embeddings enable meaningful trace distance measurement.
The proposed method preserves more log utility.
Experiments show improved anonymization outcomes.
Abstract
To enable process analysis based on an event log without compromising the privacy of individuals involved in process execution, a log may be anonymized. Such anonymization strives to transform a log so that it satisfies provable privacy guarantees, while largely maintaining its utility for process analysis. Existing techniques perform anonymization using simple, syntactic measures to identify suitable transformation operations. This way, the semantics of the activities referenced by the events in a trace are neglected, potentially leading to transformations in which events of unrelated activities are merged. To avoid this and incorporate the semantics of activities during anonymization, we propose to instead incorporate a distance measure based on feature learning. Specifically, we show how embeddings of events enable the definition of a distance measure for traces to guide event log…
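To make the idea of an embedding-based trace distance concrete, here is a minimal Python sketch. It is not the measure defined in the paper. It assumes that each activity already has a learned embedding vector, and it swaps in a generic technique: an edit distance whose substitution cost is the cosine distance between activity embeddings. The activity names, the two-dimensional vectors in `EMBEDDINGS`, the function names, and the `indel_cost` parameter are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical 2-d activity embeddings (assumption: in practice these
# would be learned from the event log, not written by hand).
EMBEDDINGS = {
    "register": (1.0, 0.0),
    "enroll": (0.9, 0.1),
    "pay invoice": (0.0, 1.0),
}


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def trace_distance(t1, t2, emb, indel_cost=1.0):
    """Edit distance between two traces (lists of activity names).

    Substituting one activity for another costs their embedding distance,
    so semantically close activities are cheap to align. Inserting or
    deleting an event costs `indel_cost` (an assumed constant).
    """
    m, n = len(t1), len(t2)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = i * indel_cost
    for j in range(1, n + 1):
        d[0][j] = j * indel_cost
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = cosine_distance(emb[t1[i - 1]], emb[t2[j - 1]])
            d[i][j] = min(
                d[i - 1][j] + indel_cost,   # delete an event from t1
                d[i][j - 1] + indel_cost,   # insert an event from t2
                d[i - 1][j - 1] + sub,      # align two events
            )
    return d[m][n]


base = ["register", "pay invoice"]
similar = ["enroll", "pay invoice"]         # differs by a related activity
unrelated = ["pay invoice", "pay invoice"]  # differs by an unrelated activity
near = trace_distance(base, similar, EMBEDDINGS)
far = trace_distance(base, unrelated, EMBEDDINGS)
```

Under these toy embeddings, `near` is smaller than `far`, whereas a purely syntactic edit distance would rate both pairs as one substitution apart. That difference is what would let an anonymization step prefer merging traces whose events refer to related activities.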
