String Sanitization Under Edit Distance: Improved and Generalized
Takuya Mieno, Solon P. Pissis, Leen Stougie, Michelle Sweering

TL;DR
This paper introduces faster algorithms for the ETFS problem (string sanitization under edit distance), which asks for a privacy-preserving transformation of a string at minimal edit distance from the original, and extends them to a more general setting in which the sensitive substrings may have arbitrary lengths.
Contribution
The paper presents faster algorithms for ETFS and its generalization AETFS, achieving near-optimal time complexity under the Strong Exponential Time Hypothesis (SETH).
Findings
New $\tilde{O}(n^2)$-time algorithms for ETFS and AETFS.
Algorithms are optimal up to polylogarithmic factors assuming SETH.
Techniques may be applicable to problems involving regular expressions or context-free grammars.
Abstract
Let $W$ be a string of length $n$ over an alphabet $\Sigma$, $k$ be a positive integer, and $\mathcal{S}$ be a set of length-$k$ substrings of $W$. The ETFS problem asks us to construct a string $X_{\mathrm{ED}}$ such that: (i) no string of $\mathcal{S}$ occurs in $X_{\mathrm{ED}}$; (ii) the order of all other length-$k$ substrings over $\Sigma$ (and thus the frequency) is the same in $W$ and in $X_{\mathrm{ED}}$; and (iii) $X_{\mathrm{ED}}$ has minimal edit distance to $W$. When $W$ represents an individual's data and $\mathcal{S}$ represents a set of confidential patterns, the ETFS problem asks for transforming $W$ to preserve its privacy and its utility [Bernardini et al., ECML PKDD 2019]. ETFS can be solved in $O(n^2 k)$ time [Bernardini et al., CPM 2020]. The same paper shows that ETFS cannot be solved in $O(n^{2-\delta})$ time, for any $\delta > 0$, unless the…
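To make the three conditions concrete, here is a minimal Python sketch that checks conditions (i) and (ii) for a candidate output string and measures its edit distance to the input. It is only a verifier, not the paper's algorithm. It assumes that the output may contain a separator letter `#` outside the alphabet and that length-$k$ substrings containing `#` are ignored in condition (ii), a convention taken from the earlier work the abstract cites rather than from the abstract itself. The function names `kmers`, `satisfies_constraints`, and `edit_distance` are hypothetical, and the distance is computed with the textbook quadratic dynamic program.

```python
def kmers(s, k):
    """Return all length-k substrings of s, in left-to-right order."""
    return [s[i:i + k] for i in range(len(s) - k + 1)]


def satisfies_constraints(w, x, k, sensitive, sep="#"):
    """Check conditions (i) and (ii) for a candidate string x.

    Assumption: x may use the separator `sep`, and length-k substrings
    of x that contain `sep` do not count toward condition (ii).
    """
    # (i) no sensitive pattern occurs in x
    no_sensitive = all(u not in sensitive for u in kmers(x, k))
    # (ii) the non-sensitive length-k substrings appear in the same order
    expected = [u for u in kmers(w, k) if u not in sensitive]
    observed = [u for u in kmers(x, k) if sep not in u]
    return no_sensitive and expected == observed


def edit_distance(a, b):
    """Textbook Levenshtein distance using two rows of the DP table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # match or substitute
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    w, k, sensitive = "abcab", 2, {"ca"}
    x = "abc#ab"  # hand-picked candidate, not computed by any algorithm
    print(satisfies_constraints(w, x, k, sensitive))
    print(edit_distance(w, x))
```

In this toy instance the candidate `abc#ab` keeps the non-sensitive substrings `ab`, `bc`, `ab` in order while breaking the occurrence of `ca`. Condition (iii) then asks for a feasible string whose `edit_distance` to `w` is as small as possible, which is the optimization the paper's algorithms address.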
