Fast Practical Compression of Deterministic Finite Automata
Philip Bille, Inge Li Gørtz, Max Rishøj Pedersen

TL;DR
This paper introduces a near-linear time approximation framework for DFA compression algorithms, significantly improving scalability and speed while maintaining comparable compression quality, especially for large intrusion detection system automata.
Contribution
We propose a locality-sensitive hashing-based framework for fast DFA compression, enabling near-linear time approximation of existing algorithms with minimal loss in compression efficiency.
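As a rough illustration of the locality-sensitive hashing idea, the sketch below groups states whose transition rows agree on a few sampled alphabet positions, so that only states within a bucket need to be compared. This is a generic bit-sampling scheme for Hamming distance, not the hash family or bucketing strategy used in the paper; the names `sample_positions` and `candidate_buckets` and the toy table are hypothetical.

```python
import random
from collections import defaultdict


def sample_positions(width, k, seed=0):
    """Pick k distinct alphabet positions to project on (assumed scheme)."""
    return random.Random(seed).sample(range(width), k)


def candidate_buckets(rows, positions):
    """Group states whose transition rows agree on the sampled positions.

    rows[s][c] is the target of state s on symbol c. States landing in the
    same bucket are candidates for sharing transitions; all other pairs
    are never compared.
    """
    buckets = defaultdict(list)
    for state, row in enumerate(rows):
        key = tuple(row[p] for p in positions)
        buckets[key].append(state)
    # Only buckets with at least two states yield candidate pairs.
    return [group for group in buckets.values() if len(group) > 1]


# Toy 4-state DFA over a 4-symbol alphabet (illustrative data only).
rows = [
    (1, 2, 3, 0),
    (1, 2, 3, 3),
    (0, 0, 0, 0),
    (1, 2, 3, 1),
]

groups = candidate_buckets(rows, sample_positions(len(rows[0]), k=2))
```

Rows that differ in few positions collide with high probability, while dissimilar rows rarely do. Repeating the projection with fresh positions trades extra time for fewer missed pairs.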
Findings
Up to tenfold faster compression than previous methods
Maintains similar compression sizes with minimal loss
Scales to larger DFA collections in intrusion detection systems
Abstract
We revisit the popular delayed deterministic finite automaton (D²FA) compression algorithm introduced by Kumar et al. [SIGCOMM 2006] for compressing deterministic finite automata (DFAs) used in intrusion detection systems. This compression scheme exploits similarities in the outgoing sets of transitions among states to achieve strong compression while maintaining high throughput for matching. Unfortunately, the D²FA algorithm and its later variants require at least quadratic compression time, since they compare all pairs of states to compute an optimal compression. This is too slow and, in some cases, even infeasible for the collections of regular expressions in modern intrusion detection systems, which produce DFAs with millions of states. Our main result is a simple, general framework for constructing D²FA based on locality-sensitive hashing that constructs an…
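To make the idea of exploiting similar outgoing transitions concrete, here is a minimal sketch of default-transition compression in the style of a delayed DFA: each state may point to a default state and store only the transitions on which it differs. The default pointers are given by hand here; choosing them well is the actual compression problem and is not shown. The function names and the toy table are hypothetical.

```python
def compress(rows, default):
    """Store, per state, only transitions that differ from its default state.

    rows[s][c] is the target of state s on symbol c. default[s] is the
    default state of s, or None for a root that keeps its full row.
    The default pointers are assumed to form a forest (no cycles).
    """
    compressed = []
    for state, row in enumerate(rows):
        d = default[state]
        if d is None:
            compressed.append(dict(enumerate(row)))
        else:
            compressed.append(
                {c: t for c, t in enumerate(row) if rows[d][c] != t}
            )
    return compressed


def step(compressed, default, state, symbol):
    """Follow default pointers, without consuming input, until symbol is found."""
    while symbol not in compressed[state]:
        state = default[state]
    return compressed[state][symbol]


# Toy 4-state DFA over a 4-symbol alphabet (illustrative data only).
rows = [
    (1, 2, 3, 0),
    (1, 2, 3, 3),
    (0, 0, 0, 0),
    (1, 2, 3, 1),
]
default = [None, 0, None, 0]  # states 1 and 3 defer to state 0

compressed = compress(rows, default)
stored = sum(len(entry) for entry in compressed)  # 10 transitions instead of 16
```

Lookups may traverse several default pointers per input symbol, which is the cost paid for the smaller table.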
Taxonomy
Topics: Network Packet Processing and Optimization · Algorithms and Data Compression · Semigroups and Automata Theory
