Indexing Tries within Entropy-Bounded Space
Lorenzo Carfagna, Carlo Tosoni

TL;DR
This paper introduces a space-efficient trie index based on the Burrows-Wheeler transform (BWT), extends string entropy concepts to tries to bound the index's space, and compares its efficiency with existing trie indexes.
Contribution
It develops a succinct, BWT-based trie index, introduces worst-case and k-th order empirical entropy measures for tries, and proves space bounds for the index in terms of these measures, analogous to the entropy bounds known for string indexes.
Findings
Trie index space bounded by k-th order empirical entropy
Trie entropy measures analogous to string entropy
Trie index can outperform r-index in some cases
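
For orientation, the string measure that the trie entropy is compared against can be sketched as follows. This is a minimal illustration of the standard k-th order empirical entropy of a string, not the trie definition from the paper; the function names `h0` and `hk` are chosen here for illustration.

```python
from collections import Counter, defaultdict
from math import log2


def h0(counts):
    """Zero-order empirical entropy (bits per symbol) of a symbol multiset."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum(c / total * log2(c / total) for c in counts.values())


def hk(s, k):
    """k-th order empirical entropy of string s (standard string definition).

    Each symbol is grouped by the k symbols preceding it; the result is the
    length-weighted average of the zero-order entropies of these groups.
    """
    if not s:
        return 0.0
    followers = defaultdict(Counter)
    for i in range(k, len(s)):
        followers[s[i - k:i]][s[i]] += 1
    return sum(sum(c.values()) * h0(c) for c in followers.values()) / len(s)
```

On a periodic string such as "abababab", `hk(s, 0)` is 1 bit per symbol, while `hk(s, 1)` is 0 because each symbol is fully determined by its predecessor.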
Abstract
We study the problem of indexing and compressing tries using a BWT-based approach. Specifically, we consider a succinct and compressed representation of the XBWT of Ferragina et al. [FOCS '05, JACM '09] corresponding to the analogue of the FM-index [FOCS '00, JACM '05] for tries. This representation allows one to efficiently count the number of nodes reached by a given string pattern. To analyze the space complexity of the above trie index, we propose a proof for the combinatorial problem of counting the number of tries with a given symbol distribution. We use this formula to define a worst-case entropy measure for tries, as well as a notion of k-th order empirical entropy. In particular, we show that the relationships between these two entropy measures are similar to those between the corresponding well-known measures for strings. We use these measures to prove that the XBWT of a trie…
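
The counting query mentioned in the abstract can be illustrated with a naive baseline. The sketch below is not the XBWT-based index; it is an uncompressed trie with a full traversal, and it assumes that a node is "reached" by a pattern when the edge labels on the path ending at that node spell the pattern, with the path allowed to start at any node. The names `build_trie` and `count_reached` are hypothetical.

```python
def build_trie(words):
    """Build a trie as nested dicts: each node maps an edge label to a child."""
    root = {}
    for w in words:
        node = root
        for ch in w:
            node = node.setdefault(ch, {})
    return root


def count_reached(root, pattern):
    """Count nodes whose incoming path (from any ancestor) spells `pattern`.

    Naive baseline: visits every node and checks whether the root-to-node
    label string ends with the pattern. An empty pattern counts no nodes.
    """
    count = 0
    stack = [(root, "")]
    while stack:
        node, path = stack.pop()
        if pattern and path.endswith(pattern):
            count += 1
        for ch, child in node.items():
            stack.append((child, path + ch))
    return count
```

For the trie built from "banana", "band", and "ana", the pattern "an" reaches three nodes (those with root paths "ban", "banan", and "an"). The traversal takes time proportional to the total length of all root-to-node paths, which is the kind of cost a compressed index is meant to avoid.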
