Online Self-Indexed Grammar Compression
Yoshimasa Takabatake, Yasuo Tabei, Hiroshi Sakamoto

TL;DR
This paper introduces OESP-index, an online self-indexed grammar compression method that builds its index incrementally as input characters arrive, so the input text does not have to be held in memory during construction.
Contribution
The paper presents the first online self-indexed grammar compression technique capable of incremental index construction from streaming input.
Findings
OESP-index efficiently builds index structures in an online manner.
It reduces working space during index construction because the input text does not need to be stored in memory.
Experimental results demonstrate high space-efficiency and effective search capabilities.
Abstract
Although several grammar-based self-indexes have been proposed thus far, their applicability is limited to offline settings in which the whole input text is prepared in advance. The index structure must therefore be rebuilt whenever additional input arrives, which is often the case in the big data era. In this paper, we present the first online self-indexed grammar compression, named OESP-index, which gradually builds the index structure by reading input characters one by one. This property has a further advantage: it saves working space during construction, because the input text does not need to be stored in memory. We experimentally test OESP-index on its ability to build index structures and to search for query texts, and we show its efficiency, especially its space-efficiency when building index structures.
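To make the "one character at a time" idea concrete, here is a minimal Python sketch of an online grammar builder. It is not the OESP-index algorithm: it replaces edit-sensitive parsing with a naive rule that pairs every two adjacent symbols at each level, and it omits the self-index and search structures entirely. The class name `OnlinePairGrammar` and its methods are hypothetical, chosen only for illustration. What it does show is that a reverse dictionary of production rules plus one pending symbol per level is enough to consume a stream without keeping the input text.

```python
from typing import Dict, List, Optional, Tuple, Union

# Assumption for this sketch: terminals are 1-character strings,
# nonterminals are integer rule ids.
Symbol = Union[str, int]


class OnlinePairGrammar:
    """Toy online grammar builder (NOT edit-sensitive parsing).

    Each level keeps at most one waiting symbol. When a second symbol
    arrives at that level, the pair is replaced by a nonterminal and
    passed up to the next level.
    """

    def __init__(self) -> None:
        self.rules: List[Tuple[Symbol, Symbol]] = []         # id -> (left, right)
        self.reverse: Dict[Tuple[Symbol, Symbol], int] = {}  # (left, right) -> id
        self.pending: List[Optional[Symbol]] = []            # waiting symbol per level

    def _nonterminal(self, left: Symbol, right: Symbol) -> int:
        # Reuse an existing rule for this pair, or create a new one.
        key = (left, right)
        if key not in self.reverse:
            self.reverse[key] = len(self.rules)
            self.rules.append(key)
        return self.reverse[key]

    def push(self, ch: str) -> None:
        # Consume one input character; the character itself is not retained
        # once it has been paired.
        symbol: Symbol = ch
        level = 0
        while True:
            if level == len(self.pending):
                self.pending.append(None)
            if self.pending[level] is None:
                self.pending[level] = symbol
                return
            left = self.pending[level]
            self.pending[level] = None
            symbol = self._nonterminal(left, symbol)
            level += 1

    def expand(self, symbol: Symbol) -> str:
        # Recursively decode a symbol back into the text it derives.
        if isinstance(symbol, str):
            return symbol
        left, right = self.rules[symbol]
        return self.expand(left) + self.expand(right)

    def text(self) -> str:
        # Higher levels cover earlier parts of the stream, so decode top-down.
        return "".join(
            self.expand(s) for s in reversed(self.pending) if s is not None
        )


grammar = OnlinePairGrammar()
for ch in "abababab":
    grammar.push(ch)
print(len(grammar.rules), grammar.text())
```

In this sketch the number of pending symbols grows only logarithmically with the number of characters read, which is the sense in which an online builder avoids storing the input. The pairing rule here is position-based, so unlike edit-sensitive parsing it gives no guarantee that repeated substrings at different offsets are parsed the same way.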
Taxonomy
Topics: Algorithms and Data Compression · Natural Language Processing Techniques · DNA and Biological Computing
