Efficient Immediate-Access Dynamic Indexing
Alistair Moffat, Joel Mackenzie

TL;DR
This paper introduces a dynamic index structure in which documents become queryable as soon as they are ingested, combining fast ingestion, a low memory footprint, and fast query processing for large in-memory collections.
Contribution
It presents a new compression operation and a novel approach to extensible lists that together significantly improve dynamic index construction speed, query speed, and memory efficiency over previous methods.
Findings
In-memory dynamic indexes can be built at 2 GB/minute.
Multi-term conjunctive queries are resolved in a few milliseconds.
Document-level indexing needs as little as two bytes per posting, with only a small amount more for word-level indexing.
Abstract
In a dynamic retrieval system, documents must be ingested as they arrive, and be immediately findable by queries. Our purpose in this paper is to describe an index structure and processing regime that accommodates that requirement for immediate access, seeking to make the ingestion process as streamlined as possible, while at the same time seeking to make the growing index as small as possible, and seeking to make term-based querying via the index as efficient as possible. We describe a new compression operation and a novel approach to extensible lists which together facilitate that triple goal. In particular, the structure we describe provides incremental document-level indexing using as little as two bytes per posting and only a small amount more for word-level indexing; provides fast document insertion; supports immediate and continuous queryability; provides support for fast…
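To make the idea of immediate access concrete, here is a minimal Python sketch of an in-memory inverted index whose postings can be queried as soon as a document is added. It is an illustration only, not the paper's method: it uses standard variable-byte coding of document gaps and Python's growable `bytearray` as stand-ins for the paper's new compression operation and extensible lists, which the summary above does not detail. The names `ToyDynamicIndex`, `add_document`, `docids`, and `conjunctive` are hypothetical.

```python
from collections import defaultdict


def vbyte_encode(n: int) -> bytes:
    """Variable-byte encode a non-negative integer, 7 bits per byte,
    low-order bits first; the high bit marks 'more bytes follow'."""
    out = bytearray()
    while n >= 128:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    out.append(n)
    return bytes(out)


def vbyte_decode_all(buf):
    """Yield every integer stored in a vbyte-encoded buffer."""
    value, shift = 0, 0
    for b in buf:
        if b & 0x80:
            value |= (b & 0x7F) << shift
            shift += 7
        else:
            yield value | (b << shift)
            value, shift = 0, 0


class ToyDynamicIndex:
    """Assumed stand-in for a dynamic index: one growable byte list per term."""

    def __init__(self):
        self.postings = defaultdict(bytearray)  # term -> vbyte-coded doc gaps
        self.last_doc = {}                      # term -> last docid appended
        self.num_docs = 0

    def add_document(self, text: str) -> int:
        """Ingest one document; its postings are queryable on return."""
        self.num_docs += 1
        docid = self.num_docs
        for term in set(text.lower().split()):
            gap = docid - self.last_doc.get(term, 0)
            self.postings[term] += vbyte_encode(gap)  # append-only growth
            self.last_doc[term] = docid
        return docid

    def docids(self, term: str) -> list:
        """Decode the gap-coded postings list for one term."""
        docid, out = 0, []
        for gap in vbyte_decode_all(self.postings.get(term.lower(), b"")):
            docid += gap
            out.append(docid)
        return out

    def conjunctive(self, terms) -> list:
        """Return the docids that contain every query term (Boolean AND)."""
        lists = sorted((self.docids(t) for t in terms), key=len)
        if not lists:
            return []
        result = set(lists[0])
        for other in lists[1:]:
            result &= set(other)
        return sorted(result)


index = ToyDynamicIndex()
index.add_document("dynamic index structure")
index.add_document("static index")
index.add_document("dynamic index compression")
print(index.conjunctive(["dynamic", "index"]))  # prints [1, 3]
```

Because each postings list is append-only and stores gaps between consecutive document numbers, a newly added document costs a few bytes per distinct term and needs no rebuild or merge step before it can be found. A real system would intersect the lists without decoding them fully; the set intersection here is chosen only for brevity.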
Taxonomy
Topics: Algorithms and Data Compression · Data Management and Algorithms · Advanced Database Systems and Queries
