Compressing integer lists with Contextual Arithmetic Trits

Yann Barsamian; André Chailloux

arXiv:2209.02089·cs.DB·May 6, 2025·1 cites

Open Access

TL;DR

This paper combines trit encoding with contextual methods to compress inverted indexes, and reports smaller compressed sizes than the Binary Interpolative Method across a variety of datasets.

Contribution

The paper presents a new compression approach for inverted indexes using contextual arithmetic trits, demonstrating consistent size improvements over the Binary Interpolative Method.

Findings

1. Outperforms the Binary Interpolative Method in compression size
2. Effective across diverse datasets
3. Provides open-source code and datasets

Abstract

Inverted indexes make it possible to query large databases without searching the whole database at each query. An important line of research is to construct the most efficient inverted indexes, both in terms of compression ratio and time efficiency. In this article, we show how to use trit encoding, combined with contextual methods, for computing inverted indexes. We perform an extensive study of different variants of these methods and show that our method consistently outperforms the Binary Interpolative Method, which is one of the gold standards in this topic, with respect to compression size. We apply our methods to a variety of datasets and make available the source code that produced the results, together with all our datasets.
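
For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of textbook binary interpolative coding for a strictly increasing integer list. It is not the authors' code and does not implement their trit-based or contextual variants. The function names (`bic_encode`, `bic_decode`, `cost_bits`) and the (offset, range size) symbol representation are assumptions made for illustration, and the bit cost uses a plain fixed-width code per symbol rather than any optimized variant.

```python
def bic_encode(values, lo, hi, out):
    """Append (offset, range_size) symbols for a strictly increasing list in [lo, hi].

    Illustrative helper (hypothetical name): the middle element is coded first,
    inside the narrowest interval it can occupy, then both halves recursively.
    """
    n = len(values)
    if n == 0:
        return
    m = n // 2
    # m smaller values must fit below values[m], n - m - 1 larger ones above it.
    low = lo + m
    high = hi - (n - m - 1)
    out.append((values[m] - low, high - low + 1))
    bic_encode(values[:m], lo, values[m] - 1, out)
    bic_encode(values[m + 1:], values[m] + 1, hi, out)


def bic_decode(symbols, n, lo, hi):
    """Rebuild the list from the symbols, given its length n and bounds [lo, hi]."""
    it = iter(symbols)

    def rec(n, lo, hi):
        if n == 0:
            return []
        m = n // 2
        offset, _size = next(it)      # middle element is read first, as in encoding
        v = lo + m + offset
        left = rec(m, lo, v - 1)
        right = rec(n - m - 1, v + 1, hi)
        return left + [v] + right

    return rec(n, lo, hi)


def cost_bits(symbols):
    """Bits used if each symbol gets ceil(log2(range_size)) bits (assumed simple code)."""
    return sum((size - 1).bit_length() for _offset, size in symbols)


postings = [3, 8, 9, 11, 12, 13, 17]
symbols = []
bic_encode(postings, 1, 20, symbols)
decoded = bic_decode(symbols, len(postings), 1, 20)

# A fully dense run leaves every element with a single possible value.
dense = []
bic_encode(list(range(1, 8)), 1, 7, dense)
```

In this sketch a symbol whose range size is 1 costs zero bits, which is why interpolative coding does well on clustered postings lists: once the neighbours are known, an element in a dense run is fully determined.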

Taxonomy

Topics: Advanced Database Systems and Queries · Data Management and Algorithms · Algorithms and Data Compression