Difference-Huffman Coding of Multidimensional Databases
István Szépkúti

TL;DR
This paper introduces difference-Huffman coding (DHC), a compression technique for multidimensional databases, and shows empirically that it yields a smaller physical representation than four previously published methods. It also models how caching affects the retrieval time of multidimensional and table representations.
Contribution
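The paper presents difference-Huffman coding (DHC), a compression method that produces a smaller multidimensional physical representation than previously published techniques, together with a caching model, verified with empirical data, for comparing the expected retrieval times of the multidimensional and table representations.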
Findings
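DHC achieves a smaller multidimensional physical representation than single count header compression, logical position compression, base-offset compression and difference sequence compression.
Over the tested range of available memory, retrieval from the multidimensional representation was always much quicker than from the table representation.
Caching influences the expected retrieval time of both representations; the model proposed for this effect is verified with empirical data.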
Abstract
A new compression method called difference-Huffman coding (DHC) is introduced in this paper. It is verified empirically that DHC results in a smaller multidimensional physical representation than other previously published techniques (single count header compression, logical position compression, base-offset compression and difference sequence compression). The article examines how caching influences the expected retrieval time of the multidimensional and table representations of relations. A model is proposed for this, which is then verified with empirical data. Conclusions are drawn, based on the model and the experiment, about when one physical representation outperforms another in terms of retrieval time. Over the tested range of available memory, retrieval from the multidimensional representation was always much quicker than from the table representation.
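The abstract does not spell out how DHC works, so the following is only a minimal sketch of the general idea the name suggests: take the sorted logical positions of the non-empty cells, replace them with the differences between consecutive positions, and Huffman-code those differences. That reading is an assumption, and the function names (`differences`, `huffman_code`, `encode`, `decode`) are hypothetical and not taken from the paper.

```python
import heapq
from collections import Counter
from itertools import count


def differences(positions):
    """Gaps between consecutive sorted positions (the first gap is measured from -1)."""
    prev, out = -1, []
    for p in positions:
        out.append(p - prev)
        prev = p
    return out


def huffman_code(symbols):
    """Build a prefix code {symbol: bitstring} from the symbol frequencies."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit codeword.
        return {next(iter(freq)): "0"}
    tie = count()  # unique tie-breaker so the heap never compares dicts
    heap = [(f, next(tie), {s: ""}) for s, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c1.items()}
        merged.update({s: "1" + b for s, b in c2.items()})
        heapq.heappush(heap, (f1 + f2, next(tie), merged))
    return heap[0][2]


def encode(positions):
    """Assumed scheme: difference the positions, then Huffman-code the gaps."""
    diffs = differences(positions)
    code = huffman_code(diffs)
    return "".join(code[d] for d in diffs), code


def decode(bits, code):
    """Invert encode(): read codewords and accumulate gaps into positions."""
    inverse = {b: s for s, b in code.items()}
    positions, prev, buf = [], -1, ""
    for bit in bits:
        buf += bit
        if buf in inverse:
            prev += inverse[buf]
            positions.append(prev)
            buf = ""
    return positions
```

When non-empty cells cluster, most gaps are small and repeat often, so a frequency-based code spends few bits on them. This is the intuition behind combining a difference sequence with Huffman coding. The sketch leaves out how the code table itself would be stored.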
Taxonomy
Topics: Algorithms and Data Compression · Data Management and Algorithms · Advanced Database Systems and Queries
