Hippo: A Fast, yet Scalable, Database Indexing Approach
Jia Yu, Mohamed Sarwat

TL;DR
Hippo is a database indexing method that sharply reduces storage and maintenance costs while keeping query performance comparable to traditional indexes such as the B+-Tree, which makes it especially useful for large-scale data systems.
Contribution
This paper introduces Hippo, a scalable indexing approach that, unlike traditional indexes, keeps storage and maintenance overhead low without compromising much on query efficiency.
Findings
Hippo reduces storage space by up to 100 times compared to B+-Tree.
Hippo decreases maintenance overhead by up to 1000 times.
Query performance remains comparable to B+-Tree across various selectivities.
Abstract
Even though existing database indexes (e.g., B+-Tree) speed up query execution, they suffer from two main drawbacks: (1) A database index usually yields 5% to 15% additional storage overhead, which results in a non-negligible dollar cost in big data scenarios, especially when deployed on modern storage devices like Solid State Disk (SSD) or Non-Volatile Memory (NVM). (2) Maintaining a database index incurs high latency because the DBMS has to find and update the index pages affected by changes to the underlying table. This paper proposes Hippo, a fast, yet scalable, database indexing approach. Hippo stores only pointers to disk pages along with lightweight histogram-based summaries. The proposed structure significantly shrinks index storage and maintenance overhead without compromising much on query execution performance. Experiments, based on a real Hippo implementation inside…
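The idea of keeping page pointers plus histogram-based summaries can be illustrated with a small sketch. This is not the authors' implementation: the class name `PageSummaryIndex`, the fixed `pages_per_entry` grouping, and the in-memory list of pages are all assumptions made for illustration. Each index entry covers a range of pages and records only which histogram buckets occur in that range, so a range query inspects just the pages whose summary overlaps the buckets touched by the predicate.

```python
from bisect import bisect_right


class PageSummaryIndex:
    """Toy sparse index (hypothetical, for illustration only).

    One entry per group of pages: (first_page, last_page, bucket_ids),
    where bucket_ids is the set of histogram buckets seen in those pages.
    """

    def __init__(self, pages, bucket_bounds, pages_per_entry=2):
        # pages: list of lists of numbers, one inner list per "disk page"
        # bucket_bounds: sorted split points of an assumed histogram
        self.pages = pages
        self.bounds = bucket_bounds
        self.entries = []
        for first in range(0, len(pages), pages_per_entry):
            last = min(first + pages_per_entry, len(pages)) - 1
            buckets = set()
            for page in pages[first:last + 1]:
                for value in page:
                    buckets.add(self._bucket(value))
            self.entries.append((first, last, buckets))

    def _bucket(self, value):
        # Bucket id = number of split points that are <= value.
        return bisect_right(self.bounds, value)

    def candidate_pages(self, lo, hi):
        """Page ids whose summary overlaps the buckets hit by [lo, hi]."""
        wanted = set(range(self._bucket(lo), self._bucket(hi) + 1))
        found = []
        for first, last, buckets in self.entries:
            if buckets & wanted:
                found.extend(range(first, last + 1))
        return found

    def range_query(self, lo, hi):
        """Values v with lo <= v <= hi, read only from candidate pages."""
        result = []
        for page_id in self.candidate_pages(lo, hi):
            # Summaries are lossy, so every candidate page is re-checked.
            result.extend(v for v in self.pages[page_id] if lo <= v <= hi)
        return result


pages = [[1, 3], [2, 4], [50, 55], [52, 58], [90, 95], [91, 99]]
index = PageSummaryIndex(pages, bucket_bounds=[25, 50, 75])
print(index.candidate_pages(51, 56))  # pages that must be inspected
print(index.range_query(51, 56))      # matching values from those pages
```

In this sketch the index holds three small entries for six pages, and inserting a value only requires updating the bucket set of one entry, which hints at why a summary-based structure can be cheaper to store and maintain than a per-tuple index. The price is that candidate pages may contain false positives and must be filtered.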
Taxonomy
Topics: Data Management and Algorithms · Advanced Database Systems and Queries · Algorithms and Data Compression
