A Critical Analysis of Recursive Model Indexes
Marcel Maltry, Jens Dittrich

TL;DR
This paper provides a broad analysis of recursive model indexes (RMIs), identifying key hyperparameters affecting performance, and offers a simple configuration guideline that achieves competitive results with less tuning effort.
Contribution
It is the first broad, inventor-independent analysis of RMIs, revealing the importance of hyperparameters and providing a practical guideline for effective configuration.
Findings
Model type, layer size, error bounds, and search algorithm all significantly affect RMI performance.
The proposed guideline achieves performance comparable to state-of-the-art RMIs with less tuning.
A reimplementation of RMIs speeds up build time by up to 6.3 times.
Abstract
The recursive model index (RMI) has recently been introduced as a machine-learned replacement for traditional indexes over sorted data, achieving remarkably fast lookups. Follow-up work focused on explaining RMI's performance and automatically configuring RMIs through enumeration. Unfortunately, configuring RMIs involves setting several hyperparameters, the enumeration of which is often too time-consuming in practice. Therefore, in this work, we conduct the first inventor-independent broad analysis of RMIs with the goal of understanding the impact of each hyperparameter on performance. In particular, we show that in addition to model types and layer size, error bounds and search algorithms must be considered to achieve the best possible performance. Based on our findings, we develop a simple-to-follow guideline for configuring RMIs. We evaluate our guideline by comparing the resulting…
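To make the hyperparameters named in the abstract concrete, here is a minimal sketch of a two-layer RMI in Python. It is not the authors' implementation and does not encode their guideline. The class name `TwoLayerRMI`, the parameter `layer2_size`, the endpoint-interpolation fit, the per-leaf maximum absolute error bound, and the bounded binary search are all illustrative assumptions, chosen only to show where model type, layer size, error bound, and search algorithm enter the picture. Keys are assumed to be sorted, unique, and numeric.

```python
import bisect


class TwoLayerRMI:
    """Illustrative two-layer RMI (hypothetical sketch, not the paper's code)."""

    def __init__(self, keys, layer2_size=4):
        self.keys = keys              # assumed sorted ascending, unique
        self.n = len(keys)
        self.m = layer2_size          # hyperparameter: layer size
        # Root model (hyperparameter: model type). Here: a line through the
        # smallest and largest key, scaled to the range of leaf indices.
        lo, hi = keys[0], keys[-1]
        self.root_slope = self.m / (hi - lo) if hi > lo else 0.0
        self.root_icpt = -self.root_slope * lo
        # Assign every key to the leaf chosen by the root model.
        buckets = [[] for _ in range(self.m)]
        for pos, key in enumerate(keys):
            buckets[self._leaf(key)].append((key, pos))
        self.leaves = [self._fit(bucket) for bucket in buckets]

    def _leaf(self, key):
        # Root prediction, clamped to a valid leaf index.
        return min(self.m - 1, max(0, int(self.root_slope * key + self.root_icpt)))

    def _predict(self, slope, icpt, key):
        # Leaf prediction, clamped to a valid position in the key array.
        return min(self.n - 1, max(0, round(slope * key + icpt)))

    def _fit(self, bucket):
        if not bucket:
            return (0.0, 0.0, 0)
        (k0, p0), (k1, p1) = bucket[0], bucket[-1]
        slope = (p1 - p0) / (k1 - k0) if k1 > k0 else 0.0
        icpt = p0 - slope * k0
        # Hyperparameter: error bound. Here: max absolute error per leaf.
        err = max(abs(self._predict(slope, icpt, k) - p) for k, p in bucket)
        return (slope, icpt, err)

    def lookup(self, key):
        slope, icpt, err = self.leaves[self._leaf(key)]
        pred = self._predict(slope, icpt, key)
        lo = max(0, pred - err)
        hi = min(self.n, pred + err + 1)
        # Hyperparameter: search algorithm. Here: binary search restricted
        # to the interval given by the leaf's error bound.
        i = bisect.bisect_left(self.keys, key, lo, hi)
        return i if i < self.n and self.keys[i] == key else None


keys = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37]
rmi = TwoLayerRMI(keys, layer2_size=4)
positions = [rmi.lookup(k) for k in keys]   # position of each indexed key
missing = rmi.lookup(4)                     # None for a key that is absent
```

Swapping the fit in `_fit`, the bound stored per leaf, or the search call in `lookup` corresponds to changing one hyperparameter at a time, which is the kind of variation the analysis studies.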
Taxonomy
Topics: Data Mining Algorithms and Applications · Data Management and Algorithms · Bayesian Modeling and Causal Inference
