On the Cost of Unsuccessful Searches in Search Trees with Two-way Comparisons
Marek Chrobak, Mordecai Golin, J. Ian Munro, Neal E. Young

TL;DR
This paper studies the additional expected cost of locating non-key queries (returning the inter-key interval that contains the query) in static search trees with two-way comparisons, and proves with a novel probabilistic argument that this cost is at most 1.
Contribution
It introduces a new probabilistic method for analyzing the cost of locating non-key queries in two-way comparison search trees, and uses it to establish an upper bound of 1 on the additional cost.
Findings
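In two-way comparison trees, determining the approximate location of non-key queries adds at most 1 to the cost.
The proof uses a novel probabilistic argument: it converts a search tree that does not identify non-key queries into a random tree that does.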
The result applies to static, optimal search trees with known query probabilities.
Abstract
Search trees are commonly used to implement access operations to a set of stored keys. If this set is static and the probabilities of membership queries are known in advance, then one can precompute an optimal search tree, namely one that minimizes the expected access cost. For a non-key query, a search tree can determine its approximate location by returning the inter-key interval containing the query. This is in contrast to other dictionary data structures, like hash tables, that only report a failed search. We address the question "what is the additional cost of determining approximate locations for non-key queries?" We prove that for two-way comparison trees this additional cost is at most 1. Our proof is based on a novel probabilistic argument that involves converting a search tree that does not identify non-key queries into a random tree that does.
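To make "returning the inter-key interval" concrete, here is a minimal Python sketch of a search that uses only two-way comparisons (each test is either `q < key` or `q == key`, with two outcomes). It is an illustration only: the function name `locate`, the sample keys, and the balanced binary-search strategy are assumptions made for this example, not the optimal-tree construction or the probabilistic conversion from the paper.

```python
from typing import List, Tuple


def locate(keys: List[int], q: int) -> Tuple[str, int, int]:
    """Locate q among sorted, distinct keys using two-way comparisons only.

    Returns (kind, index, comparisons):
      ("key", i, c)      -> q equals keys[i]
      ("interval", i, c) -> q lies strictly between keys[i-1] and keys[i],
                            where keys[-1] = -inf and keys[len(keys)] = +inf
    """
    lo, hi = 0, len(keys)
    comparisons = 0
    # Invariant: every key in keys[:lo] is <= q, every key in keys[hi:] is > q.
    while lo < hi:
        mid = (lo + hi) // 2
        comparisons += 1          # one two-way "less-than" comparison
        if q < keys[mid]:
            hi = mid
        else:
            lo = mid + 1
    # Now lo is the number of keys <= q; one equality test settles key vs. non-key.
    if lo > 0:
        comparisons += 1          # one two-way "equality" comparison
        if q == keys[lo - 1]:
            return ("key", lo - 1, comparisons)
    return ("interval", lo, comparisons)


if __name__ == "__main__":
    keys = [10, 20, 30]           # hypothetical key set
    for q in (20, 25, 5, 35):
        print(q, locate(keys, q))
```

A hash table, by contrast, would report only "not found" for the queries 25, 5, and 35; the search above also reports which gap between keys each of them falls into.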
