The total path length of split trees
Nicolas Broutin, Cecilia Holmgren

TL;DR
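This paper studies the total path length (the sum of the depths of all stored items) in random split trees. Using renewal theory, it proves that the total path length converges in distribution to a limit characterized uniquely by a fixed point equation, within a single framework that covers many data structures.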
Contribution
The paper introduces a unified approach, based on renewal theory, to analyzing the total path length across the broad class of split trees introduced by Devroye, which covers many data structures.
Findings
Convergence in distribution of total path length is established.
The result applies to binary search trees, m-ary search trees, quad trees, median-of-(2k+1) trees, and simplex trees.
The limit distribution is characterized uniquely by a fixed point equation.
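To give a sense of what such a fixed point equation looks like, here is the classical binary search tree case, where the total path length has the same distribution as the number of comparisons made by quicksort. This is a sketch for orientation only: the symbols X_n, Y, Y', U and C are notation introduced here, not taken from this summary, and the paper's general equation for split trees is not reproduced.

```latex
% X_n: total path length of a random binary search tree on n keys.
% Normalization and limit:
\[
  \frac{X_n - \mathbb{E}[X_n]}{n} \xrightarrow{\;d\;} Y ,
\]
% where Y satisfies the distributional fixed point equation
\[
  Y \stackrel{d}{=} U\,Y + (1-U)\,Y' + C(U),
  \qquad
  C(u) = 1 + 2u\ln u + 2(1-u)\ln(1-u),
\]
% with U uniform on [0,1], Y' an independent copy of Y,
% and U, Y, Y' mutually independent. Note E[C(U)] = 0, consistent with E[Y] = 0.
```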
Abstract
We consider the model of random trees introduced by Devroye [SIAM J. Comput. 28 (1999) 409-432]. The model encompasses many important randomized algorithms and data structures. The pieces of data (items) are stored in a randomized fashion in the nodes of a tree. The total path length (sum of depths of the items) is a natural measure of the efficiency of the algorithm/data structure. Using renewal theory, we prove convergence in distribution of the total path length toward a distribution characterized uniquely by a fixed point equation. Our result covers, using a unified approach, many data structures such as binary search trees, m-ary search trees, quad trees, median-of-(2k+1) trees, and simplex trees.
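As a concrete illustration of the quantity studied, the sketch below computes the total path length for the simplest split tree covered by the result, a random binary search tree. It is a minimal example under stated assumptions, not the paper's method: keys are assumed distinct, the root has depth 0, and the function names (`total_path_length`, `expected_bst_path_length`, `simulate_mean`) are hypothetical names chosen here. The closed form 2(n+1)H_n - 4n is the classical expected value for random binary search trees, included only as a sanity check.

```python
import random


def total_path_length(keys):
    """Insert distinct keys into a binary search tree in the given order
    and return the sum of the depths of all keys (root has depth 0)."""
    if not keys:
        return 0
    left, right = {}, {}          # child pointers, keyed by parent key
    root = keys[0]
    total = 0
    for key in keys[1:]:
        node, depth = root, 0
        while True:
            depth += 1            # descending one level
            child = left if key < node else right
            if node in child:
                node = child[node]
            else:
                child[node] = key  # attach the new key as a leaf
                break
        total += depth
    return total


def expected_bst_path_length(n):
    """Classical mean for a random BST on n keys: 2(n+1)H_n - 4n."""
    harmonic = sum(1.0 / i for i in range(1, n + 1))
    return 2 * (n + 1) * harmonic - 4 * n


def simulate_mean(n, trials, seed=0):
    """Average total path length over random insertion orders."""
    rng = random.Random(seed)
    acc = 0
    for _ in range(trials):
        keys = list(range(n))
        rng.shuffle(keys)         # uniformly random insertion order
        acc += total_path_length(keys)
    return acc / trials


if __name__ == "__main__":
    n = 200
    print("simulated mean:", simulate_mean(n, trials=500))
    print("exact mean:    ", expected_bst_path_length(n))
```

Running the script prints a simulated mean next to the exact mean for n = 200. Studying the fluctuations around that mean, divided by n, is what the convergence result addresses in the binary search tree case.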
