The total path length of split trees

Nicolas Broutin; Cecilia Holmgren

arXiv:1102.2541·math.PR·November 5, 2012

TL;DR

This paper studies the total path length of random split trees. Using renewal theory, it proves that the total path length converges in distribution to a limit uniquely characterized by a fixed point equation, and it treats many data structures within a single framework.

Contribution

It introduces a unified approach using renewal theory to analyze the total path length in a broad class of split trees, covering many data structures.

Findings

01 The total path length converges in distribution.

02 The result applies to binary search trees, m-ary search trees, quad trees, median-of-(2k+1) trees, and simplex trees.

03 A fixed point equation uniquely characterizes the limit distribution.

Abstract

We consider the model of random trees introduced by Devroye [SIAM J. Comput. 28 (1999) 409-432]. The model encompasses many important randomized algorithms and data structures. The pieces of data (items) are stored in a randomized fashion in the nodes of a tree. The total path length (sum of depths of the items) is a natural measure of the efficiency of the algorithm/data structure. Using renewal theory, we prove convergence in distribution of the total path length toward a distribution characterized uniquely by a fixed point equation. Our result covers, using a unified approach, many data structures such as binary search trees, m-ary search trees, quad trees, median-of-(2k+1) trees, and simplex trees.
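
To make the quantity in the abstract concrete, the following is a minimal sketch for the simplest case it lists, the binary search tree. It inserts keys in the given order and sums the depths of the stored items, with the root at depth 0. The function name `bst_total_path_length` and the list-based node layout are illustrative assumptions, not taken from the paper, and the sketch does not implement the general split tree model.

```python
import random


def bst_total_path_length(keys):
    """Insert distinct keys into an unbalanced BST and return the sum of item depths.

    Illustrative helper (not from the paper). The root has depth 0.
    """
    root = None  # each node is a list: [key, left_child, right_child]
    total = 0
    for key in keys:
        if root is None:
            root = [key, None, None]  # first key becomes the root, depth 0
            continue
        node, depth = root, 0
        while True:
            depth += 1
            idx = 1 if key < node[0] else 2  # go left for smaller keys, right otherwise
            if node[idx] is None:
                node[idx] = [key, None, None]
                break
            node = node[idx]
        total += depth  # depth of the newly inserted item
    return total


if __name__ == "__main__":
    n = 1000
    keys = random.sample(range(n), n)  # uniformly random insertion order
    print(bst_total_path_length(keys))
```

Inserting the keys in uniformly random order gives a random binary search tree, so repeated runs sample the random variable whose limiting distribution the paper studies for this special case. A sorted input gives the worst case, where the tree degenerates into a path and the total is n(n-1)/2.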
