A Gaussian limit process for optimal FIND algorithms
Henning Sulzbach, Ralph Neininger, Michael Drmota

TL;DR
This paper analyzes the complexity of median-of-subset FIND algorithms, showing that their normalized complexity converges to a Gaussian process, with detailed covariance and path properties, as data size grows large.
Contribution
It establishes weak convergence of the normalized complexity of FIND, viewed as a process in the rank to be selected, to a centered Gaussian process, and identifies the covariance function of the limit.
Findings
Normalized complexity converges to a Gaussian process
Covariance function of the limit process is explicitly identified
Path and tail properties of the Gaussian limit are discussed
Abstract
We consider versions of the FIND algorithm where the pivot element used is the median of a subset chosen uniformly at random from the data. For the median selection we assume that subsamples of size asymptotic to $c \cdot n^\alpha$ are chosen, where $0 < \alpha \le 1/2$, $c > 0$ and $n$ is the size of the data set to be split. We consider the complexity of FIND as a process in the rank to be selected and measured by the number of key comparisons required. After normalization we show weak convergence of the complexity to a centered Gaussian process as $n \to \infty$, which depends on $\alpha$. The proof relies on a contraction argument for probability distributions on càdlàg functions. We also identify the covariance function of the Gaussian limit process and discuss path and tail properties.
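The selection scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `subsample_size` and `find`, the default values of `c` and `alpha`, and the rounding of the sample size to an odd number are assumptions made for illustration. The sketch assumes distinct keys and counts only the n - 1 comparisons of each partitioning step, ignoring the cost of finding the sample median.

```python
import random


def subsample_size(n, c=1.0, alpha=0.5):
    """Odd sample size close to c * n**alpha, clipped to [1, n] (illustrative choice)."""
    k = max(1, min(n, round(c * n ** alpha)))
    if k % 2 == 0:
        k -= 1  # keep k odd so the sample median is a single element
    return k


def find(data, rank, c=1.0, alpha=0.5, rng=None):
    """Return (element of the given 1-based rank, number of key comparisons).

    Assumes distinct keys and 1 <= rank <= len(data). Only partitioning
    comparisons are counted (n - 1 per split); this is a simplification.
    """
    rng = rng or random.Random()
    items = list(data)
    comparisons = 0
    while len(items) > 1:
        n = len(items)
        k = subsample_size(n, c, alpha)
        sample = rng.sample(items, k)       # uniform random subset of size k
        pivot = sorted(sample)[k // 2]      # median of the subsample
        smaller = [x for x in items if x < pivot]
        larger = [x for x in items if x > pivot]
        comparisons += n - 1                # pivot compared with every other key
        if rank <= len(smaller):
            items = smaller
        elif rank == len(smaller) + 1:
            return pivot, comparisons
        else:
            rank -= len(smaller) + 1
            items = larger
    return items[0], comparisons


if __name__ == "__main__":
    keys = list(range(1, 1001))
    random.Random(1).shuffle(keys)
    value, cost = find(keys, rank=500, rng=random.Random(2))
    print(value, cost)
```

Calling `find` for each rank from 1 to n on the same input gives a rough empirical picture of the complexity as a function of the rank. Each call here draws its pivots independently, so the sketch does not reproduce the joint behavior across ranks that the process-level result concerns.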
Taxonomy
Topics: Soil Geostatistics and Mapping · Data Management and Algorithms · Scientific Research and Discoveries
