Information theory: Sources, Dirichlet series, and realistic analyses of data structures
Mathieu Roux (LMNO, GREYC, CNRS, University of Caen, France), Brigitte Vallée (GREYC, CNRS, University of Caen, France)

TL;DR
This paper analyzes data structures built on words emitted by general sources, using an analytic combinatorics approach with Dirichlet series, to provide a realistic and detailed understanding of their probabilistic and algorithmic behavior.
Contribution
It introduces a framework combining dynamical sources and Dirichlet series to analyze data structures with realistic cost models, extending beyond the simple sources (memoryless sources, Markov chains) usually assumed.
Findings
The Dirichlet series Lambda(s) encodes source properties analytically.
Tameness of sources is characterized by Diophantine conditions on Lambda(s).
The approach links probabilistic source behavior with analytic properties of generating functions.
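As an illustration of how a Dirichlet series can encode source properties, here is a minimal sketch for the simplest case, a memoryless source. It assumes the definition Lambda(s) = sum over all finite words w of p_w^s, where p_w is the probability that an emitted word begins with w; this definition and all function names below are assumptions made for the example, not taken from the summary above. For a memoryless source with symbol probabilities p_i, the words of length k contribute lambda(s)^k with lambda(s) = sum_i p_i^s, so Lambda(s) = 1 / (1 - lambda(s)) whenever lambda(s) < 1, and the entropy appears as -lambda'(1).

```python
from itertools import product
from math import log


def lam(probs, s):
    """lambda(s) = sum_i p_i^s for a memoryless source (assumed notation)."""
    return sum(p ** s for p in probs)


def Lambda_k_bruteforce(probs, s, k):
    """Sum of p_w^s over all words w of length k, by explicit enumeration."""
    total = 0.0
    for word in product(probs, repeat=k):
        p_w = 1.0
        for p in word:
            p_w *= p  # memoryless: p_w is the product of symbol probabilities
        total += p_w ** s
    return total


def Lambda_truncated(probs, s, depth):
    """Partial sum of Lambda(s) over word lengths 0..depth."""
    return sum(lam(probs, s) ** k for k in range(depth + 1))


def Lambda_closed(probs, s):
    """Closed form 1 / (1 - lambda(s)), valid only when lambda(s) < 1."""
    l = lam(probs, s)
    if l >= 1.0:
        raise ValueError("series diverges: lambda(s) >= 1")
    return 1.0 / (1.0 - l)


def entropy(probs):
    """Shannon entropy in nats, which equals -lambda'(1)."""
    return -sum(p * log(p) for p in probs)


if __name__ == "__main__":
    probs = [0.5, 0.3, 0.2]  # hypothetical symbol probabilities
    s = 2.0
    print(Lambda_k_bruteforce(probs, s, 3), lam(probs, s) ** 3)
    print(Lambda_truncated(probs, s, 80), Lambda_closed(probs, s))
    h = 1e-6
    print(-(lam(probs, 1 + h) - lam(probs, 1 - h)) / (2 * h), entropy(probs))
```

Each printed pair should agree up to rounding. The pole of the closed form at s = 1 (where lambda(1) = 1) and the entropy as a derivative there are the kind of analytic features that reflect probabilistic properties of the source; the dynamical sources of the paper are more general than this memoryless case.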
Abstract
Most text algorithms build data structures on words, mainly trees, such as digital trees (tries) or binary search trees (BSTs). The mechanism that produces the symbols of the words (one symbol per unit of time) is called a source in information-theoretic contexts. The probabilistic behaviour of the trees built on words emitted by the same source depends on two factors: the algorithmic properties of the tree and the information-theoretic properties of the source. Very often, these two factors are treated in an oversimplified way: from the algorithmic point of view, the cost of a BST is measured only by the number of comparisons between words; from the information-theoretic point of view, only simple sources (memoryless sources or Markov chains) are studied. We wish to perform here a realistic analysis, and we choose to deal together with a general source and a…
