A Central Limit Theorem for the Optimal Alignments Score in Multiple Random Words

Ruoting Gong; Christian Houdré; Ümit Işlak

arXiv:1512.05699·math.PR·March 15, 2016·5 cites

TL;DR

This paper proves a central limit theorem for the optimal alignment score of multiple independent random words over a finite alphabet, under a variance lower-bound assumption and regularity conditions on the score function.

Contribution

It establishes a CLT for the optimal alignment score of multiple random words, extending previous results to a broader setting with fewer restrictions.

Findings

01

A CLT holds for the optimal alignment score under a variance lower-bound assumption

02

The centered and normalized score converges in distribution to a normal law

03

Applies to non-negative, bounded, permutation-invariant score functions satisfying a bounded differences condition
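
For orientation, a CLT of this kind is usually written as the display below. This is a sketch of the generic form only: the symbol $L_n$ for the optimal alignment score of the $m$ words of length $n$ is an assumed notation, and the paper's exact statement (mode of convergence, any rate) is not reproduced here.

$$
\frac{L_n - \mathbb{E}[L_n]}{\sqrt{\operatorname{Var}(L_n)}} \xrightarrow{\ d\ } \mathcal{N}(0,1), \qquad n \to \infty.
$$

The variance lower-bound assumption is what keeps the normalization by $\sqrt{\operatorname{Var}(L_n)}$ meaningful as $n$ grows.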

Abstract

Let $\mathbf{X}_{n}^{(1)}, \dots, \mathbf{X}_{n}^{(m)}$, where $\mathbf{X}_{n}^{(i)} = (X_{1}^{(i)}, \dots, X_{n}^{(i)})$, $i = 1, \dots, m$, be $m$ independent sequences of independent and identically distributed random variables taking their values in a finite alphabet $A$. Let the score function $S$, defined on $A^{m}$, be non-negative, bounded, permutation-invariant, and satisfy a bounded differences condition. Under a variance lower-bound assumption, a central limit theorem is proved for the optimal alignments score of the $m$ random words.
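
To make the object in the abstract concrete, here is a minimal Python sketch for the simplest case $m = 2$. It assumes the usual definition of an alignment (increasing index sequences in each word, with the score summed over aligned letter pairs and unaligned letters contributing nothing), which the abstract itself does not spell out. The names `optimal_alignment_score`, `match_score`, and `simulate_scores` are hypothetical, and the dynamic program is a standard technique, not the paper's method. With the match indicator as $S$, the score reduces to the length of the longest common subsequence.

```python
import random
from typing import Callable, List, Sequence


def optimal_alignment_score(
    x: Sequence[str],
    y: Sequence[str],
    score: Callable[[str, str], float],
) -> float:
    """Maximum total score over alignments of two words (assumed definition).

    Assumes score >= 0 and that skipped letters cost nothing, so the
    recursion is the classical longest-common-subsequence-style DP.
    """
    prev = [0.0] * (len(y) + 1)
    for i in range(1, len(x) + 1):
        curr = [0.0] * (len(y) + 1)
        for j in range(1, len(y) + 1):
            curr[j] = max(
                prev[j],          # leave x[i-1] unaligned
                curr[j - 1],      # leave y[j-1] unaligned
                prev[j - 1] + score(x[i - 1], y[j - 1]),  # align the pair
            )
        prev = curr
    return prev[len(y)]


def match_score(a: str, b: str) -> float:
    """Illustrative score: bounded, non-negative, symmetric in its arguments."""
    return 1.0 if a == b else 0.0


def simulate_scores(n: int, alphabet: str, trials: int, seed: int = 0) -> List[float]:
    """Draw pairs of i.i.d. uniform words of length n and record their scores."""
    rng = random.Random(seed)
    scores = []
    for _ in range(trials):
        x = [rng.choice(alphabet) for _ in range(n)]
        y = [rng.choice(alphabet) for _ in range(n)]
        scores.append(optimal_alignment_score(x, y, match_score))
    return scores
```

Calling `simulate_scores(200, "ACGT", 500)` and plotting a histogram of the standardized values gives an informal picture of the Gaussian behavior the theorem describes. The uniform letter distribution and the two-word restriction are simplifications chosen for this sketch.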

Taxonomy

Topics: Random Matrices and Applications · Bayesian Methods and Mixture Models · Authorship Attribution and Profiling