# Non-normal limiting distribution for optimal alignment scores of strings in binary alphabets

**Authors:** Jun Tao Duan, Heinrich Matzinger, Ionel Popescu

arXiv: 1703.05788 · 2017-08-02

## TL;DR

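This paper analyzes the distribution of optimal alignment scores for two independent i.i.d. binary strings of equal length under symmetric scoring functions. When gaps are sufficiently restricted and inserted into only one sequence, the score splits into an asymptotically normal part of order $O(\sqrt{n})$ and a smaller-order part that tends to a Tracy-Widom distribution, so the limiting distribution is non-normal. This has implications for testing whether two strings are related.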

## Contribution

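The paper decomposes the space of symmetric scoring functions into five components and shows that, when the number of gaps is sufficiently restricted and gaps are added to only one sequence, the optimal alignment score decomposes into an asymptotically normal part and a smaller-order part with a Tracy-Widom limit.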

## Key findings

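- Optimal alignment scores have a non-normal limiting distribution that involves a Tracy-Widom component.
- Under sufficiently restricted gaps added to only one sequence, the score decomposes into a normal part of order $O(\sqrt{n})$ and a smaller-order part that tends to a Tracy-Widom distribution.
- The normal part does not depend on the alignment, so it is irrelevant for relatedness testing and can be removed from the test statistic.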

## Abstract

We consider two independent binary i.i.d. random strings $X$ and $Y$ of equal length $n$ and the optimal alignments according to symmetric scoring functions only. We decompose the space of scoring functions into five components. Two of these components add a part to the optimal score which does not depend on the alignment and which is asymptotically normal. We show that when we restrict the number of gaps sufficiently and add them only into one sequence, then the alignment score can be decomposed into a part which is normal and has order $O(\sqrt{n})$ and a part which is of smaller order and tends to a Tracy-Widom distribution. Adding gaps only into one sequence is equivalent to aligning a string with its descendants in the case of mutations and deletions. For testing relatedness of strings, the normal part is irrelevant, since it does not depend on the alignment; hence it can be safely removed from the test statistic.
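To make "adding gaps only into one sequence" concrete, here is a minimal Python sketch of an optimal alignment score computed by dynamic programming when gaps may be inserted only into `y`. It is an illustration under stated assumptions, not the authors' construction: the function name `one_sided_alignment_score`, the per-gap penalty `gap`, and the choice to let `y` be shorter than `x` (as a descendant with deletions would be) are all hypothetical. This summary does not say how the paper treats the equal-length case or how it bounds the number of gaps.

```python
from typing import Callable


def one_sided_alignment_score(
    x: str,
    y: str,
    score: Callable[[str, str], float],
    gap: float,
) -> float:
    """Best score of aligning x with y when gaps are inserted only into y.

    Every letter of x is aligned either with a letter of y (in order) or with
    a gap; every letter of y must be used. Assumes len(y) <= len(x), so the
    number of gaps is exactly len(x) - len(y). `score` is assumed symmetric.
    """
    m, k = len(x), len(y)
    if k > m:
        raise ValueError("y must not be longer than x when gaps go only into y")

    neg_inf = float("-inf")
    # best[i][j] = best score aligning x[:i] with y[:j]; -inf marks infeasible cells.
    best = [[neg_inf] * (k + 1) for _ in range(m + 1)]
    best[0][0] = 0.0

    for i in range(1, m + 1):
        # x[:i] aligned against an empty prefix of y: i gaps.
        best[i][0] = best[i - 1][0] + gap
        for j in range(1, min(i, k) + 1):
            aligned = best[i - 1][j - 1] + score(x[i - 1], y[j - 1])
            gapped = best[i - 1][j] + gap  # x[i-1] faces a gap in y
            best[i][j] = max(aligned, gapped)

    return best[m][k]
```

For example, with the symmetric scoring function `lambda a, b: 1.0 if a == b else 0.0` and `gap=-1.0`, aligning `x = "0110"` with `y = "010"` forces exactly one gap, and the best alignment matches the remaining three letters.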

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1703.05788/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/1703.05788/full.md

## References

29 references — full list in the complete paper: https://tomesphere.com/paper/1703.05788/full.md

---
Source: https://tomesphere.com/paper/1703.05788