Counting, generating and sampling tree alignments
Cedric Chauve, Julien Courtiel, Yann Ponty (LIX, AMIB)

TL;DR
This paper gives a precise asymptotic enumeration of tree alignments and an efficient algorithm for sampling them. It counts alignments up to equivalence, which removes the ambiguity of the usual supertree representation, and it has applications in RNA secondary structure comparison.
Contribution
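It introduces a precise asymptotic enumeration of tree alignments considered up to equivalence, and a dynamic programming algorithm for sampling alignments between two given ordered trees. Working up to equivalence removes the ambiguity that is detrimental to probabilistic analysis.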
Findings
Asymptotic enumeration of tree alignments achieved.
Efficient sampling algorithm for alignments developed.
Facilitates probabilistic analysis of RNA secondary structures.
Abstract
Pairwise ordered tree alignments are combinatorial objects that appear in RNA secondary structure comparison. However, the usual representation of tree alignments as supertrees is ambiguous, i.e. two distinct supertrees may induce identical sets of matches between identical pairs of trees. This ambiguity is uninformative, and detrimental to any probabilistic analysis. In this work, we consider tree alignments up to equivalence. Our first result is a precise asymptotic enumeration of tree alignments, obtained from a context-free grammar by means of basic analytic combinatorics. Our second result focuses on alignments between two given ordered trees. By refining our grammar to align specific trees, we obtain a decomposition scheme for the space of alignments, and use it to design an efficient dynamic programming algorithm for sampling alignments under the Gibbs-Boltzmann…
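The general technique named in the abstract (an unambiguous decomposition, a partition function computed by dynamic programming, then stochastic backtracking) can be illustrated on a much simpler case. The sketch below is not the paper's tree algorithm: it samples alignments of two sequences, where an alignment is identified with its set of non-crossing matches so that each equivalence class is counted once. The function names, the energy model (`match_energy`, `gap_energy`) and the temperature parameter `kT` are all assumptions made for illustration.

```python
import math
import random

def partition_table(a, b, match_energy, gap_energy, kT=1.0):
    """Z[i][j] = sum of Boltzmann weights exp(-E/kT) over all sets of
    non-crossing matches between the prefixes a[:i] and b[:j].
    Hypothetical energy model: match_energy(x, y) per matched pair,
    gap_energy per unmatched position."""
    n, m = len(a), len(b)
    g = math.exp(-gap_energy / kT)
    Z = [[0.0] * (m + 1) for _ in range(n + 1)]
    for j in range(m + 1):
        Z[0][j] = g ** j  # nothing left in a: all of b[:j] is unmatched
    for i in range(1, n + 1):
        for j in range(m + 1):
            total = g * Z[i - 1][j]  # case 1: a[i-1] is unmatched
            for k in range(1, j + 1):
                # case 2: a[i-1] is matched to b[k-1]; b[k:j] must then be
                # unmatched, otherwise two matches would cross
                w = math.exp(-match_energy(a[i - 1], b[k - 1]) / kT)
                total += Z[i - 1][k - 1] * w * g ** (j - k)
            Z[i][j] = total
    return Z

def sample_alignment(a, b, match_energy, gap_energy, kT=1.0, rng=random):
    """Draw one set of matches with probability proportional to its
    Boltzmann weight, by stochastic backtracking through the table."""
    Z = partition_table(a, b, match_energy, gap_energy, kT)
    g = math.exp(-gap_energy / kT)
    matches = []
    i, j = len(a), len(b)
    while i > 0:
        # Same cases as in partition_table, each with its weight
        options = [None]
        weights = [g * Z[i - 1][j]]
        for k in range(1, j + 1):
            w = math.exp(-match_energy(a[i - 1], b[k - 1]) / kT)
            options.append(k)
            weights.append(Z[i - 1][k - 1] * w * g ** (j - k))
        k = rng.choices(options, weights=weights)[0]
        if k is not None:
            matches.append((i - 1, k - 1))  # 0-based positions in a and b
            j = k - 1
        i -= 1
    matches.reverse()
    return matches
```

Because every set of matches arises from exactly one sequence of case choices, the weights telescope and each set is drawn with probability equal to its weight divided by `Z[len(a)][len(b)]`. With all energies set to zero the table simply counts sets of matches, so `Z[n][m]` equals the binomial coefficient C(n+m, n).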
Taxonomy
Topics: RNA and protein synthesis mechanisms · Genomics and Phylogenetic Studies · Algorithms and Data Compression
