Computational reconstruction of mitochondria-encoded mammal ancestral proteins
Bohdan Kozarzewski

TL;DR
This paper introduces a novel alignment-free method for reconstructing ancestral mitochondrial proteins in mammals by analyzing pattern sets derived from protein sequences, revealing links to environmental history.
Contribution
It presents a new sequence reconstruction technique that does not require sequence alignment or phylogenetic trees, simplifying ancestral protein inference.
Findings
Reconstructed ancestral sequences for 13 mitochondrial protein families.
Identified patterns linking sequence similarity to historical environmental changes.
Demonstrated the method's effectiveness without traditional phylogenetic analysis.
Abstract
A method based on mapping a symbolic sequence into a set of patterns (the strings produced by parsing the sequence) is proposed as a tool for reconstructing ancestral sequences. The set union of patterns comprises all the patterns present in a family of related protein sequences from extant species. The patterns that occur most frequently among the protein sequences are selected and concatenated, and the resulting amino acid sequence is taken to be the ancestral protein of the family. Neither a sequence alignment nor a phylogenetic tree of the species family is required. The method is used to infer the ancestral amino acid sequences of thirteen mitochondria-encoded protein families of mammal species. The statistical distribution of the similarity between extant and ancestral sequences exhibits structure that appears to be related to past environmental changes.
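The pipeline described in the abstract (parse each sequence into patterns, pool the patterns across the family, keep the most frequent ones, concatenate them) can be illustrated with a minimal Python sketch. The abstract does not specify the parsing scheme, the frequency cutoff, or the concatenation order, so everything concrete here is an assumption made for illustration: the LZ78-style incremental parse, the `min_fraction` cutoff, the ordering of patterns by mean start position, and the function names `parse_patterns` and `reconstruct` are all hypothetical and should not be read as the author's actual procedure.

```python
from collections import Counter, defaultdict


def parse_patterns(seq):
    """Parse a sequence into patterns with an LZ78-style incremental parse.

    ASSUMPTION: the paper's parsing rule is not given in the abstract;
    this rule (each pattern is the shortest not-yet-seen chunk) is a stand-in.
    Returns a list of (start_index, pattern) pairs.
    """
    seen = set()
    phrases = []
    start = 0
    for end in range(1, len(seq) + 1):
        chunk = seq[start:end]
        if chunk not in seen:
            seen.add(chunk)
            phrases.append((start, chunk))
            start = end
    if start < len(seq):
        # Trailing chunk that duplicates an earlier pattern.
        phrases.append((start, seq[start:]))
    return phrases


def reconstruct(seqs, min_fraction=0.5):
    """Concatenate the patterns shared by at least min_fraction of the sequences.

    ASSUMPTION: both the cutoff and the ordering by mean start position
    are illustrative choices, not taken from the paper.
    """
    counts = Counter()             # number of sequences containing each pattern
    positions = defaultdict(list)  # first start position of each pattern, per sequence
    for seq in seqs:
        first_pos = {}
        for start, pat in parse_patterns(seq):
            first_pos.setdefault(pat, start)
        for pat, start in first_pos.items():
            counts[pat] += 1
            positions[pat].append(start)

    threshold = min_fraction * len(seqs)
    frequent = [p for p, c in counts.items() if c >= threshold]
    # Order by mean start position, breaking ties alphabetically.
    frequent.sort(key=lambda p: (sum(positions[p]) / len(positions[p]), p))
    return "".join(frequent)


if __name__ == "__main__":
    family = ["MKT", "MKT", "MKA"]  # toy data, not real protein sequences
    print(reconstruct(family))
```

On the toy family above, the patterns `M`, `K`, and `T` each occur in at least half of the sequences while `A` does not, so the sketch returns `MKT`. Real protein families would need a more careful rule for ordering and for resolving overlapping patterns than this sketch provides.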
Taxonomy
Topics: Genomics and Phylogenetic Studies · Advanced Proteomics Techniques and Applications · Mitochondrial Function and Pathology
