A successive sub-grouping method for multiple sequence alignments analysis
Stefano Marino

TL;DR
This paper introduces a deterministic, property-based sub-grouping method for multiple sequence alignment that clusters amino acids based on chemical or structural properties, offering an alternative to substitution matrix approaches.
Contribution
It presents a novel successive sub-grouping approach for amino acids in sequence alignment, using user-defined or default property schemes, implemented in Python and tested on biological cases.
Findings
The method provides a composite score based on amino acid similarity at each position.
Different property schemes can be used for amino acid clustering.
The approach has been tested and benchmarked on biological data.
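A minimal sketch of how a per-position composite score could be derived from nested sub-groups. The four-level scheme in `LEVELS`, the pairwise averaging, and the gap handling are all illustrative assumptions; the summary does not give the paper's actual groups or scoring formula.

```python
from itertools import combinations

# Hypothetical polarity-based scheme (NOT taken from the paper).
# Each level refines the previous one; the last level is residue identity.
LEVELS = [
    ["AVLIMFWCGP", "STNQYHKRDE"],
    ["AVLIM", "FW", "CGP", "STNQY", "HKRDE"],
    ["AV", "LIM", "FW", "C", "GP", "STY", "NQ", "HKR", "DE"],
    list("AVLIMFWCGPSTNQYHKRDE"),
]


def shared_depth(a, b):
    """Count the consecutive levels, from the top, at which a and b share a group."""
    depth = 0
    for level in LEVELS:
        if any(a in group and b in group for group in level):
            depth += 1
        else:
            break
    return depth


def column_score(column):
    """Average shared depth over all residue pairs, scaled to the range 0..1.

    Gaps ('-') are ignored, which is an assumption of this sketch.
    """
    residues = [r for r in column if r != "-"]
    pairs = list(combinations(residues, 2))
    if not pairs:
        return 0.0
    total = sum(shared_depth(a, b) for a, b in pairs)
    return total / (len(pairs) * len(LEVELS))


def alignment_scores(sequences):
    """Score every column of an alignment given as equal-length strings."""
    return [column_score(column) for column in zip(*sequences)]


if __name__ == "__main__":
    toy_alignment = ["AKD-", "VRES", "LKDT"]
    print(alignment_scores(toy_alignment))
```

In this sketch, identical residues score 1.0, residues that separate at the first level score 0.0, and pairs that stay together for some levels fall in between, so the score reflects how deep in the hierarchy a column remains homogeneous.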
Abstract
A novel approach to protein multiple sequence alignment is discussed. The method serves as a counterpart to substitution-matrix-based methods (such as BLOSUM- or PAM-based methods) and takes a more deterministic approach to the chemical/physical sub-grouping of amino acids. Amino acids (aa) are divided into sub-groups by successive derivations, which results in a clustering based on the property under consideration. The properties can be user-defined or chosen from default schemes, such as those used in the analysis described here. Starting from the initial set of the 20 naturally occurring amino acids, residues are successively divided on the basis of their polarity/hydrophobicity index, with increasing resolution up to four levels of subdivision. Other subdivision schemes are possible: in this thesis work, a scheme based on physical/structural properties was also employed (solvent exposure,…
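The successive subdivision described in the abstract can be illustrated with a short sketch. It assumes the Kyte-Doolittle hydropathy scale as the polarity/hydrophobicity index and splits each group at its midpoint at every level. Both choices are assumptions made for illustration; the abstract does not state which index or split rule the method uses, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values, used here as an assumed stand-in
# for the paper's polarity/hydrophobicity index.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def subdivide(groups):
    """Split every group with at least two members into a lower and an upper half."""
    result = []
    for group in groups:
        if len(group) < 2:
            result.append(group)
            continue
        mid = len(group) // 2
        result.extend([group[:mid], group[mid:]])
    return result


def successive_subgroups(index, levels=4):
    """Return one partition per level, each a refinement of the previous one."""
    # Sort by index value; ties are broken alphabetically so the result is deterministic.
    ordered = sorted(index, key=lambda aa: (index[aa], aa))
    groups = [ordered]
    scheme = []
    for _ in range(levels):
        groups = subdivide(groups)
        scheme.append(["".join(group) for group in groups])
    return scheme


if __name__ == "__main__":
    for depth, groups in enumerate(successive_subgroups(HYDROPATHY), start=1):
        print(depth, groups)
```

A midpoint split can separate residues that have identical index values (N, D, Q and E all share -3.5 on this scale), so a real scheme would more likely place its cut points by hand or at gaps in the index.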
Taxonomy
Topics: Protein Structure and Dynamics · Enzyme Structure and Function · Genomics and Phylogenetic Studies
