Multidimensional mutual information methods for the analysis of covariation in multiple sequence alignments
Greg W. Clark, Sharon H. Ackerman, Elisabeth R. Tillier, Domenico L. Gatti

TL;DR
This study evaluates multidimensional mutual information methods for analyzing covariation in multiple sequence alignments, comparing their effectiveness to other statistical models and exploring their ability to identify residue contacts and structural features.
Contribution
The paper introduces and tests multidimensional mutual information methods as tools for covariation analysis, demonstrating their comparable performance to existing models and clarifying their ability to distinguish direct from indirect residue correlations.
Findings
Multidimensional MI methods perform comparably to maximum entropy models.
Different covariation detection methods have less than 65% overlap in top pairs.
Partial correlation methods better identify close contacts by filtering out fitness-related correlations.
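The idea behind partial correlation can be illustrated with a minimal sketch. It is a generic illustration on synthetic continuous data, not the procedure used in the paper, and the function name `partial_correlations` and the toy variables `x`, `y`, `z` are assumptions introduced here. Two variables that both depend on a shared driver look strongly correlated, but their partial correlation, obtained from the inverse covariance (precision) matrix, is close to zero once the driver is accounted for.

```python
import numpy as np

def partial_correlations(data):
    """Partial correlation matrix computed from the precision matrix.

    data: array of shape (n_samples, n_variables).
    Entry (i, j) is the correlation between variables i and j after
    the linear effect of all other variables has been removed.
    """
    cov = np.cov(data, rowvar=False)
    prec = np.linalg.pinv(cov)           # inverse covariance (pseudo-inverse for safety)
    d = np.sqrt(np.diag(prec))
    pcorr = -prec / np.outer(d, d)       # rho_ij = -P_ij / sqrt(P_ii * P_jj)
    np.fill_diagonal(pcorr, 1.0)
    return pcorr

# Synthetic example: x and y are both noisy copies of a shared driver z.
rng = np.random.default_rng(0)
z = rng.normal(size=5000)
x = z + 0.5 * rng.normal(size=5000)
y = z + 0.5 * rng.normal(size=5000)
data = np.column_stack([x, y, z])

corr = np.corrcoef(data, rowvar=False)   # ordinary (marginal) correlations
pcorr = partial_correlations(data)       # correlations conditioned on the rest

print("corr(x, y)         =", round(corr[0, 1], 3))
print("partial corr(x, y) =", round(pcorr[0, 1], 3))
```

In this toy setting the shared driver plays the role of an indirect influence: the marginal correlation between `x` and `y` is large, while the partial correlation is small because neither variable affects the other directly.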
Abstract
Several methods are available for the detection of covarying positions from a multiple sequence alignment (MSA). If the MSA contains a large number of sequences, information about the proximities between residues derived from covariation maps can be sufficient to predict a protein fold. If the structure is already known, information on the covarying positions can be valuable to understand the protein mechanism. In this study we have sought to determine whether a multivariate extension of traditional mutual information (MI) can be an additional tool to study covariation. The performance of two multidimensional MI (mdMI) methods, designed to remove the effect of ternary/quaternary interdependencies, was tested with a set of 9 MSAs each containing <400 sequences, and was shown to be comparable to that of methods based on maximum entropy/pseudolikelihood statistical models of protein…
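As a point of reference for the traditional MI that the abstract extends, the following is a minimal sketch of pairwise mutual information between two alignment columns. It is plain two-column MI, not the mdMI methods evaluated in the paper. The function name `column_mi` and the four-sequence `toy_msa` are hypothetical, and gaps (if present) would simply be treated as ordinary symbols, which is an assumption of this sketch.

```python
from collections import Counter
from math import log2

def column_mi(msa, i, j):
    """Mutual information (in bits) between columns i and j of an MSA.

    msa: list of equal-length aligned sequences (strings).
    Frequencies are raw counts with no pseudocounts or sequence weighting.
    """
    n = len(msa)
    ci = Counter(seq[i] for seq in msa)              # residue counts in column i
    cj = Counter(seq[j] for seq in msa)              # residue counts in column j
    cij = Counter((seq[i], seq[j]) for seq in msa)   # joint residue-pair counts
    mi = 0.0
    for (a, b), nab in cij.items():
        p_ab = nab / n
        # p_ab / (p_a * p_b) written with counts to avoid extra divisions
        mi += p_ab * log2(p_ab * n * n / (ci[a] * cj[b]))
    return mi

# Toy alignment: columns 0 and 1 covary perfectly, column 2 varies independently.
toy_msa = ["AKD", "AKE", "GRD", "GRE"]

print(column_mi(toy_msa, 0, 1))  # 1.0 bit: knowing column 0 fixes column 1
print(column_mi(toy_msa, 0, 2))  # 0.0 bits: columns are independent
```

With real alignments of only a few hundred sequences, raw counts like these are noisy, which is one reason corrections and multivariate extensions are of interest.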
Taxonomy
Topics: Protein Structure and Dynamics · Genomics and Phylogenetic Studies · Enzyme Structure and Function
