The MIp Toolset: an efficient algorithm for calculating Mutual Information in protein alignments
Russell J. Dickson, Gregory B. Gloor

TL;DR
The paper introduces an efficient algorithm for calculating Mutual Information in protein alignments, enabling large-scale and real-time coevolution analysis within protein families.
Contribution
It presents a linked-list algorithm, indexed by a direct-access pointer array, that speeds up Mutual Information computation on sparse residue-pair counts. The algorithm is implemented in the MIpToolset software for protein coevolution studies.
Findings
Enables large-scale protein family analysis.
Supports real-time coevolution calculations.
Improves efficiency over previous methods.
Abstract
Background: Coevolution within a protein family is often predicted using statistics that measure the degree of covariation between positions in the protein sequence. Mutual Information is a measure of dependence between two random variables that has been used extensively to predict intra-protein coevolution.
Results: Here we provide an algorithm for the efficient calculation of Mutual Information within a protein family. The algorithm uses linked lists which are directly accessed by a pointer array. The linked list allows efficient storage of sparse count data caused by protein conservation. The direct access array of pointers prevents the linked list from being traversed each time it is modified.
Conclusions: This algorithm is implemented in the software MIpToolset, but could also be easily implemented in other Mutual Information based standalone software or web servers. The…
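The data structure described in the abstract can be illustrated with a minimal sketch. This is not the MIpToolset source; it is a Python approximation under stated assumptions: a 21-symbol alphabet (20 amino acids plus a gap treated as an ordinary symbol), logarithms in base 2, and hypothetical names (`Node`, `pair_counts`, `mutual_information`). It uses the standard definition MI(i,j) = sum over residue pairs (a,b) of p(a,b) * log[ p(a,b) / (p(a) * p(b)) ]. Each observed residue pair gets one linked-list node, and a flat array of node references stands in for the pointer array, so a count can be incremented without walking the list.

```python
import math

# Assumption: 20 amino acids plus '-' for gaps, handled as a 21st symbol.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"
INDEX = {aa: i for i, aa in enumerate(ALPHABET)}
K = len(ALPHABET)


class Node:
    """One linked-list node per residue pair actually observed."""
    __slots__ = ("a", "b", "count", "next")

    def __init__(self, a, b, nxt):
        self.a, self.b, self.count, self.next = a, b, 1, nxt


def pair_counts(col_i, col_j):
    """Count residue pairs for two alignment columns; return the list head."""
    if len(col_i) != len(col_j):
        raise ValueError("columns must have the same number of sequences")
    slots = [None] * (K * K)  # direct-access array of node references
    head = None
    for x, y in zip(col_i, col_j):
        a, b = INDEX[x], INDEX[y]
        node = slots[a * K + b]
        if node is None:
            # First time this pair is seen: prepend a node and record it.
            head = Node(a, b, head)
            slots[a * K + b] = head
        else:
            # Pair already seen: update in place, no list traversal needed.
            node.count += 1
    return head


def mutual_information(col_i, col_j):
    """Mutual Information (in bits) between two alignment columns."""
    n = len(col_i)
    head = pair_counts(col_i, col_j)
    # Marginal counts, accumulated from the non-zero pair counts only.
    ci, cj = [0] * K, [0] * K
    node = head
    while node is not None:
        ci[node.a] += node.count
        cj[node.b] += node.count
        node = node.next
    # Sum over observed pairs only; zero-count pairs contribute nothing.
    mi = 0.0
    node = head
    while node is not None:
        p_ab = node.count / n
        mi += p_ab * math.log2(node.count * n / (ci[node.a] * cj[node.b]))
        node = node.next
    return mi
```

For example, `mutual_information("AACC", "DDEE")` pairs every A with D and every C with E, while `mutual_information("AAAA", "DEDE")` involves a fully conserved column. In conserved protein families most of the 441 possible pair slots stay empty, so the final loops visit only the handful of nodes that exist rather than scanning a dense count table.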
Taxonomy
Topics: Genomics and Phylogenetic Studies · Advanced Proteomics Techniques and Applications · Bioinformatics and Genomic Networks
