Towards Structural Classification of Proteins based on Contact Map Overlap
Rumen Andonov (IRISA), Nicola Yanev, Noël Malod-Dognin (IRISA)

TL;DR
This paper introduces a new exact algorithm for contact map overlap maximization in protein structures, significantly improving performance on small and large datasets, enabling better classification of proteins based on 3D structure similarity.
Contribution
It presents a novel integer programming model and an exact branch-and-bound algorithm with bounds from Lagrangian relaxation, outperforming existing methods on benchmark datasets.
Findings
Successfully solved hard CMO instances for the first time.
Achieved better bounds and faster solutions on the Skolnick set.
Produced protein classification results in strong agreement with SCOP.
Abstract
A multitude of measures have been proposed to quantify the similarity between protein 3-D structures. Among these measures, contact map overlap (CMO) maximization has received sustained attention during the past decade because it offers a fine estimation of the natural homology relation between proteins. Despite this substantial involvement of the bioinformatics and computer science communities, the performance of known algorithms remains modest. Due to the complexity of the problem, they get stuck on relatively small instances and are not applicable to large-scale comparison. This paper offers a clear improvement over past methods in this respect. We present a new integer programming model for CMO and propose an exact branch-and-bound (B&B) algorithm with bounds computed by solving a Lagrangian relaxation. The efficiency of the approach is demonstrated on a popular small benchmark (Skolnick set, 40 domains). On this set…
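To make the CMO objective concrete, the following is a minimal sketch that assumes the standard formulation: each protein is an ordered list of residues with a set of contacts (residue pairs), and the goal is an order-preserving partial alignment that maps as many contacts of one protein onto contacts of the other as possible. The function names `overlap` and `max_overlap_bruteforce` and the toy contact maps are hypothetical and chosen only for illustration. The search is plain exhaustive enumeration, not the integer programming model or the Lagrangian-relaxation-based branch and bound described in the abstract, and it is only feasible for a handful of residues.

```python
from itertools import combinations


def overlap(contacts_a, contacts_b, alignment):
    """Count contacts of A that the alignment maps onto contacts of B."""
    mapping = dict(alignment)
    edges_b = {frozenset(edge) for edge in contacts_b}
    count = 0
    for i, k in contacts_a:
        # A contact is shared only if both of its residues are aligned
        # and their images are also in contact in B.
        if i in mapping and k in mapping:
            if frozenset((mapping[i], mapping[k])) in edges_b:
                count += 1
    return count


def max_overlap_bruteforce(n_a, contacts_a, n_b, contacts_b):
    """Exhaustively search all order-preserving partial alignments.

    Illustrative only: the number of alignments grows exponentially,
    which is why exact methods rely on strong bounds instead.
    """
    best_score, best_alignment = 0, ()
    for size in range(1, min(n_a, n_b) + 1):
        for rows in combinations(range(n_a), size):
            for cols in combinations(range(n_b), size):
                # Pairing two increasing index tuples position by position
                # yields an order-preserving (non-crossing) alignment.
                alignment = tuple(zip(rows, cols))
                score = overlap(contacts_a, contacts_b, alignment)
                if score > best_score:
                    best_score, best_alignment = score, alignment
    return best_score, best_alignment


# Toy contact maps (hypothetical data, 0-based residue indices).
contacts_a = [(0, 2), (1, 3), (0, 3)]          # protein A, 4 residues
contacts_b = [(0, 2), (1, 4), (0, 4), (2, 4)]  # protein B, 5 residues

score, alignment = max_overlap_bruteforce(4, contacts_a, 5, contacts_b)
print(score, alignment)
```

On this toy input the best alignment pairs residues 0, 1, 2, 3 of A with residues 0, 1, 2, 4 of B and recovers all three contacts of A, so the overlap is 3. Skipping residue 3 of B is what allows the third contact to match, which shows why gaps in the alignment matter.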
Taxonomy
Topics: Protein Structure and Dynamics · Advanced Proteomics Techniques and Applications · Ubiquitin and proteasome pathways
