Solving Maximum Clique Problem for Protein Structure Similarity
Noël Malod-Dognin (INRIA - Irisa), Rumen Andonov (INRIA - Irisa), Nicola Yanev

TL;DR
This paper introduces a new integer programming model and a specialized branch and bound algorithm to efficiently solve the maximum clique problem in protein structure similarity, significantly outperforming existing methods.
Contribution
It presents a novel integer programming formulation and a dedicated branch and bound algorithm for protein structure alignment, integrated into the VAST tool.
Findings
The branch and bound algorithm is up to 116 times faster than the Bron and Kerbosch algorithm (BK), the clique solver originally used in VAST
New model improves efficiency of protein structure similarity computation
Enhanced VAST tool for protein alignment
Abstract
A basic assumption of molecular biology is that proteins sharing close three-dimensional (3D) structures are likely to share a common function and in most cases derive from the same ancestor. Computing the similarity between two protein structures is therefore a crucial task and has been extensively investigated. Evaluating the similarity of two proteins can be done by finding an optimal one-to-one matching between their components, which is equivalent to identifying a maximum weighted clique in a specific "alignment graph". In this paper we present a new integer programming formulation for solving such clique problems. The model has been implemented using the ILOG CPLEX Callable Library. In addition, we designed a dedicated branch and bound algorithm for solving the maximum cardinality clique problem. Both approaches have been integrated in VAST (Vector Alignment Search Tool) - a…
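To make the maximum cardinality clique problem concrete, here is a minimal Python sketch of a generic branch and bound clique search. It is not the authors' dedicated algorithm and not VAST's solver; the function name `max_clique`, the adjacency-dictionary input, and the toy graph are all assumptions made for illustration. The bound used is the simplest possible one: a branch is abandoned when the current clique plus every remaining candidate could not beat the best clique found so far.

```python
from typing import Dict, List, Set


def max_clique(adj: Dict[int, Set[int]]) -> List[int]:
    """Return one maximum-cardinality clique of an undirected graph.

    `adj` maps each vertex to the set of its neighbours (hypothetical
    input format chosen for this sketch).
    """
    best: List[int] = []

    def expand(current: List[int], candidates: Set[int]) -> None:
        nonlocal best
        # Record the current clique if it improves on the incumbent.
        if len(current) > len(best):
            best = current[:]
        # Branch on candidates, most connected (within the candidate set) first.
        order = sorted(candidates,
                       key=lambda u: len(adj[u] & candidates),
                       reverse=True)
        for v in order:
            # Bound: even adding every remaining candidate cannot win.
            if len(current) + len(candidates) <= len(best):
                return
            # Only common neighbours of v can extend the clique further.
            expand(current + [v], candidates & adj[v])
            # v has been fully explored; exclude it from later branches.
            candidates = candidates - {v}

    expand([], set(adj))
    return sorted(best)


# Toy graph standing in for an alignment graph: vertices 0..3 are pairwise
# adjacent, while vertices 4 and 5 hang off the side.
toy_graph = {
    0: {1, 2, 3, 4},
    1: {0, 2, 3},
    2: {0, 1, 3},
    3: {0, 1, 2},
    4: {0, 5},
    5: {4},
}
print(max_clique(toy_graph))  # [0, 1, 2, 3]
```

In an alignment graph each vertex would stand for a candidate pairing of components from the two proteins, and an edge would join two pairings that are mutually compatible, so a clique corresponds to a consistent one-to-one matching. Tighter bounds than the candidate count used here are what typically make a dedicated solver fast.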
Taxonomy
Topics: Computational Drug Discovery Methods · Machine Learning in Bioinformatics · Protein Structure and Dynamics
