EnzyCLIP: A Cross-Attention Dual Encoder Framework with Contrastive Learning for Predicting Enzyme Kinetic Constants
Anas Aziz Khan, Md Shah Fahad, Priyanka, Ramesh Chandra, Guransh Singh

TL;DR
EnzyCLIP is a dual-encoder framework that uses contrastive learning and cross-attention to predict enzyme kinetic constants (Kcat and Km) from protein sequences and substrate molecular structures.
Contribution
This work introduces EnzyCLIP, which combines ESM-2 protein embeddings and ChemBERTa substrate embeddings in a CLIP-inspired architecture to predict Kcat and Km jointly.
Findings
Achieved R² scores of 0.593 for Kcat and 0.607 for Km prediction.
XGBoost further improved Km prediction to R² = 0.61.
Demonstrated effective modeling of enzyme-substrate interactions using multimodal contrastive learning.
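For readers unfamiliar with the metric, the coefficient of determination reported above can be computed as in the minimal sketch below. The sample values are hypothetical and are assumed to be log10-scaled Km values, a common convention for kinetic constants; they are not data from the paper, and the paper's exact evaluation protocol is not stated in this summary.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total variation around the mean
    return 1.0 - ss_res / ss_tot

# Hypothetical log10-scaled Km values, for illustration only (not from the paper).
y_true = [-1.0, 0.0, 1.0, 2.0]
y_pred = [-0.5, 0.0, 1.0, 1.5]
print(r2_score(y_true, y_pred))
```

A score of 1 means perfect prediction, and 0 means the model does no better than always predicting the mean, so values near 0.6 indicate that roughly 60% of the variance in the targets is explained.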
Abstract
Accurate prediction of enzyme kinetic parameters is crucial for drug discovery, metabolic engineering, and synthetic biology applications. Current computational approaches face limitations in capturing complex enzyme-substrate interactions and often focus on single parameters while neglecting the joint prediction of catalytic turnover numbers (Kcat) and Michaelis-Menten constants (Km). We present EnzyCLIP, a novel dual-encoder framework that leverages contrastive learning and cross-attention mechanisms to predict enzyme kinetic parameters from protein sequences and substrate molecular structures. Our approach integrates ESM-2 protein language model embeddings with ChemBERTa chemical representations through a CLIP-inspired architecture enhanced with bidirectional cross-attention for dynamic enzyme-substrate interaction modeling. EnzyCLIP combines InfoNCE contrastive loss with Huber…
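The idea of pairing an InfoNCE contrastive loss with a Huber regression loss can be sketched as follows. This is an illustrative NumPy sketch, not the authors' implementation: the symmetric CLIP-style form of InfoNCE, the temperature of 0.07, the Huber threshold `delta`, and the weighting factor `alpha` are all assumptions, since the abstract is truncated before it says how the two losses are combined. The embeddings stand in for already-projected ESM-2 and ChemBERTa outputs.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def info_nce(enzyme_emb, substrate_emb, temperature=0.07):
    """Symmetric InfoNCE; row i of each matrix is assumed to be a matched pair."""
    e = l2_normalize(np.asarray(enzyme_emb, dtype=float))
    s = l2_normalize(np.asarray(substrate_emb, dtype=float))
    logits = e @ s.T / temperature            # (batch, batch) similarity matrix
    idx = np.arange(logits.shape[0])          # matched pairs lie on the diagonal
    loss_e2s = -log_softmax(logits, axis=1)[idx, idx].mean()  # enzyme -> substrate
    loss_s2e = -log_softmax(logits, axis=0)[idx, idx].mean()  # substrate -> enzyme
    return 0.5 * (loss_e2s + loss_s2e)

def huber(pred, target, delta=1.0):
    """Huber loss: quadratic for small errors, linear for large ones."""
    err = np.abs(np.asarray(pred, dtype=float) - np.asarray(target, dtype=float))
    quadratic = 0.5 * err ** 2
    linear = delta * (err - 0.5 * delta)
    return np.where(err <= delta, quadratic, linear).mean()

def combined_loss(enzyme_emb, substrate_emb, pred, target, alpha=0.5):
    """Hypothetical weighting: alpha * contrastive + (1 - alpha) * regression."""
    contrastive = info_nce(enzyme_emb, substrate_emb)
    regression = huber(pred, target)
    return alpha * contrastive + (1.0 - alpha) * regression

# Toy batch of 4 pairs with 8-dimensional embeddings (random, for illustration only).
rng = np.random.default_rng(0)
enzymes = rng.normal(size=(4, 8))
substrates = enzymes + 0.1 * rng.normal(size=(4, 8))   # matched pairs lie close together
pred = np.array([0.2, 1.1, -0.4, 2.0])                 # hypothetical predicted targets
target = np.array([0.0, 1.0, 0.0, 0.5])                # hypothetical true targets
print(combined_loss(enzymes, substrates, pred, target))
```

The contrastive term pulls matched enzyme and substrate embeddings together while pushing mismatched pairs in the batch apart, and the Huber term keeps the regression robust to outlying kinetic measurements.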
Taxonomy
Topics: Computational Drug Discovery Methods · Machine Learning in Materials Science · Protein Structure and Dynamics
