Beyond Protein Language Models: An Agentic LLM Framework for Mechanistic Enzyme Design
Bruno Jacob, Khushbu Agarwal, Marcel Baer, Peter Rice, Simone Raugei

TL;DR
Genie-CAT is an integrated AI system that combines language reasoning, structural analysis, and physics-based calculations to accelerate and automate hypothesis generation in enzyme design, demonstrated on metalloproteins.
Contribution
This work introduces Genie-CAT, a novel agentic LLM framework that unifies reasoning, structural parsing, electrostatics, and machine learning for mechanistic enzyme design.
Findings
Successfully identified residue modifications affecting redox properties.
Reproduced expert hypotheses rapidly and autonomously.
Bridged symbolic reasoning with numerical simulation in protein design.
Abstract
We present Genie-CAT, a tool-augmented large-language-model (LLM) system designed to accelerate scientific hypothesis generation in protein design. Using metalloproteins (e.g., ferredoxins) as a case study, Genie-CAT integrates four capabilities (literature-grounded reasoning through retrieval-augmented generation (RAG), structural parsing of Protein Data Bank files, electrostatic potential calculations, and machine-learning prediction of redox properties) into a unified agentic workflow. By coupling natural-language reasoning with data-driven and physics-based computation, the system generates mechanistically interpretable, testable hypotheses linking sequence, structure, and function. In proof-of-concept demonstrations, Genie-CAT autonomously identifies residue-level modifications near [Fe-S] clusters that affect redox tuning, reproducing expert-derived hypotheses in a fraction…
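As a rough illustration of the structural-parsing step the abstract mentions, the sketch below lists protein residues that lie close to iron atoms in a Protein Data Bank file. It is not Genie-CAT's code: the function names `parse_atoms` and `residues_near_iron`, the 6.0 Å cutoff, and the choice to identify iron through the element column of `HETATM` records are all assumptions made for this example. It relies only on the standard fixed-column PDB layout and the Python standard library.

```python
import math


def parse_atoms(pdb_text):
    """Parse ATOM/HETATM records using the fixed-column PDB layout."""
    atoms = []
    for line in pdb_text.splitlines():
        record = line[0:6].strip()
        if record not in ("ATOM", "HETATM"):
            continue
        atoms.append({
            "record": record,
            "atom_name": line[12:16].strip(),
            "res_name": line[17:20].strip(),
            "chain": line[21:22].strip(),
            "res_seq": int(line[22:26]),
            "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
            # Element symbol sits in columns 77-78; empty if the line is short.
            "element": line[76:78].strip().upper(),
        })
    return atoms


def residues_near_iron(pdb_text, cutoff=6.0):
    """Return [((chain, res_seq, res_name), min_distance), ...] sorted by distance.

    Assumption: cluster iron atoms appear as HETATM records with element FE.
    The cutoff (in angstroms) is an illustrative default, not a value from the paper.
    """
    atoms = parse_atoms(pdb_text)
    irons = [a for a in atoms if a["record"] == "HETATM" and a["element"] == "FE"]
    hits = {}
    for atom in atoms:
        if atom["record"] != "ATOM":
            continue  # only protein atoms are candidate residues
        nearest = min(
            (math.dist(atom["xyz"], fe["xyz"]) for fe in irons),
            default=math.inf,
        )
        if nearest <= cutoff:
            key = (atom["chain"], atom["res_seq"], atom["res_name"])
            hits[key] = min(nearest, hits.get(key, math.inf))
    return sorted(hits.items(), key=lambda item: item[1])
```

Calling `residues_near_iron(open("structure.pdb").read())` on a ferredoxin structure (the file name here is a placeholder) would yield a distance-ranked list of residues around the cluster, the sort of shortlist that later electrostatics or redox-prediction steps could take as input.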
Taxonomy
Topics: Machine Learning in Materials Science · Metalloenzymes and iron-sulfur proteins · Metal-Catalyzed Oxygenation Mechanisms
