Speak to a Protein: An Interactive Multimodal Co-Scientist for Protein Analysis
Carles Navarro, Mariona Torrens, Philipp Thölke, Stefan Doerr, Gianni De Fabritiis

TL;DR
This paper presents 'Speak to a Protein,' an interactive AI system that enables real-time, multimodal protein analysis through dialogue, visualization, and code generation, significantly simplifying complex structural biology tasks.
Contribution
It introduces a novel interactive AI platform that integrates literature, structural data, visualization, and coding to facilitate protein analysis and hypothesis testing.
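As an illustration of the "coding" part of that integration, here is a minimal sketch of the kind of analysis code such a system might generate for a binding-pocket question. It is not taken from the paper: the `Atom` record, the `pocket_residues` helper, the 4.0 Å cutoff, and the toy coordinates are all assumptions made for this example.

```python
import math
from typing import NamedTuple


class Atom(NamedTuple):
    """Hypothetical minimal atom record (not the paper's data model)."""
    chain: str
    resname: str
    resid: int
    x: float
    y: float
    z: float


def pocket_residues(protein, ligand, cutoff=4.0):
    """Return sorted (chain, resid, resname) tuples for every residue that
    has at least one atom within `cutoff` angstroms of any ligand atom."""
    hits = set()
    for atom in protein:
        for lig in ligand:
            distance = math.dist((atom.x, atom.y, atom.z), (lig.x, lig.y, lig.z))
            if distance <= cutoff:
                hits.add((atom.chain, atom.resid, atom.resname))
                break  # one close contact is enough for this atom
    return sorted(hits)


# Toy coordinates, invented for illustration only.
protein = [
    Atom("A", "ASP", 86, 1.0, 0.0, 0.0),
    Atom("A", "LYS", 33, 3.0, 0.0, 0.0),
    Atom("A", "GLY", 11, 10.0, 0.0, 0.0),
]
ligand = [Atom("L", "LIG", 1, 0.0, 0.0, 0.0)]

print(pocket_residues(protein, ligand))
# [('A', 33, 'LYS'), ('A', 86, 'ASP')]
```

A plain distance cutoff is only one common heuristic for defining a pocket. A real workflow would read coordinates from a structure file and might use a spatial index instead of the nested loop shown here.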
Findings
Reduces the time needed to go from a question to supporting evidence.
Enables hypothesis generation through integrated multimodal tools.
Accessible via a free online platform.
Abstract
Building a working mental model of a protein typically requires weeks of reading, cross-referencing crystal and predicted structures, and inspecting ligand complexes, an effort that is slow, unevenly accessible, and often requires specialized computational skills. We introduce \emph{Speak to a Protein}, a new capability that turns protein analysis into an interactive, multimodal dialogue with an expert co-scientist. The AI system retrieves and synthesizes relevant literature, structures, and ligand data; grounds answers in a live 3D scene; and can highlight, annotate, manipulate and see the visualization. It also generates and runs code when needed, explaining results in both text and graphics. We demonstrate these capabilities on relevant proteins, posing questions about binding pockets, conformational changes, or structure-activity relationships to test ideas in real-time. \emph{Speak…
Peer Reviews
Decision: ICLR 2026 Conference Desk Rejected Submission
The proposed system looks well designed and well built. The overall UI and the motivation behind the system are, as far as I know, well supported by industry. The explanation is clear.
I don't see any particular weakness, other than that the paper is about a system rather than research and experiments. I think it is a great system, but it is hard to determine the value of presenting it as a conference paper. What would readers learn from it? It would be a different story if the product were open-sourced, but that doesn't seem to be the case.
- This paper takes a leap in enabling close integration between natural language, code execution, and 3D visualization. The ambition of this work is quite a leap over previous works like ProteinChat and ChatMol Copilot.
- The work provides a fully developed prototype that is accessible for people to test online.
- The appendix provides traces, which is commendable.
First, I should say that I don't have biochemistry or deep protein knowledge to be able to judge the case studies. In this case, my review and discussion here is more suited to the system design. I think this work should be better positioned as a systems design paper instead of a methods paper.
- Since I cannot judge the degree to which the case studies provide novel or difficult to find insight, I would've liked to see some form of quantitative measure of success of the system other than such
1. The proposed system provides an intuitive and easy-to-use interface, making protein analysis more accessible to a broad range of users.
2. The paper is clearly written and well organized, which facilitates understanding of the system design and experimental results.
The main weakness of this paper lies in its limited scientific contribution. The work is primarily an engineering effort, focusing on system design rather than addressing or resolving a concrete scientific problem. Moreover, based on the presented experiments, the proposed “co-scientist” agent functions mainly as a research assistant, facilitating existing workflows rather than contributing to genuine scientific discovery. It remains unclear whether the system is capable of generating new scient
The “Speak to a Protein” system demonstrates originality in protein analysis by integrating natural language processing, real-time 3D visualization, and automated code execution into a unique interactive framework. Compared to existing systems, its capabilities in grounding natural language queries in real-time 3D molecular scenes and automating code execution are particularly distinctive. The system exhibits high quality, effectively coordinating diverse tools through the Model Context Protocol
- The current paper reads more like a technical demonstration (demo paper). It primarily focuses on showcasing system capabilities, such as multimodal interaction and 3D visualization, while offering limited discussion of novel underlying algorithms or model architectures.
- Although “Speak to a Protein” integrates multimodal functionalities, its core concept, language-driven protein analysis, has already been explored in prior work, such as ProteinChat and Prot2Chat. The authors should clearly
Taxonomy
Topics: Machine Learning in Materials Science · Data Visualization and Analytics · Scientific Computing and Data Management
