ProtRLSearch: A Multi-Round Multimodal Protein Search Agent with Large Language Models Trained via Reinforcement Learning
Congying Liu, Taihao Li, Ming Huang, Xingyuan Wei, Peipei Liu, Yiqing Shen, Yanxu Mao, Tiehan Cui

TL;DR
ProtRLSearch introduces a multi-round, multimodal protein search agent trained with reinforcement learning, enabling better reasoning and decision-making in protein analysis tasks involving sequence and text inputs.
Contribution
The paper presents ProtRLSearch, a novel multimodal, multi-round search agent trained via reinforcement learning that incorporates protein sequences and text for improved protein reasoning.
Findings
Generated high-quality protein reports through multimodal search.
Constructed the ProtMCQs benchmark of 3,000 questions for evaluation.
Demonstrated improved reasoning on protein function and phenotype analysis.
Abstract
Protein analysis tasks arising in healthcare settings often require accurate reasoning under protein sequence constraints, involving tasks such as functional interpretation of disease-related variants, protein-level analysis for clinical research, and similar scenarios. To address such tasks, search agents are introduced to search protein-related information, providing support for disease-related variant analysis and protein function reasoning in protein-centric inference. However, such search agents are mostly limited to single-round, text-only modality search, which prevents the protein sequence modality from being incorporated as a multimodal input into the search decision-making process. Meanwhile, their reliance on reinforcement learning (RL) supervision that focuses solely on the final answer results in a lack of search process constraints, making deviations in keyword selection…
Taxonomy
Topics: Genomics and Rare Diseases · Biomedical Text Mining and Ontologies · Bioinformatics and Genomic Networks
