SeRTS: Self-Rewarding Tree Search for Biomedical Retrieval-Augmented Generation
Minda Hu, Licheng Zong, Hongru Wang, Jingyan Zhou, Jingjing Li, Yichen Gao, Kam-Fai Wong, Yu Li, Irwin King

TL;DR
SeRTS introduces a novel self-rewarding tree search method that enhances biomedical document retrieval in LLM-based RAG systems, significantly improving accuracy and efficiency for medical knowledge queries.
Contribution
The paper presents SeRTS, a plug-and-play LLM-based retrieval method that combines Monte Carlo Tree Search (MCTS) with a self-rewarding paradigm. It improves zero-shot retrieval for biomedical QA, and the search trajectories it collects are used as feedback to fine-tune LLMs with PPO.
Findings
SeRTS outperforms baseline retrieval methods on the BioASQ-QA dataset.
SeRTS improves LLMs' ability to retrieve relevant biomedical documents.
SeRTS generates higher-quality feedback for PPO training.
Abstract
Large Language Models (LLMs) have shown great potential in the biomedical domain with the advancement of retrieval-augmented generation (RAG). However, existing retrieval-augmented approaches face challenges in addressing diverse queries and documents, particularly for medical knowledge queries, resulting in sub-optimal performance. To address these limitations, we propose a novel plug-and-play LLM-based retrieval method called Self-Rewarding Tree Search (SeRTS) based on Monte Carlo Tree Search (MCTS) and a self-rewarding paradigm. By combining the reasoning capabilities of LLMs with the effectiveness of tree search, SeRTS boosts the zero-shot performance of retrieving high-quality and informative results for RAG. We further enhance retrieval performance by fine-tuning LLMs with Proximal Policy Optimization (PPO) objectives using the trajectories collected by SeRTS as feedback.…
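The combination of tree search and self-rewarding described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a generic MCTS loop with the standard UCT selection rule, and `toy_expand`, `toy_reward`, the exploration constant `c`, and the way the best query is chosen are all assumptions made for illustration. In a real system, the expansion step would be an LLM proposing refined queries and the reward step would be an LLM scoring the retrieved documents; PPO fine-tuning is not covered here.

```python
import math


class Node:
    """One candidate retrieval query in the search tree."""

    def __init__(self, query, parent=None):
        self.query = query
        self.parent = parent
        self.children = []
        self.visits = 0
        self.total_reward = 0.0


def uct(node, c=1.4):
    """Standard UCT score; unvisited nodes are tried first."""
    if node.visits == 0:
        return math.inf
    exploit = node.total_reward / node.visits
    explore = c * math.sqrt(math.log(node.parent.visits) / node.visits)
    return exploit + explore


def tree_search(root_query, expand, self_reward, iterations=20):
    """Run a generic MCTS loop; returns (best query string, root node)."""
    root = Node(root_query)
    for _ in range(iterations):
        # Selection: follow the highest-UCT child down to a leaf.
        node = root
        while node.children:
            node = max(node.children, key=uct)
        # Expansion: grow the root, or any leaf that was already evaluated.
        if node.parent is None or node.visits > 0:
            node.children = [Node(q, parent=node) for q in expand(node.query)]
            if node.children:
                node = node.children[0]
        # Evaluation: the model scores its own candidate (stubbed here).
        reward = self_reward(node.query)
        # Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            node.total_reward += reward
            node = node.parent
    # Pick the visited node with the highest mean reward (an assumption).
    stack, visited = [root], []
    while stack:
        n = stack.pop()
        if n.visits > 0:
            visited.append(n)
        stack.extend(n.children)
    best = max(visited, key=lambda n: n.total_reward / n.visits)
    return best.query, root


def toy_expand(query):
    # Hypothetical stand-in for an LLM proposing refined queries.
    if len(query.split()) >= 4:
        return []
    return [query + " gene", query + " therapy"]


def toy_reward(query):
    # Hypothetical stand-in for LLM self-evaluation, scaled to [0, 1].
    return query.split().count("therapy") / 3


if __name__ == "__main__":
    best_query, root = tree_search("cancer", toy_expand, toy_reward, iterations=30)
    print(best_query, root.visits)
```

The visited paths and their rewards are the kind of trajectory data the abstract says is reused as feedback for PPO fine-tuning; how the authors format and weight those trajectories is not specified in this summary.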
Taxonomy
Topics: Intelligent Tutoring Systems and Adaptive Learning · Artificial Intelligence in Healthcare
Methods: WordPiece · Linear Warmup With Linear Decay · Cosine Annealing · BART · BERT · Residual Connection · Softmax · Layer Normalization
