An Open and Reproducible Deep Research Agent for Long-Form Question Answering
Ikuya Yamada, Wataru Ikeda, Ko Yoshida, Mengyu Ye, Hinata Sugimoto, Masatoshi Suzuki, Hisanori Ozaki, Jun Suzuki

TL;DR
This paper introduces an open deep research system for long-form question answering that combines an open-source LLM with an open web search API and preference tuning, improving answer clarity, insightfulness, and factuality in open-domain settings.
Contribution
The paper presents an open deep research system that integrates an open-source LLM, an open web search API, and preference tuning based on LLM-as-a-judge feedback for long-form question answering.
Findings
The proposed method consistently improves answer quality across clarity, insightfulness, and factuality.
The system was selected as a winning system in the text-to-text track of the MMU-RAG competition at NeurIPS 2025.
The results demonstrate the effectiveness of preference tuning based on LLM-as-a-judge feedback.
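To make the idea of preference tuning from LLM-as-a-judge feedback more concrete, the following is a minimal sketch of how per-aspect judge scores could be turned into preference pairs. It is not the authors' implementation. The averaging rule, the `min_margin` threshold, the function names, and the `prompt`/`chosen`/`rejected` record layout are all assumptions made for illustration; only the three aspect names come from the paper.

```python
from itertools import combinations
from statistics import mean

# Aspect names taken from the paper; everything else here is hypothetical.
ASPECTS = ("clarity", "insightfulness", "factuality")


def aggregate(scores):
    """Average the per-aspect judge scores (an assumed aggregation rule)."""
    return mean(scores[aspect] for aspect in ASPECTS)


def build_preference_pairs(question, candidates, min_margin=0.5):
    """Turn judged candidate answers into preference pairs.

    candidates: list of (answer_text, {aspect: score}) tuples, where the
    scores are assumed to come from an LLM judge.
    Pairs whose aggregate scores differ by less than min_margin are skipped,
    since the preference between them is ambiguous.
    """
    pairs = []
    for (ans_a, sc_a), (ans_b, sc_b) in combinations(candidates, 2):
        gap = aggregate(sc_a) - aggregate(sc_b)
        if abs(gap) < min_margin:
            continue
        chosen, rejected = (ans_a, ans_b) if gap > 0 else (ans_b, ans_a)
        pairs.append({"prompt": question, "chosen": chosen, "rejected": rejected})
    return pairs
```

Records of this shape could then be passed to a preference-optimization trainer; which trainer and objective the authors use is not stated in this summary.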
Abstract
We present an open deep research system for long-form question answering, selected as a winning system in the text-to-text track of the MMU-RAG competition at NeurIPS 2025. The system combines an open-source large language model (LLM) with an open web search API to perform iterative retrieval, reasoning, and synthesis in real-world open-domain settings. To enhance reasoning quality, we apply preference tuning based on LLM-as-a-judge feedback that evaluates multiple aspects, including clarity, insightfulness, and factuality. Our experimental results show that the proposed method consistently improves answer quality across all three aspects. Our source code is publicly available at https://github.com/efficient-deep-research/efficient-deep-research.
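The iterative retrieval, reasoning, and synthesis described in the abstract can be pictured as a simple control loop. The sketch below is an illustration under stated assumptions, not the released system: `propose_query`, `search`, and `synthesize` are hypothetical callables standing in for LLM calls and a web search API, and the `max_steps` budget is likewise assumed.

```python
def deep_research(question, propose_query, search, synthesize, max_steps=3):
    """Run a retrieve-reason-synthesize loop (hypothetical interface).

    propose_query(question, notes) -> next search query, or None to stop
    search(query)                  -> list of retrieved text snippets
    synthesize(question, notes)    -> final long-form answer
    """
    notes = []
    for _ in range(max_steps):
        # Reasoning step: decide what to look up next given the evidence so far.
        query = propose_query(question, notes)
        if query is None:
            break  # the model judges the collected evidence sufficient
        # Retrieval step: gather new evidence for the proposed query.
        notes.extend(search(query))
    # Synthesis step: write the answer from all collected evidence.
    return synthesize(question, notes)
```

In a real setting the three callables would wrap prompts to the language model and calls to the search backend; here they are left abstract so the loop itself stays easy to follow.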
Taxonomy
Topics: Topic Modeling · Expert finding and Q&A systems · Information Retrieval and Search Behavior
