Augmenting Researchy Questions with Sub-question Judgments

Jia-Huei Ju; Eugene Yang; Trevor Adriaanse; Andrew Yates

arXiv:2510.21733·cs.IR·October 28, 2025

TL;DR

This paper augments the Researchy Questions dataset with LLM-judged labels for each sub-question, intended as a training resource for retrieval models that must serve complex information needs.

Contribution

It adds LLM-judged labels for each sub-question in Researchy Questions, filling a gap in the original dataset, which has no associations between sub-questions and relevant documents.

Findings

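01 Sub-question labels are produced by an LLM judge, a Llama3.3 70B model.

02 The augmented dataset is intended as a resource for training retrieval models that better support complex information needs.

03 The original dataset records which documents were clicked for each query but does not link sub-questions to relevant documents; the new labels address this.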

Abstract

The Researchy Questions dataset provides about 100k question queries with complex information needs that require retrieving information about several aspects of a topic. Each query in Researchy Questions is associated with sub-questions that were produced by prompting GPT-4. While Researchy Questions contains labels indicating which documents were clicked after issuing the query, the dataset contains no associations between sub-questions and relevant documents. In this work, we augment the Researchy Questions dataset with LLM-judged labels for each sub-question using a Llama3.3 70B model. We intend these sub-question labels to serve as a resource for training retrieval models that better support complex information needs.
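To make the shape of such labels concrete, the following is a minimal sketch of how per-sub-question judgments might be grouped into positives and negatives for retriever training. Everything in it is an assumption made for illustration: the record fields (`query_id`, `sub_question`, `doc_id`, `label`), the graded integer label scale, and the `min_label` threshold are hypothetical and are not taken from the released data or the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubQuestionJudgment:
    """One LLM-judged label. Field names are hypothetical, not the dataset's schema."""
    query_id: str
    sub_question: str
    doc_id: str
    label: int  # assumed graded scale where 0 means "not relevant"


def group_by_sub_question(judgments, min_label=1):
    """Split judged documents into positives and negatives per (query, sub-question).

    A document counts as a positive when its label is at least `min_label`
    (an assumed threshold); otherwise it is treated as a negative.
    """
    grouped = {}
    for j in judgments:
        key = (j.query_id, j.sub_question)
        bucket = grouped.setdefault(key, {"positives": [], "negatives": []})
        side = "positives" if j.label >= min_label else "negatives"
        bucket[side].append(j.doc_id)
    return grouped


# Toy records, invented for this example only.
toy = [
    SubQuestionJudgment("q1", "What causes X?", "d1", 2),
    SubQuestionJudgment("q1", "What causes X?", "d2", 0),
    SubQuestionJudgment("q1", "How is X treated?", "d2", 1),
]
pairs = group_by_sub_question(toy)
```

Note that in the toy data the same document (`d2`) is a negative for one sub-question and a positive for another, which is the kind of distinction that query-level click labels alone cannot express.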
