GIANTS: Generative Insight Anticipation from Scientific Literature

Joy He-Yueya; Anikait Singh; Ge Gao; Michael Y. Li; Sherry Yang; Chelsea Finn; Emma Brunskill; Noah D. Goodman

arXiv:2604.09793 · cs.CL · April 14, 2026

TL;DR

This paper introduces insight anticipation, a task where language models predict the core insights of future scientific papers from their references, supported by a new benchmark and a specialized model that outperforms baselines.

Contribution

The authors present GiantsBench, a large benchmark for insight prediction, and GIANTS-4B, a reinforcement learning-trained model that improves insight quality and generalizes across domains.

Findings

01

GIANTS-4B outperforms proprietary baselines with a 34% relative improvement in similarity scores.

02

Insights generated by GIANTS-4B are more conceptually clear according to human evaluation.

03

SciJudge-30B, a third-party model, predicts that insights from GIANTS-4B are more likely to lead to highly cited papers.

Abstract

Scientific breakthroughs often emerge from synthesizing prior ideas into novel contributions. While language models (LMs) show promise in scientific discovery, their ability to perform this targeted, literature-grounded synthesis remains underexplored. We introduce insight anticipation, a generation task in which a model predicts a downstream paper's core insight from its foundational parent papers. To evaluate this capability, we develop GiantsBench, a benchmark of 17k examples across eight scientific domains, where each example consists of a set of parent papers paired with the core insight of a downstream paper. We evaluate models using an LM judge that scores similarity between generated and ground-truth insights, and show that these similarity scores correlate with expert human ratings. Finally, we present GIANTS-4B, an LM trained via reinforcement learning (RL) to optimize insight…
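The evaluation setup described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the `InsightExample` fields, the `mean_similarity` and `relative_improvement` helpers, and the `token_overlap_judge` function are all hypothetical names introduced here. In particular, the paper uses an LM judge to score similarity; the token-overlap (Jaccard) scorer below is only a stand-in so that the sketch runs without a model.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class InsightExample:
    """One benchmark item (hypothetical schema): parent papers plus the
    core insight of the downstream paper that builds on them."""
    domain: str
    parent_papers: Tuple[str, ...]
    target_insight: str


# A generator maps parent-paper texts to a predicted insight.
Generator = Callable[[Sequence[str]], str]
# A judge maps (generated, ground-truth) insights to a similarity score.
Judge = Callable[[str, str], float]


def token_overlap_judge(generated: str, reference: str) -> float:
    """Stand-in judge: Jaccard overlap of lowercased tokens.
    Assumption: the real judge is an LM, not a lexical metric."""
    a, b = set(generated.lower().split()), set(reference.lower().split())
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def mean_similarity(
    examples: Sequence[InsightExample], generate: Generator, judge: Judge
) -> float:
    """Average judge score of generated insights against ground truth."""
    return mean(
        judge(generate(ex.parent_papers), ex.target_insight) for ex in examples
    )


def relative_improvement(new_score: float, baseline_score: float) -> float:
    """Relative gain of new_score over baseline_score, as a fraction."""
    if baseline_score == 0:
        raise ValueError("baseline_score must be non-zero")
    return (new_score - baseline_score) / baseline_score
```

A relative improvement is a ratio of scores, not a difference in points. With made-up scores of 5.0 for a baseline and 6.7 for a new model, `relative_improvement(6.7, 5.0)` gives about 0.34, that is, a 34% relative gain. These numbers are illustrative only and are not taken from the paper.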

Code & Models

Repositories

joyheyueya/giants
github
