Structure Liberates: How Constrained Sensemaking Produces More Novel Research Output

James Mooney; Zae Myung Kim; Young-Jun Lee; Dongyeop Kang

arXiv:2605.00557 · cs.CL · May 4, 2026

TL;DR

This paper introduces SCISENSE, a framework and dataset for structured scientific ideation, showing that targeted, constrained sensemaking improves research quality and creativity in LLM-generated trajectories.

Contribution

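The paper presents SCISENSE, a sensemaking-grounded framework for scientific ideation; SCISENSE-Traj, a 100K-scale dataset of citation-conditioned research trajectories; and SCISENSE-LM, a family of sensemaking LLMs distilled from those trajectories. It demonstrates that targeted supervision enhances exploration and downstream research outputs.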

Findings

01. Target-trained models outperform Infer-trained models in trajectory quality.

02. Targeted ideation reduces cognitive load and enhances research artifact quality.

03. Sensemaking constraints lead to more novel and diverse research directions.

Abstract

Scientific discovery is an extended process of ideation (surveying prior work, forming hypotheses, and refining reasoning), yet existing approaches treat this phase as a brief preamble despite its central role in research. We introduce SCISENSE, a sensemaking-grounded framework that operationalizes ideation as a structured sequence of eight cognitive stages (Pirolli & Card, 2005). We construct SCISENSE-Traj, a 100K-scale dataset of citation-conditioned research trajectories in two modes: Target, where an LLM reconstructs the ideation path leading to a known paper from its cited works, and Infer, where the LLM proposes novel directions from the same citations. We distill these into SCISENSE-LM, a family of sensemaking LLMs spanning 3B to 70B parameters. Contrary to the assumption that looser supervision promotes greater exploration, Target-trained models achieve a 2.0% improvement in…
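
As a rough illustration of the two trajectory modes described in the abstract, the following Python sketch models a single citation-conditioned trajectory record. The class and field names (`Mode`, `Trajectory`, `cited_works`, `stages`, `target_paper`) are assumptions made for illustration and are not the dataset's actual schema. The eight stages are kept as unnamed text slots because the abstract does not list them.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional

# The abstract specifies eight cognitive stages; their names are not given there,
# so this sketch only enforces the count.
NUM_STAGES = 8


class Mode(Enum):
    TARGET = "target"  # reconstruct the ideation path leading to a known paper
    INFER = "infer"    # propose a novel direction from the same citations


@dataclass
class Trajectory:
    """Hypothetical record for one citation-conditioned research trajectory."""
    mode: Mode
    cited_works: List[str]              # identifiers of the conditioning citations
    stages: List[str]                   # one text entry per stage, in order
    target_paper: Optional[str] = None  # assumed field: set only in Target mode

    def __post_init__(self) -> None:
        if len(self.stages) != NUM_STAGES:
            raise ValueError(f"expected {NUM_STAGES} stages, got {len(self.stages)}")
        if self.mode is Mode.TARGET and self.target_paper is None:
            raise ValueError("Target mode requires a known target paper")
        if self.mode is Mode.INFER and self.target_paper is not None:
            raise ValueError("Infer mode has no target paper")
```

Under these assumptions, the same `cited_works` list can back both a Target record (with `target_paper` set) and an Infer record (without it), which mirrors the paired construction the abstract describes.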
