LLMs learn scientific taste from institutional traces across the social sciences
Ziqin Gong, Ning Li, Huaikang Zhou

TL;DR
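This paper shows that institutional publication records (what fields published, where, and at which tier) can serve as a training signal for AI models that evaluate research ideas across eight social science disciplines. The fine-tuned models outperform frontier models, and their confidence tracks their accuracy.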
Contribution
It introduces a method to use institutional publication traces as a training signal for AI evaluators, improving judgment in low-verifiability domains.
Findings
Fine-tuned models cleared the 25 percent chance baseline on four-tier benchmarks and exceeded frontier-model performance by wide margins.
Models showed calibrated confidence that correlated with correctness.
High-confidence predictions achieved very high accuracy in triage tasks.
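The calibration and triage findings above can be illustrated with a minimal sketch. It assumes each prediction carries a predicted tier, a true tier, and a confidence score in [0, 1]. The names `Prediction`, `accuracy`, and `triage`, the 0.8 threshold, and the toy data are all hypothetical and are not taken from the paper. The sketch only shows how accuracy on a high-confidence subset can be compared with overall accuracy and with the 25 percent chance baseline of a four-tier task.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Four tiers means uniform guessing is right 25 percent of the time.
CHANCE_BASELINE = 1 / 4


@dataclass
class Prediction:
    predicted_tier: int  # hypothetical encoding: tiers numbered 1 to 4
    true_tier: int
    confidence: float    # assumed to lie in [0, 1]


def accuracy(preds: List[Prediction]) -> Optional[float]:
    """Fraction of predictions whose tier matches the true tier."""
    if not preds:
        return None
    return sum(p.predicted_tier == p.true_tier for p in preds) / len(preds)


def triage(preds: List[Prediction], threshold: float) -> Tuple[float, Optional[float]]:
    """Keep predictions at or above the threshold.

    Returns (coverage, accuracy on the kept subset).
    """
    kept = [p for p in preds if p.confidence >= threshold]
    coverage = len(kept) / len(preds) if preds else 0.0
    return coverage, accuracy(kept)


# Invented toy data, for illustration only (not results from the paper).
toy = [
    Prediction(1, 1, 0.95),
    Prediction(2, 2, 0.90),
    Prediction(3, 4, 0.40),
    Prediction(4, 4, 0.85),
    Prediction(2, 3, 0.35),
    Prediction(1, 2, 0.55),
]

overall = accuracy(toy)
coverage, high_conf_acc = triage(toy, threshold=0.8)
print(f"overall={overall:.2f} chance={CHANCE_BASELINE:.2f}")
print(f"coverage={coverage:.2f} high_confidence_accuracy={high_conf_acc:.2f}")
```

If confidence is informative, accuracy on the retained subset should rise as the threshold rises, while coverage falls. That trade-off is what a triage workflow would tune.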
Abstract
Reinforcement-learned reasoning has powered recent AI leaps on verifiable tasks, including mathematics, code, and structure prediction. The harder bottleneck is evaluative judgment in low-verifiability domains, where no oracle anchors reward and the core question is which untested ideas deserve attention. We test whether institutional traces, the record of what fields published, where, and at which tier, can serve as a training signal for AI evaluators. Across eight social science disciplines (psychology, economics, communication, sociology, political science, management, business and finance, public administration), we built held-out four-tier research-pitch benchmarks and supervised-fine-tuned (SFT) LLMs on field-specific publication outcomes. The fine-tuned models cleared the 25 percent chance baseline and exceeded frontier-model performance by wide margins, with best single-model…
