Accelerating Social Science Research via Agentic Hypothesization and Experimentation
Jishu Sen Gupta, Harini SI, Somesh Kumar Singh, Syed Mohamad Tawseeq, Yaman Kumar Singla, David Doermann, Rajiv Ratn Shah, Balaji Krishnamurthy

TL;DR
This paper introduces EXPERIGEN, an end-to-end agentic framework that accelerates social science discovery by generating hypotheses and validating them empirically, with reported gains in the statistical significance, novelty, and real-world validation of the hypotheses it produces.
Contribution
The paper presents EXPERIGEN, a novel framework built on a Bayesian-optimization-inspired search that automates hypothesis generation and testing, enabling faster and more effective social science research.
Findings
EXPERIGEN discovers 2-4x more statistically significant hypotheses than prior approaches
Its hypotheses are 7-17% more predictive than those from prior methods
88% of hypotheses are rated as moderately or strongly novel
Abstract
Data-driven social science research is inherently slow, relying on iterative cycles of observation, hypothesis generation, and experimental validation. While recent data-driven methods promise to accelerate parts of this process, they largely fail to support end-to-end scientific discovery. To address this gap, we introduce EXPERIGEN, an agentic framework that operationalizes end-to-end discovery through a Bayesian-optimization-inspired two-phase search, in which a Generator proposes candidate hypotheses and an Experimenter evaluates them empirically. Across multiple domains, EXPERIGEN consistently discovers 2-4x more statistically significant hypotheses that are 7-17 percent more predictive than prior approaches, and naturally extends to complex data regimes including multimodal and relational datasets. Beyond statistical performance, hypotheses must be novel, empirically grounded, and…
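The propose-then-test loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and does not reproduce the Bayesian-optimization-inspired two-phase search. It assumes a toy setting in which each hypothesis is "binary feature X is associated with the outcome", and it swaps in a plain permutation test as the Experimenter. The names generator, experimenter, make_toy_data, and run_search, as well as the features has_question and has_emoji, are hypothetical.

```python
import random


def _mean(values):
    return sum(values) / len(values)


def permutation_p_value(treated, control, rng, n_permutations=2000):
    """Two-sided permutation test for a difference in group means."""
    observed = abs(_mean(treated) - _mean(control))
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(_mean(pooled[:n_treated]) - _mean(pooled[n_treated:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


def generator(feature_names, already_tested):
    """Hypothetical Generator: propose every feature not yet tested.

    Assumption: a real system would propose hypotheses with a language
    model; here a hypothesis is simply a feature name.
    """
    return [name for name in feature_names if name not in already_tested]


def experimenter(rows, feature, outcome, rng):
    """Hypothetical Experimenter: test one hypothesis on the data."""
    treated = [row[outcome] for row in rows if row[feature] == 1]
    control = [row[outcome] for row in rows if row[feature] == 0]
    return permutation_p_value(treated, control, rng)


def make_toy_data(rng, n=200):
    """Synthetic data (an assumption): only has_question drives engagement."""
    rows = []
    for _ in range(n):
        has_question = rng.randint(0, 1)
        has_emoji = rng.randint(0, 1)
        engagement = 2.0 * has_question + rng.gauss(0, 1)
        rows.append({
            "has_question": has_question,
            "has_emoji": has_emoji,
            "engagement": engagement,
        })
    return rows


def run_search(rows, feature_names, outcome, seed=0, alpha=0.05):
    """Single pass of propose-then-test; returns p-values and supported features."""
    rng = random.Random(seed)
    results = {}
    for feature in generator(feature_names, results):
        results[feature] = experimenter(rows, feature, outcome, rng)
    supported = [name for name, p in results.items() if p < alpha]
    return results, supported


if __name__ == "__main__":
    data = make_toy_data(random.Random(0))
    p_values, supported = run_search(
        data, ["has_question", "has_emoji"], "engagement"
    )
    for name, p in p_values.items():
        print(f"{name}: p = {p:.4f}")
    print("supported:", supported)
```

In this sketch the Generator and Experimenter are separate functions so that either can be replaced independently, for example by a model-driven proposer or a different statistical test. When many hypotheses are tested in one run, a multiple-comparison correction on the p-values would normally be applied before calling any of them supported.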
Taxonomy
Topics: Computational and Text Analysis Methods · Data Analysis with R · Opinion Dynamics and Social Influence
