Mining Causality: AI-Assisted Search for Instrumental Variables
Sukjin Han

TL;DR
This paper proposes using large language models (LLMs) to automate the search for instrumental variables (IVs) and related inputs to causal designs, such as control variables and running variables. LLMs accelerate the search and can cover a far larger space of candidates than a human researcher working alone.
Contribution
It demonstrates how to construct multi-step, role-playing prompts that guide LLMs to propose potentially valid IVs and control variables, supporting the search stage of causal inference.
Findings
LLMs can effectively identify potential IVs in economic case studies.
Multi-step role-playing prompts improve the relevance of LLM-generated variables.
The approach extends to control variables and running variables in various causal inference designs.
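The multi-step, role-playing idea in the findings above can be sketched roughly as follows. This is a minimal illustration, not the paper's actual prompts or code: the function names, the prompt wording, and the `ask` callable (a stand-in for whatever LLM client is used) are all assumptions. Step 1 asks the model to play the decision-maker and list what drives the treatment choice (relevance); step 2 asks which of those factors affect the outcome only through that choice (exclusion).

```python
from typing import Callable, List


def build_relevance_prompt(agent: str, treatment: str, k: int) -> str:
    """Step 1 (hypothetical wording): role-play the agent and elicit decision drivers."""
    return (
        f"You are {agent} deciding whether to {treatment}. "
        f"List {k} factors that could influence your decision, one per line."
    )


def build_exclusion_prompt(agent: str, treatment: str, outcome: str,
                           candidates: List[str]) -> str:
    """Step 2 (hypothetical wording): screen candidates for the exclusion restriction."""
    listing = "\n".join(f"- {c}" for c in candidates)
    return (
        f"You are {agent}. Consider these factors behind your decision to {treatment}:\n"
        f"{listing}\n"
        f"Which of them affect {outcome} only through that decision? "
        f"Return the matching factors, one per line."
    )


def parse_lines(reply: str) -> List[str]:
    """Split a reply into non-empty items, dropping leading list bullets."""
    items = [line.strip().lstrip("-*• ").strip() for line in reply.splitlines()]
    return [item for item in items if item]


def search_candidate_ivs(ask: Callable[[str], str], agent: str, treatment: str,
                         outcome: str, k: int = 5) -> List[str]:
    """Run the two-step search; `ask` is any function mapping a prompt to a reply."""
    step1 = parse_lines(ask(build_relevance_prompt(agent, treatment, k)))
    if not step1:
        return []
    step2 = parse_lines(
        ask(build_exclusion_prompt(agent, treatment, outcome, step1))
    )
    # Keep only variables that appeared in step 1, so step 2 cannot add new ones.
    return [c for c in step2 if c in step1]
```

Passing `ask` as a parameter keeps the sketch independent of any particular LLM provider. The returned list holds candidates only: whether a variable actually satisfies relevance and exclusion still has to be argued and tested by the researcher.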
Abstract
The instrumental variables (IVs) method is a leading empirical strategy for causal inference. Finding IVs is a heuristic and creative process, and justifying their validity -- especially exclusion restrictions -- is largely rhetorical. We propose using large language models (LLMs) to search for new IVs through narratives and counterfactual reasoning, similar to how a human researcher would. The stark difference, however, is that LLMs can dramatically accelerate this process and explore an extremely large search space. We demonstrate how to construct prompts to search for potentially valid IVs. We contend that multi-step and role-playing prompting strategies are effective for simulating the endogenous decision-making processes of economic agents and for navigating language models through the realm of real-world scenarios, rather than anchoring them within the narrow realm of academic…
Taxonomy
Topics: Data Mining Algorithms and Applications · Rough Sets and Fuzzy Logic
