Short-Long Policy Evaluation with Novel Actions
Hyunji Alex Nam, Yash Chandak, Emma Brunskill

TL;DR
This paper introduces a method for quickly estimating the long-term effects of a new decision policy by drawing on prior data about past policies observed over the full horizon, reducing the need for long-term observations.
Contribution
The paper proposes a novel short-long policy evaluation framework that leverages existing data to estimate long-term outcomes without extended observation periods.
Findings
Outperforms previous methods on HIV treatment, kidney dialysis, and battery charging simulators.
Enables rapid identification of new policies that are likely to underperform, which supports AI safety.
Reduces time and resources needed for long-term policy evaluation.
Abstract
From incorporating LLMs in education, to identifying new drugs and improving ways to charge batteries, innovators constantly try new strategies in search of better long-term outcomes for students, patients and consumers. One major bottleneck in this innovation cycle is the amount of time it takes to observe the downstream effects of a decision policy that incorporates new interventions. The key question is whether we can quickly evaluate long-term outcomes of a new decision policy without making long-term observations. Organizations often have access to prior data about past decision policies and their outcomes, evaluated over the full horizon of interest. Motivated by this, we introduce a new setting for short-long policy evaluation for sequential decision making tasks. Our proposed methods significantly outperform prior results on simulators of HIV treatment, kidney dialysis and…
Taxonomy
Topics: Advanced Causal Inference Techniques · Reinforcement Learning in Robotics · Explainable Artificial Intelligence (XAI)
