Concept-driven Off Policy Evaluation
Ritam Majumdar, Jack Teversham, Sonali Parbhoo

TL;DR
This paper introduces concept-based estimators for off-policy evaluation that leverage human-interpretable concepts to reduce variance, improve accuracy, and enable targeted interventions, with proven unbiasedness and practical algorithms for learning concepts.
Contribution
It develops a family of concept-based OPE estimators that are unbiased and variance-reducing, and proposes an end-to-end method to learn interpretable concepts for real-world applications.
Findings
Concept-based estimators significantly improve OPE performance.
Learned concepts are interpretable, concise, and diverse.
Estimators enable targeted interventions and are unbiased.
Abstract
Evaluating off-policy decisions using batch data poses significant challenges due to limited sample sizes leading to high variance. To improve Off-Policy Evaluation (OPE), we must identify and address the sources of this variance. Recent research on Concept Bottleneck Models (CBMs) shows that using human-explainable concepts can improve predictions and provide better understanding. We propose incorporating concepts into OPE to reduce variance. Our work introduces a family of concept-based OPE estimators, proving that they remain unbiased and reduce variance when concepts are known and predefined. Since real-world applications often lack predefined concepts, we further develop an end-to-end algorithm to learn interpretable, concise, and diverse parameterized concepts optimized for variance reduction. Our experiments with synthetic and real-world datasets show that both known and learned…
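The variance argument in the abstract can be illustrated with a small simulation. The sketch below is not the authors' estimator or code. It is a one-step toy problem in which every quantity (the six states, the concept map PHI, the policies PB1 and PE1, the reward table MEAN_R) is a hypothetical choice made for illustration. It assumes the concepts are known, that the target policy and the mean reward depend on the state only through its concept, and that states are drawn uniformly. Under those assumptions, importance weights computed on concepts are the conditional average of the state-level weights, so both estimators target the same value while the concept-based one should vary less.

```python
import numpy as np

# All quantities below are hypothetical, chosen only for illustration.
PHI = np.array([0, 0, 0, 1, 1, 1])                # known concept map: state -> concept
PB1 = np.array([0.1, 0.5, 0.9, 0.2, 0.5, 0.8])    # behavior policy P(a=1 | state)
PE1 = np.array([0.8, 0.3])                        # target policy P(a=1 | concept)
MEAN_R = np.array([[0.0, 1.0], [2.0, 0.5]])       # mean reward indexed by [concept, action]

# Concept-level behavior policy P(a=1 | concept): the average of PB1 over the
# states in each concept (valid here because states are drawn uniformly).
PB1_C = np.array([PB1[PHI == k].mean() for k in range(2)])


def true_value():
    """Exact value of the target policy in the toy problem."""
    p_concept = np.bincount(PHI) / len(PHI)       # concept probabilities
    pe = np.stack([1 - PE1, PE1], axis=1)         # target policy as [concept, action]
    return float(p_concept @ (pe * MEAN_R).sum(axis=1))


def estimate_once(n, rng):
    """Return (state-level IS, concept-level IS) estimates from n behavior samples."""
    s = rng.integers(0, len(PHI), size=n)         # uniform states
    c = PHI[s]
    a = (rng.random(n) < PB1[s]).astype(int)      # actions from the behavior policy
    r = MEAN_R[c, a] + rng.normal(0.0, 0.1, size=n)

    # Probability of the taken action under each policy.
    pe = np.where(a == 1, PE1[c], 1 - PE1[c])
    pb_state = np.where(a == 1, PB1[s], 1 - PB1[s])
    pb_concept = np.where(a == 1, PB1_C[c], 1 - PB1_C[c])

    is_state = np.mean(pe / pb_state * r)         # weights use the raw state
    is_concept = np.mean(pe / pb_concept * r)     # weights use only the concept
    return is_state, is_concept


def compare(n_reps=500, n=200, seed=0):
    """Repeat the experiment and summarize the mean and variance of both estimators."""
    rng = np.random.default_rng(seed)
    est = np.array([estimate_once(n, rng) for _ in range(n_reps)])
    return {
        "mean_state": est[:, 0].mean(),
        "mean_concept": est[:, 1].mean(),
        "var_state": est[:, 0].var(),
        "var_concept": est[:, 1].var(),
    }
```

Calling compare() and comparing both means against true_value() gives a rough check of unbiasedness, and comparing var_concept against var_state shows the variance gap in this toy setting. The gap comes entirely from the behavior policy varying across states that share a concept; if the behavior policy were constant within each concept, the two estimators would coincide.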
Peer Reviews
Decision: Submitted to ICLR 2025
Main strengths are:
- Transforming states to concepts is an interesting and refreshing idea among the many variance reduction methods. It may improve applicability of methods to safety-critical domains via improved interpretability and ability to intervene to correct estimates of policy value.
- Proposal to use state abstractions was given by Pavse & Hanna 2022a, but the use of concept bottleneck models is novel to the best of my knowledge. Authors show that concepts are more intervenable and all
Main weaknesses in my view are:
- Presentation could be improved at places by providing more details on how concepts can help improve OPE estimation, definitions of concept-equivalent policies, and interventions on concepts.
- Comparisons to previous work on state abstractions by Pavse & Hanna 2022a, either in experiments or in terms of technical and conceptual contributions, should be clearly made.
- Some implementation details like the OPE cost in optimization and computing concept-equivalent p
The paper proposes a useful idea in improving the interpretability of dynamics in RL which can give insights into the problem. Section 7 was very useful in illustrating this. The computational results overall are promising and show nice improvements over existing methods.
Theory: Additional discussion would be helpful for the theoretical results to explain the significance of these results. What kind of insights can we gain from the theory? Anything we can apply to improve/guide practical application and experimental results? Additionally, how does the choice of the number of concepts affect the ability to evaluate policies? More concepts may be helpful in better partitioning state space, but overall the process becomes less interpretable if we have too many co
The authors study an important problem – it is well-known that off-policy evaluation estimators can suffer from high variance, so the idea of using concepts as a form of dimensionality reduction of the state-action space can yield empirical benefits.
The paper has some clarity / conceptual issues: The proposal of this paper is to use a concept bottleneck model to learn interpretable concepts and derive importance-weighting estimators based on these concepts. The idea of simplifying the state-action space using concepts is a promising one, but there are many technical details that are not clear from the paper.
- The authors posit that a concept at time-step $t$ can be obtained from a function $\phi$ that takes the entire trajectory from time
Taxonomy
Topics: Evaluation and Performance Assessment
