Prism: Policy Reuse via Interpretable Strategy Mapping in Reinforcement Learning

Thomas Pravetz

arXiv:2604.02353 · cs.LG · April 6, 2026

TL;DR

PRISM lets reinforcement learning agents expose and transfer strategies through causally validated concepts, enabling zero-shot policy reuse between agents trained with different algorithms.

Contribution

The paper develops a framework that clusters each agent's encoder features into discrete concepts, validates their causal role through intervention, and aligns concepts across agents for zero-shot transfer of strategic knowledge.
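The clustering step can be illustrated with a small K-means routine. This is a minimal sketch, not the paper's implementation: the function names, the toy two-dimensional features, and the choice of K = 2 are all assumptions made for illustration, and a real encoder would produce much higher-dimensional features.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to `point` (ties go to the lower index)."""
    return min(range(len(centroids)),
               key=lambda k: squared_distance(point, centroids[k]))


def kmeans_concepts(features, k, iters=50, seed=0):
    """Plain Lloyd's algorithm: returns (centroids, concept id per feature)."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct data points.
    centroids = [list(p) for p in rng.sample(features, k)]
    assignments = [nearest(p, centroids) for p in features]
    for _ in range(iters):
        # Move each centroid to the mean of its current members.
        for c in range(k):
            members = [p for p, a in zip(features, assignments) if a == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        new_assignments = [nearest(p, centroids) for p in features]
        if new_assignments == assignments:
            break  # converged: no point changed concept
        assignments = new_assignments
    return centroids, assignments


# Hypothetical encoder features forming two well-separated groups.
features = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
            (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centroids, concept_ids = kmeans_concepts(features, k=2)
```

Each entry of `concept_ids` is the discrete concept label for the corresponding feature vector; which integer a group receives depends on the random initialisation, so only the grouping is meaningful.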

Findings

01

Causal intervention confirms concepts directly influence agent actions.

02

Concept alignment enables successful zero-shot transfer in Go.

03

The approach is effective in domains with naturally discrete strategic states.

Abstract

We present PRISM (Policy Reuse via Interpretable Strategy Mapping), a framework that grounds reinforcement learning agents' decisions in discrete, causally validated concepts and uses those concepts as a zero-shot transfer interface between agents trained with different algorithms. PRISM clusters each agent's encoder features into $K$ concepts via K-means. Causal intervention establishes that these concepts directly drive (not merely correlate with) agent behavior: overriding concept assignments changes the selected action in 69.4% of interventions ($p = 8.6 \times 10^{-86}$, 2500 interventions). Concept importance and usage frequency are dissociated: the most-used concept (C47, 33.0% frequency) causes only a 9.4% win-rate drop when ablated, while ablating C16 (15.4% frequency) collapses win rate from 100% to 51.8%. Because concepts causally encode strategy, aligning them via optimal…
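The intervention statistic quoted in the abstract (the share of concept overrides that change the selected action) can be sketched on a toy policy. Everything here is hypothetical: the lookup-table policy, the uniform choice of the override concept, and the names are assumptions for illustration only. The excerpt does not state how overrides are chosen or which statistical test produced the reported p-value, so no test is computed.

```python
import random


def toy_policy(concept_id, action_table):
    """Hypothetical policy head: returns the action stored for the active concept."""
    return action_table[concept_id]


def action_change_rate(concept_ids, action_table, num_concepts, seed=0):
    """Fraction of interventions in which overriding the concept changes the action."""
    rng = random.Random(seed)
    changed = 0
    for original in concept_ids:
        # Assumption: override with a different concept chosen uniformly at random.
        override = rng.choice([c for c in range(num_concepts) if c != original])
        if toy_policy(override, action_table) != toy_policy(original, action_table):
            changed += 1
    return changed / len(concept_ids)


# Toy setup: four concepts, where concepts 1 and 2 happen to share an action.
action_table = [0, 1, 1, 2]
observed = [0, 1, 2, 3, 1, 2, 0, 3]
rate = action_change_rate(observed, action_table, num_concepts=4)
```

In this toy setup an override leaves the action unchanged only when it swaps concept 1 for concept 2 or the reverse, so a rate below 100% reflects concepts that map to the same action rather than a failed intervention.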
