Interpretable by Design: Query-Specific Neural Modules for Explainable Reinforcement Learning

Mehrdad Zakershahrak

arXiv:2511.08749·cs.AI·November 13, 2025

TL;DR

This paper introduces Query Conditioned Deterministic Inference Networks (QDIN), a novel RL architecture that explicitly models diverse queries about the environment, improving interpretability and knowledge extraction without sacrificing control performance.

Contribution

The paper proposes a unified neural architecture for RL that treats various environment queries as first-class citizens, enabling better interpretability and knowledge extraction.

Findings

01. Reachability inference accuracy can reach 99% even when control performance is low.

02. Query-specific architectures outperform unified models and post-hoc methods.

03. Representations that capture world knowledge differ from those that support control.

Abstract

Reinforcement learning has traditionally focused on a singular objective: learning policies that select actions to maximize reward. We challenge this paradigm by asking: what if we explicitly architected RL systems as inference engines that can answer diverse queries about their environment? In deterministic settings, trained agents implicitly encode rich knowledge about reachability, distances, values, and dynamics, yet current architectures are not designed to expose this information efficiently. We introduce Query Conditioned Deterministic Inference Networks (QDIN), a unified architecture that treats different types of queries (policy, reachability, paths, comparisons) as first-class citizens, with specialized neural modules optimized for each inference pattern. Our key empirical finding reveals a fundamental decoupling: inference accuracy can reach near-perfect levels (99%…
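To make the query types named in the abstract concrete, the sketch below computes exact answers to reachability, distance, path, and comparison queries in a small deterministic gridworld using breadth-first search. This is not QDIN and involves no neural modules; it only illustrates the kind of ground-truth answers such queries have in a deterministic setting. The grid layout, the 4-connected move set, and all function names are assumptions made for illustration and do not come from the paper.

```python
from collections import deque

# Hypothetical 4x4 gridworld (not from the paper): '#' is a wall, '.' is free.
GRID = [
    "..#.",
    "..#.",
    "....",
    "#...",
]
# Assumed deterministic dynamics: 4-connected moves that never slip.
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def neighbors(cell):
    """Yield the free cells one deterministic step away from `cell`."""
    r, c = cell
    for dr, dc in MOVES:
        nr, nc = r + dr, c + dc
        in_bounds = 0 <= nr < len(GRID) and 0 <= nc < len(GRID[0])
        if in_bounds and GRID[nr][nc] != "#":
            yield (nr, nc)


def bfs(start):
    """Return (dist, parent) maps covering every cell reachable from `start`."""
    dist = {start: 0}
    parent = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        for nxt in neighbors(cell):
            if nxt not in dist:
                dist[nxt] = dist[cell] + 1
                parent[nxt] = cell
                queue.append(nxt)
    return dist, parent


def reachable(start, goal):
    """Reachability query: can `goal` be reached from `start`?"""
    return goal in bfs(start)[0]


def distance(start, goal):
    """Distance query: shortest step count, or None if unreachable."""
    return bfs(start)[0].get(goal)


def path(start, goal):
    """Path query: one shortest path as a list of cells, or None."""
    dist, parent = bfs(start)
    if goal not in dist:
        return None
    cells = []
    cell = goal
    while cell is not None:
        cells.append(cell)
        cell = parent[cell]
    return cells[::-1]


def closer(start, goal_a, goal_b):
    """Comparison query: is `goal_a` strictly closer to `start` than `goal_b`?"""
    dist, _ = bfs(start)
    inf = float("inf")
    return dist.get(goal_a, inf) < dist.get(goal_b, inf)
```

In this hypothetical grid, `distance((0, 0), (0, 3))` is larger than the straight-line gap between the two cells because the wall in column 2 forces a detour through row 2. Answers like these could serve as supervision targets or evaluation labels for learned query modules, though how the paper actually trains and evaluates QDIN is not stated in the excerpt above.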

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics