Distilling Deep Reinforcement Learning into Interpretable Fuzzy Rules: An Explainable AI Framework

Sanup S. Araballi; Simon Khan; Chilukuri K. Mohan

arXiv:2603.13257 · cs.AI · March 17, 2026

Open Access

TL;DR

This paper introduces a fuzzy rule-based framework for interpreting deep reinforcement learning policies. The framework improves transparency and fidelity over existing methods and is validated on a lunar lander control task.

Contribution

The paper presents a hierarchical fuzzy classifier system that distills neural policies into human-readable rules using clustering and regression, and it introduces novel metrics for evaluating interpretability and fidelity.

Findings

01

Achieved 81.48% fidelity on Lunar Lander (Continuous)

02

Outperformed decision trees by 21 percentage points

03

Demonstrated statistically significant interpretability improvements

Abstract

Deep Reinforcement Learning (DRL) agents achieve remarkable performance in continuous control but remain opaque, hindering deployment in safety-critical domains. Existing explainability methods either provide only local insights (SHAP, LIME) or employ over-simplified surrogates failing to capture continuous dynamics (decision trees). This work proposes a Hierarchical Takagi-Sugeno-Kang (TSK) Fuzzy Classifier System (FCS) distilling neural policies into human-readable IF-THEN rules through K-Means clustering for state partitioning and Ridge Regression for local action inference. Three quantifiable metrics are introduced: Fuzzy Rule Activation Density (FRAD) measuring explanation focus, Fuzzy Set Coverage (FSC) validating vocabulary completeness, and Action Space Granularity (ASG) assessing control mode diversity. Dynamic Time Warping (DTW) validates temporal behavioral fidelity.…
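
The distillation pipeline named in the abstract (K-Means for state partitioning, Ridge Regression for local action inference, combined as TSK rules) can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `TSKSurrogate`, the Gaussian membership functions with a single shared width, the toy two-dimensional `teacher_policy`, and the mean-squared-error comparison are all assumptions made for illustration, and the hierarchical structure and the FRAD, FSC, and ASG metrics are not reproduced here.

```python
import numpy as np


def sq_dists(X, centers):
    """Squared Euclidean distance from every state to every rule center."""
    return ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's K-Means; returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = sq_dists(X, centers).argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers, sq_dists(X, centers).argmin(axis=1)


def ridge_fit(X, Y, lam=1e-2):
    """Closed-form ridge regression with an unpenalized bias term."""
    A = np.hstack([X, np.ones((len(X), 1))])
    penalty = lam * np.eye(A.shape[1])
    penalty[-1, -1] = 0.0
    return np.linalg.solve(A.T @ A + penalty, A.T @ Y)


class TSKSurrogate:
    """First-order TSK surrogate: one rule per cluster (illustrative only)."""

    def __init__(self, n_rules=4, lam=1e-2):
        self.n_rules = n_rules
        self.lam = lam

    def fit(self, X, Y):
        # Antecedents: cluster centers partition the state space.
        self.centers, labels = kmeans(X, self.n_rules)
        d2 = sq_dists(X, self.centers)
        # Assumption: one shared Gaussian width, set from within-cluster spread.
        self.sigma = np.sqrt(d2[np.arange(len(X)), labels].mean())
        # Consequents: one local linear model per cluster.
        self.weights = np.zeros((self.n_rules, X.shape[1] + 1, Y.shape[1]))
        for j in range(self.n_rules):
            mask = labels == j
            if mask.any():
                self.weights[j] = ridge_fit(X[mask], Y[mask], self.lam)
        return self

    def firing(self, X):
        """Normalized rule firing strengths; each row sums to 1."""
        g = np.exp(-sq_dists(X, self.centers) / (2.0 * self.sigma ** 2))
        return g / g.sum(axis=1, keepdims=True)

    def predict(self, X):
        A = np.hstack([X, np.ones((len(X), 1))])
        local = np.einsum("nd,kdm->nkm", A, self.weights)
        return (self.firing(X)[:, :, None] * local).sum(axis=1)


def teacher_policy(s):
    """Hypothetical stand-in for a trained DRL policy (not Lunar Lander)."""
    return np.stack([np.tanh(2 * s[:, 0] - s[:, 1]), np.tanh(s[:, 1])], axis=1)


rng = np.random.default_rng(0)
states = rng.uniform(-1.0, 1.0, size=(500, 2))
actions = teacher_policy(states)

model = TSKSurrogate(n_rules=4).fit(states, actions)
pred = model.predict(states)
mse = float(((pred - actions) ** 2).mean())
baseline = float(((actions - actions.mean(axis=0)) ** 2).mean())

for j, (c, W) in enumerate(zip(model.centers, model.weights)):
    print(f"Rule {j}: IF state is near {np.round(c, 2)} "
          f"THEN action = {np.round(W[:-1].T, 2)} @ state + {np.round(W[-1], 2)}")
print(f"surrogate MSE = {mse:.4f}, constant-baseline MSE = {baseline:.4f}")
```

Each printed rule pairs a region of the state space (the cluster center) with a local linear action law, which is what makes the surrogate readable as IF-THEN statements. The MSE comparison here is only a sanity check against a constant predictor; it is not the fidelity measure behind the 81.48% figure reported above, whose definition is not given in this excerpt.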

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Anomaly Detection Techniques and Applications · Reinforcement Learning in Robotics