Distilling Deep RL Models Into Interpretable Neuro-Fuzzy Systems
Arne Gevaert, Jonathan Peck, Yvan Saeys

TL;DR
This paper introduces a method for distilling deep reinforcement learning policies into compact, interpretable neuro-fuzzy controllers that nearly match the original policy's performance using only 2 to 6 fuzzy rules.
Contribution
The authors develop a distillation algorithm that transforms deep Q-network policies into small neuro-fuzzy controllers, improving interpretability at little cost in performance.
Findings
Distilled controllers nearly match DQN performance with only 2 to 6 fuzzy rules
Demonstrated on three well-known OpenAI Gym environments
Yields RL policies expressed as compact, interpretable rule bases
Abstract
Deep Reinforcement Learning uses a deep neural network to encode a policy, which achieves very good performance in a wide range of applications but is widely regarded as a black box model. A more interpretable alternative to deep networks is given by neuro-fuzzy controllers. Unfortunately, neuro-fuzzy controllers often need a large number of rules to solve relatively simple tasks, making them difficult to interpret. In this work, we present an algorithm to distill the policy from a deep Q-network into a compact neuro-fuzzy controller. This allows us to train compact neuro-fuzzy controllers through distillation to solve tasks that they are unable to solve directly, combining the flexibility of deep reinforcement learning and the interpretability of compact rule bases. We demonstrate the algorithm on three well-known environments from OpenAI Gym, where we nearly match the performance of a…
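The distillation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details are not given on this page. It assumes a zero-order Takagi-Sugeno fuzzy system with fixed Gaussian membership functions, and it fits only the rule consequents to a teacher's Q-values by least squares. The function `teacher_q` is a hypothetical stand-in for a trained DQN, and the rule centers, widths, and state range are arbitrary illustrative choices.

```python
import numpy as np

def teacher_q(states):
    # Hypothetical stand-in for a trained DQN (assumption, not from the paper):
    # maps states of shape (N, 2) to Q-values for 2 actions, shape (N, 2).
    score = np.tanh(2.0 * states[:, 0] + states[:, 1])
    return np.stack([-score, score], axis=1)

def firing_strengths(states, centers, widths):
    # Gaussian membership per rule and input dimension, combined by product,
    # then normalized so the rule strengths sum to 1 for every state.
    diff = states[:, None, :] - centers[None, :, :]            # (N, R, D)
    act = np.exp(-0.5 * np.sum((diff / widths[None, :, :]) ** 2, axis=2))
    return act / (act.sum(axis=1, keepdims=True) + 1e-12)      # (N, R)

def distill(states, q_teacher, centers, widths):
    # Fit one constant Q-vector per rule so the fuzzy output matches the
    # teacher's Q-values in the least-squares sense.
    phi = firing_strengths(states, centers, widths)
    consequents, *_ = np.linalg.lstsq(phi, q_teacher, rcond=None)
    return consequents                                         # (R, A)

def student_q(states, centers, widths, consequents):
    # Fuzzy controller output: strength-weighted average of rule consequents.
    return firing_strengths(states, centers, widths) @ consequents

rng = np.random.default_rng(0)
states = rng.uniform(-1.0, 1.0, size=(2000, 2))                # sampled states
centers = np.array([[-0.5, -0.5], [-0.5, 0.5], [0.5, -0.5], [0.5, 0.5]])
widths = np.full_like(centers, 0.5)                            # 4 rules, 2 inputs

consequents = distill(states, teacher_q(states), centers, widths)
student_actions = student_q(states, centers, widths, consequents).argmax(axis=1)
agreement = np.mean(student_actions == teacher_q(states).argmax(axis=1))
print(f"rules: {len(centers)}, action agreement with teacher: {agreement:.2f}")
```

Each row of `consequents` reads as one rule of the form "if the state is near this center, then the Q-values are approximately these", which is what makes a small rule base inspectable. A fuller treatment would presumably also train the membership parameters and evaluate the student by environment return rather than by action agreement.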
Taxonomy
Topics: Reinforcement Learning in Robotics · Fuzzy Logic and Control Systems · Explainable Artificial Intelligence (XAI)
Methods: Dense Connections · Q-Learning · Convolution · Deep Q-Network
