Reinforcement Learning Your Way: Agent Characterization through Policy Regularization
Charl Maree, Christian Omlin

TL;DR
This paper introduces a novel regularization technique for reinforcement learning that embeds specific behavioral characteristics into agents' policies during training, enhancing interpretability and understanding of their actions.
Contribution
It proposes a new method to intrinsically characterize agent behavior through policy regularization, linking learning and explainability.
Findings
The method effectively guides agent behavior during learning.
Empirical results support the viability of the approach.
The method connects policy regularization with model interpretability.
Abstract
The increased complexity of state-of-the-art reinforcement learning (RL) algorithms has resulted in an opacity that inhibits explainability and understanding. This has led to the development of several post-hoc explainability methods that aim to extract information from learned policies, thus aiding explainability. These methods rely on empirical observations of the policy and thus aim to generalize a characterization of agents' behaviour. In this study, we have instead developed a method to imbue a characteristic behaviour into agents' policies through regularization of their objective functions. Our method guides the agents' behaviour during learning, which results in an intrinsic characterization; it connects the learning process with model explanation. We provide a formal argument and empirical evidence for the viability of our method. In future work, we intend to employ it to…
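As a rough illustration of what "regularization of the objective function" could look like, the sketch below subtracts a penalty from an agent's expected return whenever its action distribution drifts from a desired prior behaviour. The KL-divergence penalty, the weight `lam`, and all function and variable names here are assumptions chosen for illustration; the abstract does not specify the regularizer the authors actually use.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as equal-length lists."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def regularized_objective(expected_return, policy_probs, prior_probs, lam=0.5):
    """Expected return minus a penalty for deviating from the prior behaviour.

    Assumption: a KL penalty weighted by `lam`; this is a hypothetical
    stand-in, not the regularization term defined in the paper.
    """
    return expected_return - lam * kl_divergence(policy_probs, prior_probs)


# Hypothetical prior: the desired characteristic behaviour favours action 0.
prior = [0.7, 0.2, 0.1]

# A policy that matches the prior incurs no penalty.
matching = regularized_objective(1.0, [0.7, 0.2, 0.1], prior)

# A policy that favours action 2 instead is penalized for the same return.
deviating = regularized_objective(1.0, [0.1, 0.2, 0.7], prior)
```

With the same expected return, the policy that matches the prior scores higher than the one that deviates, so maximizing this objective pushes learning toward the prescribed behaviour rather than characterizing it after the fact.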
