Intrinsic Vicarious Conditioning for Deep Reinforcement Learning
Rodney A Sanchez, Ferat Sahin, Alex Ororbia, Jamison Heard

TL;DR
This paper introduces vicarious conditioning as an intrinsic reward mechanism in deep reinforcement learning, enabling agents to learn from others without direct access to policies or reward functions, supporting low-shot and continual learning.
Contribution
The paper proposes a biologically inspired vicarious conditioning framework that removes the need to sample demonstrator policies or reward functions directly, and it reports improved performance in complex environments.
Findings
Supports low-shot learning without needing demonstrator policies.
Enables longer episodes by discouraging non-descriptive terminal states.
Improves agent guidance toward desirable states in tested environments.
Abstract
Advancements in reinforcement learning have produced a variety of complex and useful intrinsic driving forces; crucially, these drivers operate under a direct conditioning paradigm. This form of conditioning limits our agents' capacity by restricting how they learn from the environment as well as from others. Off-policy or learn-by-example methods can learn from demonstrators' representations, but they require access to the demonstrating agent's policies or their reward functions. Our work overcomes this direct sampling limitation by introducing vicarious conditioning as an intrinsic reward mechanism. We draw from psychological and biological literature to provide a foundation for vicarious conditioning and use memory-based methods to implement its four steps: attention, retention, reproduction, and reinforcement. Crucially, our vicarious conditioning paradigms support low-shot learning…
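To make the idea of a memory-based vicarious intrinsic reward concrete, here is a minimal sketch. It is not the authors' implementation: the class `VicariousMemory`, the Gaussian similarity kernel, the `bandwidth` parameter, the +1/-1 valence labels, and the weight `beta` are all assumptions chosen for illustration. The sketch only shows how an observer could store states it saw a demonstrator reach (retention) and later reward itself for approaching similar states (reinforcement), without ever reading the demonstrator's policy or reward function.

```python
import math
from dataclasses import dataclass, field


@dataclass
class VicariousMemory:
    """Hypothetical memory of observed (state, valence) pairs."""
    bandwidth: float = 1.0  # assumed kernel width for state similarity
    entries: list = field(default_factory=list)

    def retain(self, state, valence):
        # Retention: store a state the observer attended to, together with
        # the outcome it saw the demonstrator experience there
        # (assumed encoding: +1 desirable, -1 undesirable).
        self.entries.append((tuple(state), float(valence)))

    def intrinsic_reward(self, state):
        # Reinforcement: similarity-weighted average of remembered valences.
        if not self.entries:
            return 0.0
        num = 0.0
        den = 0.0
        for mem_state, valence in self.entries:
            dist2 = sum((a - b) ** 2 for a, b in zip(state, mem_state))
            weight = math.exp(-dist2 / (2 * self.bandwidth ** 2))
            num += weight * valence
            den += weight
        # Guard against all weights underflowing to zero for far-away states.
        return num / den if den > 0 else 0.0


def shaped_reward(env_reward, state, memory, beta=0.5):
    # The agent's own (direct) reward plus the vicarious intrinsic term.
    # beta is an assumed trade-off weight, not a value from the paper.
    return env_reward + beta * memory.intrinsic_reward(state)


memory = VicariousMemory(bandwidth=0.5)
memory.retain((0.0, 0.0), +1.0)   # demonstrator was seen succeeding here
memory.retain((5.0, 5.0), -1.0)   # demonstrator was seen failing here

near_good = memory.intrinsic_reward((0.1, 0.0))
near_bad = memory.intrinsic_reward((5.0, 4.9))
```

In this sketch `near_good` is positive and `near_bad` is negative, so a learner using `shaped_reward` would be nudged toward states resembling observed successes and away from states resembling observed failures. A practical system would likely replace the raw-state distance with a learned embedding; that choice is left open here.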
