DIRECT: Learning from Sparse and Shifting Rewards using Discriminative Reward Co-Training
Philipp Altmann, Thomy Phan, Fabian Ritz, Thomas Gabor, Claudia Linnhoff-Popien

TL;DR
DIRECT introduces a discriminative reward co-training method that leverages beneficial past trajectories and a discriminator network to improve reinforcement learning in environments with sparse and shifting rewards.
Contribution
The paper presents a novel discriminative reward co-training approach that enhances deep reinforcement learning by using a discriminator to guide policy optimization with beneficial past experiences.
Findings
Outperforms state-of-the-art algorithms in sparse-reward environments
Effectively handles shifting reward landscapes
Provides a surrogate reward to improve policy learning
Abstract
We propose discriminative reward co-training (DIRECT) as an extension to deep reinforcement learning algorithms. Building upon the concept of self-imitation learning (SIL), we introduce an imitation buffer to store beneficial trajectories generated by the policy determined by their return. A discriminator network is trained concurrently to the policy to distinguish between trajectories generated by the current policy and beneficial trajectories generated by previous policies. The discriminator's verdict is used to construct a reward signal for optimizing the policy. By interpolating prior experience, DIRECT is able to act as a surrogate, steering policy optimization towards more valuable regions of the reward landscape thus learning an optimal policy. Our results show that DIRECT outperforms state-of-the-art algorithms in sparse- and shifting-reward environments being able to provide a…
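The two components named in the abstract, a return-ranked imitation buffer and a reward built from the discriminator's verdict, can be illustrated with a minimal sketch. The abstract does not specify the buffer's update rule or the reward formula, so everything here is an assumption for illustration: the class name `ImitationBuffer`, the top-k-by-return eviction policy, the `(state, action, reward)` transition layout, and the GAIL-style mapping `-log(1 - D)` in `discriminator_reward` are hypothetical and not taken from the paper.

```python
import heapq
import itertools
import math
from typing import List, Tuple

# Hypothetical transition layout: (state, action, reward).
Transition = Tuple[tuple, int, float]


class ImitationBuffer:
    """Keeps the `capacity` highest-return trajectories seen so far.

    Assumption: "beneficial" is read as "among the top-k by return".
    """

    def __init__(self, capacity: int):
        self.capacity = capacity
        # Min-heap of (return, tie_breaker, trajectory); the worst stored
        # trajectory sits at index 0 and is the first to be evicted.
        self._heap = []
        self._counter = itertools.count()

    def add(self, trajectory: List[Transition]) -> bool:
        """Offer a trajectory; return True if it was stored."""
        ret = sum(reward for _, _, reward in trajectory)
        entry = (ret, next(self._counter), trajectory)
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, entry)
            return True
        if ret > self._heap[0][0]:
            heapq.heapreplace(self._heap, entry)
            return True
        return False

    def returns(self) -> List[float]:
        """Returns of the stored trajectories, in ascending order."""
        return sorted(entry[0] for entry in self._heap)


def discriminator_reward(d_prob: float, eps: float = 1e-8) -> float:
    """Map a discriminator output to a surrogate reward.

    `d_prob` is the (assumed) probability that a sample came from the
    imitation buffer. The -log(1 - D) form is borrowed from adversarial
    imitation learning and is an assumption, not the paper's formula.
    """
    return -math.log(max(1.0 - d_prob, eps))
```

In this sketch, samples the discriminator cannot tell apart from buffered trajectories (a `d_prob` close to 1) receive a high surrogate reward, which is one plausible way to steer the policy toward previously successful behavior when the environment reward is sparse.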
Taxonomy
Topics: Digital Mental Health Interventions · EEG and Brain-Computer Interfaces · Functional Brain Connectivity Studies
