Learning Memory-Dependent Continuous Control from Demonstrations
Siqing Hou, Dongqi Han, Jun Tani

TL;DR
This paper introduces READER, a novel reinforcement learning algorithm that leverages demonstrations and experience replay to improve memory-dependent continuous control in partially observable environments, enhancing sample efficiency.
Contribution
The paper presents READER, an algorithm that uses demonstrations to learn memory-dependent control tasks, extending demonstration-based RL from MDPs to partially observable environments.
Findings
Significantly reduces environment interactions.
Outperforms baseline RL algorithms in sample efficiency.
Effective in memory-critical control tasks.
Abstract
Efficient exploration has presented a long-standing challenge in reinforcement learning, especially when rewards are sparse. A developmental system can overcome this difficulty by learning from both demonstrations and self-exploration. However, existing methods are not applicable to most real-world robotic control problems because they assume that environments follow Markov decision processes (MDPs); thus, they do not extend to partially observable environments where historical observations are necessary for decision making. This paper builds on the idea of replaying demonstrations for memory-dependent continuous control by proposing a novel algorithm, Recurrent Actor-Critic with Demonstration and Experience Replay (READER). Experiments involving several memory-crucial continuous control tasks reveal that our method significantly reduces interactions with the environment with a…
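The abstract describes replaying both demonstrations and the agent's own experience to train a recurrent actor-critic. The sketch below illustrates one way such a buffer could be organized. It is not READER's actual implementation: the class name `MixedEpisodeReplay`, the `demo_fraction` mixing ratio, and the fixed-length window sampling are all assumptions made for illustration. The one idea taken from the abstract is that whole sequences, not isolated transitions, are replayed, since a recurrent network needs temporally ordered observations.

```python
import random
from collections import deque


class MixedEpisodeReplay:
    """Hypothetical buffer that replays demonstration and agent episodes as sequences."""

    def __init__(self, capacity=1000, demo_fraction=0.25, seed=0):
        self.demos = []                            # demonstration episodes, never evicted
        self.experience = deque(maxlen=capacity)   # agent episodes, oldest evicted first
        self.demo_fraction = demo_fraction         # assumed share of each batch drawn from demos
        self.rng = random.Random(seed)

    def add_demo(self, episode):
        self.demos.append(list(episode))

    def add_experience(self, episode):
        self.experience.append(list(episode))

    def _window(self, episode, length):
        # Contiguous slice of at most `length` steps, so temporal order is preserved.
        start = self.rng.randint(0, max(0, len(episode) - length))
        return episode[start:start + length]

    def _pick(self, episodes):
        return episodes[self.rng.randrange(len(episodes))]

    def sample(self, batch_size, seq_len):
        # Split the batch between the two sources; fall back to whichever is non-empty.
        n_demo = round(batch_size * self.demo_fraction) if self.demos else 0
        if not self.experience:
            n_demo = batch_size if self.demos else 0
        n_exp = batch_size - n_demo if self.experience else 0
        batch = [self._window(self._pick(self.demos), seq_len) for _ in range(n_demo)]
        batch += [self._window(self._pick(self.experience), seq_len) for _ in range(n_exp)]
        return batch


# Usage: each step would normally be (observation, action, reward); tags are used here for clarity.
buf = MixedEpisodeReplay(capacity=10, demo_fraction=0.25, seed=0)
buf.add_demo([("demo", t) for t in range(10)])
for _ in range(3):
    buf.add_experience([("exp", t) for t in range(6)])
batch = buf.sample(batch_size=8, seq_len=4)
```

Each sampled window would then be fed step by step through the recurrent actor and critic. How the paper actually balances the two sources and initializes hidden states is not stated in the abstract and is left open here.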
Taxonomy
Topics: Reinforcement Learning in Robotics · Robotic Path Planning Algorithms · Advanced Bandit Algorithms Research
Methods: Experience Replay
