Memory Sequence Length of Data Sampling Impacts the Adaptation of Meta-Reinforcement Learning Agents
Menglong Zhang, Fuyuan Qian, Quanying Liu

TL;DR
This paper investigates how data sampling strategies, in particular the length of the sampled memory sequence, affect the ability of meta-reinforcement learning agents to adapt to new environments. It emphasizes that the sampling strategy matters most for off-policy algorithms.
Contribution
The paper provides an empirical analysis of how the sequence length used in data sampling affects the exploration and adaptability of meta-RL agents, comparing methods based on Thompson sampling with methods based on Bayes-optimality.
Findings
Bayes-optimality based algorithms show more robust adaptation than Thompson sampling based algorithms.
Long-memory sampling improves environment representation.
Sampling strategy impacts performance in sparse reward tasks.
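To make the notion of sequence length concrete, the following is a minimal sketch of drawing contiguous windows of transitions from a stored trajectory, as an off-policy agent might do when building context for its task representation. It is not the authors' sampler: the function name `sample_context_sequences`, the `(T, D)` trajectory layout, and the uniform choice of window start are all assumptions made for illustration.

```python
import numpy as np


def sample_context_sequences(trajectory, seq_len, batch_size, rng):
    """Draw `batch_size` contiguous windows of `seq_len` transitions.

    trajectory: array of shape (T, D), one row per transition
                (hypothetical layout, e.g. state, action, reward concatenated).
    Returns an array of shape (batch_size, seq_len, D).
    """
    num_steps = trajectory.shape[0]
    if seq_len < 1 or seq_len > num_steps:
        raise ValueError("seq_len must be between 1 and the trajectory length")
    # Uniformly choose a start index for each window (upper bound is exclusive).
    starts = rng.integers(0, num_steps - seq_len + 1, size=batch_size)
    # Broadcast starts against offsets to get a (batch_size, seq_len) index grid.
    idx = starts[:, None] + np.arange(seq_len)[None, :]
    return trajectory[idx]


rng = np.random.default_rng(0)
# Toy trajectory: 200 transitions with 4 features each.
toy_trajectory = np.arange(200 * 4, dtype=np.float64).reshape(200, 4)

# Short-memory sampling: isolated single transitions, shape (8, 1, 4).
short_memory = sample_context_sequences(toy_trajectory, seq_len=1, batch_size=8, rng=rng)
# Long-memory sampling: windows of 64 consecutive transitions, shape (8, 64, 4).
long_memory = sample_context_sequences(toy_trajectory, seq_len=64, batch_size=8, rng=rng)
```

With `seq_len=1` the sampled context carries no temporal ordering, while larger values preserve runs of consecutive transitions. That is the knob the findings above refer to as short-memory versus long-memory sampling.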
Abstract
Fast adaptation to new tasks is extremely important for embodied agents in the real world. Meta-reinforcement learning (meta-RL) has emerged as an effective method to enable fast adaptation in unknown environments. Compared to on-policy meta-RL algorithms, off-policy algorithms rely heavily on efficient data sampling strategies to extract and represent the historical trajectories. However, little is known about how different data sampling methods impact the ability of meta-RL agents to represent unknown environments. Here, we investigate the impact of data sampling strategies on the exploration and adaptability of meta-RL agents. Specifically, we conducted experiments with two types of off-policy meta-RL algorithms based on Thompson sampling and Bayes-optimality theories in continuous control tasks within the MuJoCo environment and sparse reward navigation tasks. Our analysis revealed…
Taxonomy
Topics: Reinforcement Learning in Robotics · Data Stream Mining Techniques
