Transferring Knowledge for Reinforcement Learning in Contact-Rich Manipulation
Quantao Yang, Johannes A. Stork, and Todor Stoyanov

TL;DR
This paper introduces a method for transferring knowledge in reinforcement learning for contact-rich manipulation tasks by leveraging multiple skill priors, improving generalization to new similar tasks.
Contribution
It proposes a novel approach to learn and compose skill priors from demonstrated trajectories to enhance transferability in reinforcement learning for manufacturing tasks.
Findings
Better generalization to unseen tasks in peg-in-hole insertion experiments
Effective composition of skill priors improves learning efficiency
Demonstrated success in contact-rich manipulation scenarios
Abstract
In manufacturing, assembly tasks have been a challenge for learning algorithms because dynamics vary across environments. Reinforcement learning (RL) is a promising framework for learning these tasks automatically, yet it remains difficult to apply a learned policy or skill (that is, the ability to solve a task) to a similar environment, even when the deployment conditions differ only slightly. In this paper, we address the challenge of transferring knowledge within a family of similar tasks by leveraging multiple skill priors. We propose to learn a prior distribution over the specific skill required to accomplish each task and to compose the family of skill priors to guide learning of the policy for a new task, based on the similarity between the target task and the prior ones. Our method learns a latent action space representing the skill embedding from demonstrated trajectories…
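The abstract describes composing several skill priors according to how similar each prior task is to the target task. The excerpt does not state the similarity measure or the composition rule, so the following is only a minimal sketch under stated assumptions: task descriptors are plain vectors, similarity is a softmax over negative Euclidean distance, each prior is a diagonal Gaussian over a latent skill, and the composed prior enters policy learning as a similarity-weighted KL penalty. All function names and parameters here are hypothetical and are not taken from the paper.

```python
import numpy as np

def similarity_weights(target_task, prior_tasks, temperature=1.0):
    """Softmax over negative Euclidean distances (an assumed similarity measure)."""
    distances = np.linalg.norm(prior_tasks - target_task, axis=1)
    logits = -distances / temperature
    logits = logits - logits.max()  # subtract the max for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()

def gaussian_kl(mu_q, std_q, mu_p, std_p):
    """KL(q || p) between diagonal Gaussians, summed over latent dimensions."""
    var_q, var_p = std_q ** 2, std_p ** 2
    per_dim = np.log(std_p / std_q) + (var_q + (mu_q - mu_p) ** 2) / (2.0 * var_p) - 0.5
    return float(np.sum(per_dim))

def composed_prior_penalty(mu_q, std_q, prior_mus, prior_stds, weights):
    """Similarity-weighted sum of KL terms from the policy to each skill prior."""
    kls = np.array([gaussian_kl(mu_q, std_q, m, s)
                    for m, s in zip(prior_mus, prior_stds)])
    return float(weights @ kls)

# Hypothetical example: three prior tasks described by 2-D vectors.
target = np.array([0.0, 0.0])
prior_tasks = np.array([[0.1, 0.0], [1.0, 1.0], [3.0, 0.0]])
weights = similarity_weights(target, prior_tasks)

# One diagonal-Gaussian skill prior per prior task (4-D latent skill, assumed).
prior_mus = [np.zeros(4), np.ones(4), -np.ones(4)]
prior_stds = [np.ones(4), np.ones(4), np.ones(4)]

# Penalty for a policy whose latent-skill distribution matches the first prior.
penalty = composed_prior_penalty(np.zeros(4), np.ones(4), prior_mus, prior_stds, weights)
```

In this sketch the prior task closest to the target receives the largest weight, so the policy is pulled most strongly toward the skill prior learned on the most similar task. In an RL loop, a penalty of this kind would typically be subtracted from the reward or added to the policy loss with a scaling coefficient.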
Taxonomy
Topics: Robot Manipulation and Learning · Reinforcement Learning in Robotics · Manufacturing Process and Optimization
