Learning on the Job: Self-Rewarding Offline-to-Online Finetuning for Industrial Insertion of Novel Connectors from Vision
Ashvin Nair, Brian Zhu, Gokul Narayanan, Eugen Solowjow, Sergey Levine

TL;DR
This paper proposes a method for finetuning robot policies on industrial connector insertion tasks. It uses a reward function learned from diverse examples that generalizes to new connectors, enabling effective adaptation to unseen connectors in unstructured environments.
Contribution
It introduces a self-rewarding offline-to-online finetuning approach that uses a generalized reward function to adapt policies to new, unseen connector insertion tasks in unstructured settings.
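The self-rewarding loop described here can be illustrated with a toy sketch. This is not the authors' implementation: the 1-D "insertion" task, the names `reward_classifier`, `rollout`, and `self_rewarding_finetune`, and all parameter values are hypothetical, and the update rule is a simple success-filtered average standing in for the offline-to-online reinforcement learning algorithm the paper uses. It only shows the core idea that the agent labels its own practice attempts with a success classifier and improves from those labels, without any external reward.

```python
import random


def reward_classifier(observation, tolerance=0.05):
    """Hypothetical stand-in for a learned success classifier.

    The 'observation' here is the remaining misalignment after an attempt;
    a real system would classify success from camera images instead.
    """
    return 1.0 if abs(observation) <= tolerance else 0.0


def rollout(policy_mean, target, noise, rng):
    """One practice attempt: sample an action and observe the outcome."""
    action = rng.gauss(policy_mean, noise)
    observation = action - target  # misalignment relative to the socket
    return action, observation


def self_rewarding_finetune(policy_mean, target, steps=200, batch=16,
                            noise=0.2, lr=0.5, seed=0):
    """Finetune a scalar 'policy' using only self-assigned rewards."""
    rng = random.Random(seed)
    for _ in range(steps):
        attempts = [rollout(policy_mean, target, noise, rng)
                    for _ in range(batch)]
        # The agent labels its own attempts; no human or ground-truth reward.
        successes = [action for action, obs in attempts
                     if reward_classifier(obs) > 0.5]
        if successes:
            mean_success = sum(successes) / len(successes)
            # Move the policy toward actions the classifier marked as successful.
            policy_mean += lr * (mean_success - policy_mean)
    return policy_mean


if __name__ == "__main__":
    pretrained = 0.0   # policy tuned for previously seen connectors (assumed)
    new_target = 0.3   # an unseen connector needs a different action (assumed)
    finetuned = self_rewarding_finetune(pretrained, new_target)
    print(f"before: {pretrained:.3f}, after: {finetuned:.3f}")
```

In this toy setting the pretrained policy rarely succeeds on the new target at first, but because the classifier still recognizes success, the occasional successful attempt is enough to pull the policy toward the new connector.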
Findings
The learned reward function generalizes better than the policy to new connectors.
Pretraining on 50 connectors enables successful finetuning on new connectors.
The method works effectively in real-world unstructured environments.
Abstract
Learning-based methods in robotics hold the promise of generalization, but what can be done if a learned policy does not generalize to a new situation? In principle, if an agent can at least evaluate its own success (i.e., with a reward classifier that generalizes well even when the policy does not), it could actively practice the task and finetune the policy in this situation. We study this problem in the setting of industrial insertion tasks, such as inserting connectors in sockets and setting screws. Existing algorithms rely on precise localization of the connector or socket and carefully managed physical setups, such as assembly lines, to succeed at the task. But in unstructured environments such as homes or even some industrial settings, robots cannot rely on precise localization and may be tasked with previously unseen connectors. Offline reinforcement learning on a variety of…
Taxonomy
Topics: Robot Manipulation and Learning · Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning
