CHEQ-ing the Box: Safe Variable Impedance Learning for Robotic Polishing
Emma Cramer, Lukas Jäschke, Sebastian Trimpe

TL;DR
This paper demonstrates the effectiveness of the hybrid RL algorithm CHEQ for robotic polishing with variable impedance, showing safe, efficient learning directly on hardware in contact-rich tasks.
Contribution
It provides the first hardware evaluation of adaptive hybrid RL, specifically CHEQ, for robotic polishing with variable impedance, highlighting its practicality and safety.
Findings
CHEQ achieves effective polishing with only eight hours of training.
CHEQ incurs just five failures during hardware training.
Variable impedance improves polishing performance in simulation.
Abstract
Robotic systems are increasingly employed for industrial automation, with contact-rich tasks like polishing requiring dexterity and compliant behaviour. These tasks are difficult to model, making classical control challenging. Deep reinforcement learning (RL) offers a promising solution by enabling the learning of models and control policies directly from data. However, its application to real-world problems is limited by data inefficiency and unsafe exploration. Adaptive hybrid RL methods blend classical control and RL adaptively, combining the strengths of both: structure from control and learning from RL. This has led to improvements in data efficiency and exploration safety. However, their potential for hardware applications remains underexplored, with no evaluations on physical systems to date. Such evaluations are critical to fully assess the practicality and effectiveness of…
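As a rough illustration of what "blending classical control and RL adaptively" can mean, the following minimal sketch mixes a controller action with an RL action using a weight that shrinks as the agent's uncertainty grows. It is not the authors' implementation: the function names, the linear mapping, and all threshold values (`u_low`, `u_high`, `w_min`, `w_max`) are hypothetical choices made for illustration only, and the uncertainty value is assumed to come from some external estimate, such as the spread of an ensemble of critics.

```python
import numpy as np


def mixing_weight(uncertainty, u_low=0.1, u_high=1.0, w_min=0.2, w_max=1.0):
    """Map an uncertainty estimate to an RL mixing weight in [w_min, w_max].

    Hypothetical linear schedule: low uncertainty gives the RL agent more
    authority, high uncertainty hands control back to the classical controller.
    """
    frac = np.clip((uncertainty - u_low) / (u_high - u_low), 0.0, 1.0)
    return float(w_max - frac * (w_max - w_min))


def blend_actions(a_rl, a_ctrl, weight):
    """Convex combination of the RL action and the controller action."""
    a_rl = np.asarray(a_rl, dtype=float)
    a_ctrl = np.asarray(a_ctrl, dtype=float)
    return weight * a_rl + (1.0 - weight) * a_ctrl


if __name__ == "__main__":
    # Illustrative values only: a 2-D action from each source.
    a_rl = [1.0, -0.5]
    a_ctrl = [0.2, 0.1]
    for u in (0.05, 0.55, 2.0):
        w = mixing_weight(u)
        print(f"uncertainty={u:.2f} weight={w:.2f} action={blend_actions(a_rl, a_ctrl, w)}")
```

With a weight of 1 the executed action is purely the RL action, and with a weight of 0 it is purely the controller action; the bounds on the weight determine how much authority each component can ever have.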
Taxonomy
Topics: Advanced Surface Polishing Techniques · Robot Manipulation and Learning · Soft Robotics and Applications
