SLowRL: Safe Low-Rank Adaptation Reinforcement Learning for Locomotion
Elham Daneshmand, Shafeef Omar, Glen Berseth, Majid Khadiv, Hsiu-Chin Lin

TL;DR
This paper introduces SLowRL, a safe and efficient reinforcement learning framework that combines low-rank adaptation and safety enforcement to improve sim-to-real transfer for robotic locomotion.
Contribution
It presents SLowRL, a novel method integrating Low-Rank Adaptation with safety recovery policies for safe, rapid real-world fine-tuning of locomotion policies.
Findings
Achieves a 46.5% reduction in fine-tuning time.
Keeps safety violations near zero during real-world adaptation.
Rank-1 adaptation suffices for performance recovery.
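To make the rank-1 finding concrete, here is a minimal sketch of a rank-1 LoRA update on a single linear layer, written in plain Python. It is not the authors' implementation: the layer sizes, the `scale` factor, and the names `lora_linear` and `trainable_params` are assumptions chosen for illustration. The frozen weight `W` is left untouched and only the two vectors `b_vec` and `a_vec` would be trained, so the adapted layer computes `W x + scale * b (a . x)`.

```python
import random


def lora_linear(W, b_vec, a_vec, x, scale=1.0):
    """Apply a frozen linear layer W plus a rank-1 LoRA correction.

    Computes y = W x + scale * b * (a . x), which equals (W + scale * b a^T) x
    without ever forming the full update matrix.
    """
    a_dot_x = sum(ai * xi for ai, xi in zip(a_vec, x))
    return [
        sum(wij * xj for wij, xj in zip(row, x)) + scale * bi * a_dot_x
        for row, bi in zip(W, b_vec)
    ]


def trainable_params(d_out, d_in, rank=1):
    """Number of trainable values in a rank-`rank` LoRA adapter."""
    return rank * (d_out + d_in)


# Hypothetical layer sizes, for illustration only.
random.seed(0)
d_out, d_in = 4, 6
W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]

# Common LoRA initialisation: a is small and random, b is zero,
# so the adapted layer starts out identical to the pretrained one.
a_vec = [random.gauss(0, 0.01) for _ in range(d_in)]
b_vec = [0.0] * d_out

x = [1.0] * d_in
y_pretrained = [sum(wij * xj for wij, xj in zip(row, x)) for row in W]
y_adapted = lora_linear(W, b_vec, a_vec, x)
```

With these sizes the adapter holds `trainable_params(4, 6)` = 10 values against 24 in the full weight matrix, and the gap widens quickly as layers grow, which is the usual argument for why low-rank fine-tuning needs fewer samples.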
Abstract
Sim-to-real transfer of locomotion policies often leads to performance degradation due to the inevitable sim-to-real gap. Naively fine-tuning these policies directly on hardware is problematic, as it poses risks of mechanical failure and suffers from high sample inefficiency. In this paper, we address the challenge of safely and efficiently fine-tuning reinforcement learning (RL) policies for dynamic locomotion tasks. Specifically, we focus on fine-tuning policies learned in simulation directly on hardware, while explicitly enforcing safety constraints. In doing so, we introduce SLowRL, a framework that combines Low-Rank Adaptation (LoRA) with training-time safety enforcement via a recovery policy. We evaluate our method both in simulation and on a real Unitree Go2 quadruped robot for jump and trot tasks. Experimental results show that our method achieves a reduction in…
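The abstract describes training-time safety enforcement via a recovery policy but, in the portion shown here, does not say how control is handed over. The sketch below shows one common pattern under stated assumptions: a hypothetical check `is_near_violation` on body pitch with a made-up threshold, and placeholder `task_policy` and `recovery_policy` functions. None of these names or values come from the paper.

```python
def safe_step(obs, task_policy, recovery_policy, is_near_violation):
    """Pick the action for one control step.

    If the observation is close to violating a safety constraint, the
    recovery policy takes over; otherwise the policy being fine-tuned acts.
    Returns the action and a label saying which policy produced it.
    """
    if is_near_violation(obs):
        return recovery_policy(obs), "recovery"
    return task_policy(obs), "task"


# Hypothetical safety check: absolute body pitch above 0.6 rad (assumed value).
PITCH_LIMIT_RAD = 0.6


def is_near_violation(obs):
    return abs(obs["pitch"]) > PITCH_LIMIT_RAD


# Placeholder policies standing in for learned networks.
def task_policy(obs):
    return [0.1, -0.1]


def recovery_policy(obs):
    return [0.0, 0.0]


action_ok, source_ok = safe_step(
    {"pitch": 0.1}, task_policy, recovery_policy, is_near_violation
)
action_bad, source_bad = safe_step(
    {"pitch": -0.9}, task_policy, recovery_policy, is_near_violation
)
```

In a setup like this, the label returned alongside the action lets the training loop decide how to treat transitions collected under the recovery policy, for example by excluding them from the fine-tuning update.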
Taxonomy
Topics: Robotic Locomotion and Control · Reinforcement Learning in Robotics · Prosthetics and Rehabilitation Robotics
