TL;DR
RoHIL is an offline fine-tuning framework that makes human-in-the-loop (HIL) robotic reinforcement learning robust to illumination changes without collecting additional real-robot data.
Contribution
It introduces a world-model-based image relighter, an anti-forgetting replay mechanism, and an anchored regulariser to improve cross-workstation transfer in robotic RL.
Findings
Significantly improves shifted-light performance in real-robot tasks.
Preserves source-workstation performance without re-collecting data.
Removes the need to re-run HIL training on the robot for each new illumination environment.
Abstract
Human-in-the-loop reinforcement learning systems achieve near-perfect success on the workstation where they are trained, but collapse when the same robot is moved to a workstation a few meters away due to shifts in the visual input distribution caused by new lamp positions and window light. Re-collecting demonstrations and re-running HIL on every workstation is incompatible with deployment, and naively fine-tuning on shifted-light data triggers catastrophic forgetting of the source workstation. To close this cross-domain gap, we present RoHIL, an offline fine-tuning framework that uses no extra real-robot interaction. RoHIL combines (i) a world-model-based image relighter that re-synthesises the visual stream of source-workstation trajectories under multiple virtual HDRI environments, leaving actions and rewards real; (ii) Illumination-Retention Replay (IRR), a data-level…
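The data flow of component (i) can be illustrated with a minimal sketch: each source trajectory is copied once per virtual lighting environment, only the images are re-rendered, and the recorded actions and rewards are carried over unchanged. Everything named here is an assumption made for illustration. `Trajectory`, `toy_relight`, and `relight_dataset` are hypothetical names, and `toy_relight` is a simple per-channel gain standing in for the world-model-based relighter, which the abstract does not specify in detail.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class Trajectory:
    images: np.ndarray   # (T, H, W, 3), float32 in [0, 1]
    actions: np.ndarray  # (T, action_dim), recorded on the real robot
    rewards: np.ndarray  # (T,), recorded on the real robot


def toy_relight(images, rgb_gain):
    """Stand-in relighter: a per-channel gain, NOT the paper's world model."""
    gain = np.asarray(rgb_gain, dtype=images.dtype)
    return np.clip(images * gain, 0.0, 1.0)


def relight_dataset(trajectories, lighting_envs, relight_fn=toy_relight):
    """Return one relit copy of every trajectory per lighting environment.

    Only the visual stream changes; actions and rewards stay as recorded.
    """
    relit = []
    for traj in trajectories:
        for env in lighting_envs:
            relit.append(
                Trajectory(
                    images=relight_fn(traj.images, env),
                    actions=traj.actions.copy(),
                    rewards=traj.rewards.copy(),
                )
            )
    return relit
```

With this shape, a learned relighter conditioned on an HDRI environment could be passed as `relight_fn` in place of the toy gain without changing the rest of the pipeline.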
