MoRI: Mixture of RL and IL Experts for Long-Horizon Manipulation Tasks
Yaohang Xu, Lianjie Ma, Gewei Zuo, Wentao Zhang, Han Ding, Lijun Zhu

TL;DR
MoRI combines reinforcement learning and imitation learning experts, dynamically switching between them to efficiently solve complex long-horizon manipulation tasks with minimal human intervention.
Contribution
The paper introduces MoRI, a hybrid system that adaptively switches between RL and IL experts, improving efficiency and success rates in complex manipulation tasks.
Findings
Achieves a 97.5% success rate in real-world tasks
Reduces human intervention by 85.8%
Speeds up convergence by 21% compared to baseline RL
Abstract
Reinforcement Learning (RL) and Imitation Learning (IL) are the standard frameworks for policy acquisition in manipulation. While IL offers efficient policy derivation, it suffers from compounding errors and distribution shift. Conversely, RL facilitates autonomous exploration but is frequently hindered by low sample efficiency and the high cost of trial and error. Since existing hybrid methods often struggle with complex tasks, we introduce Mixture of RL and IL Experts (MoRI). This system dynamically switches between IL and RL experts based on the variance of expert actions to handle coarse movements and fine-grained manipulations. MoRI employs an offline pre-training stage followed by online fine-tuning to accelerate convergence. To maintain exploration safety and minimize human intervention, the system applies IL-based regularization to the RL component. Evaluation across four…
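The variance-based switching described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the abstract does not say how the variance is computed, what threshold is used, or which expert takes over when variance is high. The sketch assumes several action samples are available for the current state (for example from an ensemble or a stochastic policy), averages the per-dimension variance, and compares it to a hypothetical `threshold`. The names `action_variance`, `select_expert`, and `high_variance_expert` are hypothetical, and the gate direction is left configurable because the source does not specify it.

```python
from statistics import pvariance
from typing import Sequence


def action_variance(samples: Sequence[Sequence[float]]) -> float:
    """Mean per-dimension population variance across sampled actions.

    `samples` is a list of action vectors proposed for the same state.
    How such samples are obtained is an assumption of this sketch.
    """
    dims = list(zip(*samples))  # transpose: one tuple per action dimension
    return sum(pvariance(d) for d in dims) / len(dims)


def select_expert(
    samples: Sequence[Sequence[float]],
    threshold: float = 0.01,            # hypothetical value, not from the paper
    high_variance_expert: str = "rl",   # gate direction is an assumption
) -> str:
    """Return "rl" or "il" depending on how much the sampled actions disagree."""
    low_variance_expert = "il" if high_variance_expert == "rl" else "rl"
    if action_variance(samples) > threshold:
        return high_variance_expert
    return low_variance_expert


# Samples that agree exactly have zero variance; samples that disagree do not.
agreeing = [[0.0, 0.0], [0.0, 0.0]]
disagreeing = [[0.0, 0.0], [1.0, 1.0]]
print(select_expert(agreeing))
print(select_expert(disagreeing))
```

With the defaults above, identical samples route to the low-variance expert and strongly disagreeing samples route to the high-variance expert; swapping `high_variance_expert` inverts the gate without changing the variance computation.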
