IMLE Policy: Fast and Sample Efficient Visuomotor Policy Learning via Implicit Maximum Likelihood Estimation

Krishan Rana; Robert Lee; David Pershouse; Niko Suenderhauf

arXiv:2502.12371·cs.RO·March 12, 2025

TL;DR

IMLE Policy introduces a data-efficient, fast, and simple imitation learning method for visuomotor tasks that outperforms existing approaches in low-data and real-time scenarios.

Contribution

The paper presents IMLE Policy, a novel behaviour cloning approach that achieves high performance with less data and faster inference by using implicit maximum likelihood estimation.

Findings

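01 Requires 38% less data on average to match the performance of baseline methods.

02 Improves inference speed by 97.3% compared to Diffusion Policy, using single-step action generation.

03 Effectively learns complex multi-modal behaviours from minimal demonstrations.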

Abstract

Recent advances in imitation learning, particularly using generative modelling techniques like diffusion, have enabled policies to capture complex multi-modal action distributions. However, these methods often require large datasets and multiple inference steps for action generation, posing challenges in robotics where the cost for data collection is high and computation resources are limited. To address this, we introduce IMLE Policy, a novel behaviour cloning approach based on Implicit Maximum Likelihood Estimation (IMLE). IMLE Policy excels in low-data regimes, effectively learning from minimal demonstrations and requiring 38% less data on average to match the performance of baseline methods in learning complex multi-modal behaviours. Its simple generator-based architecture enables single-step action generation, improving inference speed by 97.3% compared to Diffusion Policy, while…
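
As a rough illustration of the idea the abstract describes, the sketch below shows the basic IMLE objective in its generic form: for each demonstrated action, several candidate actions are generated from random latent vectors, and only the candidate nearest to the demonstration contributes to the loss. This is a minimal sketch and not the authors' implementation; the names `toy_generator` and `imle_loss`, the linear generator, and all dimensions are assumptions made for illustration, and the paper's exact objective may differ from this basic form.

```python
from typing import Tuple

import numpy as np


def toy_generator(obs: np.ndarray, latents: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Hypothetical linear generator: maps (observation, latent) pairs to actions.

    obs:     (k,)      one observation
    latents: (m, z)    m latent samples
    weights: (k+z, d)  generator parameters
    returns: (m, d)    one candidate action per latent, in a single forward pass
    """
    tiled_obs = np.broadcast_to(obs, (latents.shape[0], obs.shape[0]))
    inputs = np.concatenate([tiled_obs, latents], axis=1)
    return inputs @ weights


def imle_loss(candidates: np.ndarray, target: np.ndarray) -> Tuple[float, int]:
    """Basic IMLE loss for one demonstration.

    Returns the squared distance from the demonstrated action to its nearest
    generated candidate, plus the index of that candidate. Only the nearest
    candidate would receive a gradient during training.
    """
    sq_dists = np.sum((candidates - target) ** 2, axis=1)
    nearest = int(np.argmin(sq_dists))
    return float(sq_dists[nearest]), nearest


# Training-time view: several latents per observation, keep the nearest candidate.
rng = np.random.default_rng(0)
obs = np.array([0.5, -1.0])                 # assumed 2-D observation
weights = rng.normal(size=(2 + 3, 2))       # assumed 3-D latent, 2-D action
train_candidates = toy_generator(obs, rng.normal(size=(8, 3)), weights)
loss, idx = imle_loss(train_candidates, target=np.array([0.0, 0.0]))

# Inference-time view: one latent, one forward pass, one action.
action = toy_generator(obs, rng.normal(size=(1, 3)), weights)[0]
```

Because every demonstration pulls its nearest candidate towards itself, each mode present in the data is covered by some generated sample, which is the property that lets a single-step generator represent multi-modal behaviour.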

Taxonomy

Topics: Neonatal and fetal brain pathology · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning

Methods: Diffusion · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings