Revisiting Energy Based Models as Policies: Ranking Noise Contrastive Estimation and Interpolating Energy Models

Sumeet Singh; Stephen Tu; Vikas Sindhwani

arXiv:2309.05803·cs.RO·September 13, 2023·2 cites

Open Access

TL;DR

This paper demonstrates that energy-based models can be effectively trained as policies for robotic tasks, challenging previous beliefs about their impracticality, and shows they can outperform diffusion models in complex benchmarks.

Contribution

We develop a practical, asymptotically consistent training algorithm for energy-based models as policies, incorporating ranking noise contrastive estimation and learnable negative sampling.

Findings

01. Energy-based models can be trained effectively for policy representation.

02. Our method outperforms diffusion models in multi-modal robotic benchmarks.

03. The proposed approach is mathematically justified and scalable.

Abstract

A crucial design decision for any robot learning pipeline is the choice of policy representation: what type of model should be used to generate the next set of robot actions? Owing to the inherent multi-modal nature of many robotic tasks, combined with the recent successes in generative modeling, researchers have turned to state-of-the-art probabilistic models such as diffusion models for policy representation. In this work, we revisit the choice of energy-based models (EBM) as a policy class. We show that the prevailing folklore -- that energy models in high dimensional continuous spaces are impractical to train -- is false. We develop a practical training objective and algorithm for energy models which combines several key ingredients: (i) ranking noise contrastive estimation (R-NCE), (ii) learnable negative samplers, and (iii) non-adversarial joint training. We prove that our…
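
To make ingredient (i) concrete, here is a minimal sketch of a ranking noise contrastive loss for a single training pair. It assumes the standard ranking NCE form, in which the demonstrated action is ranked against K negatives drawn from a sampler q, and it assumes the sign convention p(y | x) ∝ exp(-E(x, y)); the paper's own convention may differ. The names `rnce_loss`, `energies`, and `log_q` are illustrative and do not come from the paper. The learnable sampler and joint training of ingredients (ii) and (iii) are not shown.

```python
import math


def rnce_loss(energies, log_q):
    """Ranking NCE loss for one (context, action) pair.

    energies[0] and log_q[0] belong to the positive (demonstrated) action;
    the remaining entries belong to K negatives drawn from the sampler q.
    Assumed convention: p(y | x) is proportional to exp(-E(x, y)).
    """
    # Noise-corrected logits: -E(x, y_k) - log q(y_k | x).
    logits = [-e - lq for e, lq in zip(energies, log_q)]
    # Numerically stable log-sum-exp over the positive and all negatives.
    m = max(logits)
    lse = m + math.log(sum(math.exp(l - m) for l in logits))
    # Negative log-probability that the positive is ranked first.
    return lse - logits[0]


# Usage: one positive and three negatives with identical energies and
# identical sampler densities, so the loss equals log(4).
uninformative = rnce_loss([0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0])
# A positive with much lower energy than the negatives drives the loss to ~0.
confident = rnce_loss([-50.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0])
```

Subtracting `log_q` from each logit corrects for how likely the sampler was to propose that candidate, which is what separates this objective from a plain softmax cross-entropy over the candidates.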

Taxonomy

Topics: Reinforcement Learning in Robotics · Generative Adversarial Networks and Image Synthesis

Methods: Diffusion