Efficient Imitation Without Demonstrations via Value-Penalized Auxiliary Control from Examples

Trevor Ablett; Bryan Chan; Jayce Haoran Wang; Jonathan Kelly

arXiv:2407.03311 · cs.RO · September 16, 2025

TL;DR

VPACE is a reinforcement learning algorithm for example-based control. It improves exploration by adding examples of simple auxiliary tasks and an above-success-level value penalty, which substantially improves learning efficiency on simulated and real robotic tasks while keeping value estimates bounded.

Contribution

The paper introduces VPACE, a method that improves the sample efficiency of example-based control, where the agent learns from examples of completed tasks rather than from full-trajectory expert demonstrations or hand-crafted rewards.

Findings

01 Substantially improves learning efficiency on challenging tasks in both simulated and real robotic environments

02 Maintains bounded value estimates during training

03 Preliminary results suggest it may learn more efficiently than methods that use full trajectories or true sparse rewards

Abstract

Common approaches to providing feedback in reinforcement learning are the use of hand-crafted rewards or full-trajectory expert demonstrations. Alternatively, one can use examples of completed tasks, but such an approach can be extremely sample inefficient. We introduce value-penalized auxiliary control from examples (VPACE), an algorithm that significantly improves exploration in example-based control by adding examples of simple auxiliary tasks and an above-success-level value penalty. Across both simulated and real robotic environments, we show that our approach substantially improves learning efficiency for challenging tasks, while maintaining bounded value estimates. Preliminary results also suggest that VPACE may learn more efficiently than the more common approaches of using full trajectories or true sparse rewards. Project site: https://papers.starslab.ca/vpace/.
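One way to read the "above-success-level value penalty" is as a hinge-style term that penalizes value estimates on ordinary (non-example) data whenever they exceed the average value assigned to the success examples. The following is a minimal pure-Python sketch under that assumption. The function name `above_success_penalty`, the squared-hinge form, and the weight `lam` are hypothetical illustrations, not the paper's actual loss or the code in utiasSTARS/vpace.

```python
def above_success_penalty(q_buffer, v_success, lam=1.0):
    """Squared-hinge penalty on values above the mean success-example value.

    q_buffer:  value estimates for ordinary (non-example) state-action pairs
    v_success: value estimates for states drawn from the success examples
    lam:       hypothetical penalty weight (an assumption, not from the paper)
    """
    # Treat the mean value of the success examples as the "success level".
    success_level = sum(v_success) / len(v_success)
    # Only estimates that exceed the success level contribute to the penalty.
    excess_sq = [max(q - success_level, 0.0) ** 2 for q in q_buffer]
    return lam * sum(excess_sq) / len(excess_sq)


# Illustrative numbers only: one estimate (1.5) overshoots a success level of 1.0.
penalty = above_success_penalty([0.2, 0.9, 1.5], [1.0, 1.0])
```

In this sketch the term is zero whenever every estimate stays at or below the success level, so it would only act on overestimated values. That is in the spirit of the abstract's claim about bounded value estimates, but the exact formulation should be taken from the paper.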

Code & Models

Repositories

utiasSTARS/vpace · tf · Official

Taxonomy

Topics: Machine Learning and Algorithms · Logic, Reasoning, and Knowledge