Visionary: Vision architecture discovery for robot learning

Iretiayo Akinola; Anelia Angelova; Yao Lu; Yevgen Chebotar; Dmitry Kalashnikov; Jacob Varley; Julian Ibarz; Michael S. Ryoo

arXiv:2103.14633·cs.RO·March 29, 2021

TL;DR

Visionary introduces a novel architecture search method that automatically designs neural networks for robot manipulation, improving success rates and grasping performance through learned attention mechanisms.

Contribution

The paper presents what the authors describe as the first successful neural architecture search and attention connectivity search for a real-robot manipulation task.

Findings

01. Achieves higher task success rates than a recent high-performing baseline, in some cases by a large margin.

02. Improves grasping performance by 6% on real robots.

03. Designs architectures automatically while training on the task.

Abstract

We propose a vision-based architecture search algorithm for robot manipulation learning, which discovers interactions between low-dimensional action inputs and high-dimensional visual inputs. Our approach automatically designs architectures while training on the task, discovering novel ways of combining and attending image feature representations with actions as well as features from previous layers. The resulting architectures demonstrate better task success rates, in some cases by a large margin, compared to a recent high-performing baseline. Our real-robot experiments also confirm that it improves grasping performance by 6%. This is the first approach to demonstrate a successful neural architecture search and attention connectivity search for a real-robot task.
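To make the idea of "attending image feature representations with actions" concrete, here is a minimal sketch of one possible form of action-conditioned attention: a low-dimensional action is mapped to one gate per feature channel, and each channel of the image features is scaled by its gate. This is an illustrative assumption, not the architecture found by the paper's search. The function names (`action_gate`, `attend`), the sigmoid gating, and all shapes and random values are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def action_gate(action, weights):
    """Map a low-dimensional action to one gate in (0, 1) per feature channel.

    `weights` has one row per channel and one column per action dimension.
    (Hypothetical linear projection; a learned model would train these.)
    """
    return [sigmoid(sum(w * a for w, a in zip(row, action))) for row in weights]


def attend(features, gates):
    """Scale every spatial value of channel c by gates[c]."""
    return [[g * v for v in channel] for channel, g in zip(features, gates)]


# Toy sizes, chosen only for illustration.
random.seed(0)
num_channels, num_pixels, action_dim = 4, 6, 3

# Stand-in for a flattened image feature map: one list of values per channel.
features = [[random.random() for _ in range(num_pixels)] for _ in range(num_channels)]
# Stand-in for learned projection weights.
weights = [[random.uniform(-1, 1) for _ in range(action_dim)] for _ in range(num_channels)]
action = [0.2, -0.5, 0.1]

gates = action_gate(action, weights)
attended = attend(features, gates)
```

In a searched architecture, which layers receive such action-dependent gating, and which earlier layers feed into later ones, would be among the choices made by the search rather than fixed by hand as in this sketch.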
