The Unsurprising Effectiveness of Pre-Trained Vision Models for Control

Simone Parisi; Aravind Rajeswaran; Senthil Purushwalkam; Abhinav Gupta

arXiv:2203.03580 · cs.CV · August 10, 2022 · 21 cites

Open Access

TL;DR

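Visual representations pre-trained on large-scale computer vision datasets can be competitive with, or even better than, ground-truth states for training control policies in Habitat, DeepMind Control, Adroit, and Franka Kitchen, challenging the practice of training visuo-motor policies from scratch on data from the deployment environment.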

Contribution

This paper demonstrates that off-the-shelf pre-trained vision models can be competitive with, or superior to, ground-truth states for control, supported by extensive empirical evaluation across four control domains (Habitat, DeepMind Control, Adroit, Franka Kitchen).

Findings

01

Pre-trained visual features can match or even outperform ground-truth states in control tasks.

02

Representations trained on out-of-domain, large-scale computer vision datasets can be used effectively for control policy training.

03

The representation training method, the data augmentations, and the feature hierarchy each affect control performance.

Abstract

Recent years have seen the emergence of pre-trained representations as a powerful abstraction for AI applications in computer vision, natural language, and speech. However, policy learning for control is still dominated by a tabula-rasa learning paradigm, with visuo-motor policies often trained from scratch using data from deployment environments. In this context, we revisit and study the role of pre-trained visual representations for control, and in particular representations trained on large-scale computer vision datasets. Through extensive empirical evaluation in diverse control domains (Habitat, DeepMind Control, Adroit, Franka Kitchen), we isolate and study the importance of different representation training methods, data augmentations, and feature hierarchies. Overall, we find that pre-trained visual representations can be competitive or even better than ground-truth state…
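
The setup the abstract describes, a fixed pre-trained visual representation feeding a small trainable policy, can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: a fixed random projection with a ReLU stands in for a pre-trained vision model, the policy is a linear head fitted by imitating synthetic expert actions, and all names (`encode`, `train_policy`, `ENCODER_W`, `EXPERT_W`) are hypothetical. Only the policy weights are updated; the encoder stays frozen.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

IMG_DIM, FEAT_DIM, ACT_DIM = 16, 8, 2  # toy sizes, chosen arbitrarily


def rand_matrix(rows, cols, scale):
    """Matrix of independent Gaussian entries with the given standard deviation."""
    return [[random.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


# Frozen "encoder": a fixed random projection followed by ReLU.
# ASSUMPTION: this stands in for a pre-trained vision model; it is never updated.
ENCODER_W = rand_matrix(FEAT_DIM, IMG_DIM, IMG_DIM ** -0.5)
ENCODER_SNAPSHOT = [row[:] for row in ENCODER_W]  # copy, to check it stays frozen


def matvec(matrix, vector):
    """Matrix-vector product on nested lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def encode(image):
    """Map a flattened image to features using the frozen encoder."""
    return [max(0.0, z) for z in matvec(ENCODER_W, image)]


def mse(weights, dataset):
    """Mean squared error between policy actions and expert actions."""
    total = 0.0
    for features, action in dataset:
        pred = matvec(weights, features)
        total += sum((p - a) ** 2 for p, a in zip(pred, action))
    return total / (len(dataset) * ACT_DIM)


def train_policy(dataset, epochs=200, lr=0.05):
    """Fit a linear policy head on frozen features with plain SGD."""
    weights = [[0.0] * FEAT_DIM for _ in range(ACT_DIM)]
    for _ in range(epochs):
        for features, action in dataset:
            pred = matvec(weights, features)
            for i in range(ACT_DIM):
                err = pred[i] - action[i]
                for j in range(FEAT_DIM):
                    weights[i][j] -= lr * err * features[j]
    return weights


# Synthetic demonstrations: a hypothetical expert that acts linearly on the features.
EXPERT_W = rand_matrix(ACT_DIM, FEAT_DIM, 1.0)
images = [[random.gauss(0.0, 1.0) for _ in range(IMG_DIM)] for _ in range(32)]
dataset = [(encode(img), matvec(EXPERT_W, encode(img))) for img in images]

before = mse([[0.0] * FEAT_DIM for _ in range(ACT_DIM)], dataset)
policy_w = train_policy(dataset)
after = mse(policy_w, dataset)
print(f"imitation loss before: {before:.4f}, after: {after:.6f}")
```

In a realistic version of this sketch, `encode` would wrap an actual pre-trained network applied to camera frames and the policy would typically be a small neural network, but the division of labor is the same: features come from a model trained elsewhere, and only the policy is learned from control data.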

Taxonomy

Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)