Theia: Distilling Diverse Vision Foundation Models for Robot Learning
Jinghuan Shang, Karl Schmeckpeper, Brandon B. May, Maria Vittoria Minniti, Tarik Kelestemur, David Watkins, and Laura Herlant

TL;DR
Theia is a novel vision foundation model for robot learning that distills knowledge from multiple models trained on diverse visual tasks, leading to improved performance with less data.
Contribution
We introduce Theia, a new approach that combines multiple off-the-shelf vision models to create a comprehensive visual representation for robot learning.
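To make the idea of distilling several teachers into one student concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the teacher names, the use of plain linear maps as per-teacher heads, and the choice of a cosine-distance plus smooth-L1 objective are all assumptions made for illustration. The student produces one feature vector, each hypothetical head maps it into a teacher's feature space, and the loss averages the mismatch over teachers.

```python
import math

def linear(weights, x):
    """Apply a linear map (list of rows) to vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def cosine_distance(a, b):
    """1 - cosine similarity, guarded against zero-norm vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / max(norm_a * norm_b, 1e-12)

def smooth_l1(a, b, beta=1.0):
    """Mean smooth-L1 (Huber-style) distance between two vectors."""
    total = 0.0
    for x, y in zip(a, b):
        d = abs(x - y)
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total / len(a)

def distillation_loss(student_feature, heads, teacher_features):
    """Average per-teacher loss (assumed form: cosine distance + smooth L1)."""
    loss = 0.0
    for name, target in teacher_features.items():
        predicted = linear(heads[name], student_feature)
        loss += cosine_distance(predicted, target) + smooth_l1(predicted, target)
    return loss / len(teacher_features)

# Hypothetical setup: a 2-D student feature and two teachers with
# different feature dimensions (2-D and 3-D).
student_feature = [1.0, 2.0]
heads = {
    "teacher_a": [[1.0, 0.0], [0.0, 1.0]],
    "teacher_b": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
}
matched_targets = {"teacher_a": [1.0, 2.0], "teacher_b": [1.0, 2.0, 3.0]}
mismatched_targets = {"teacher_a": [1.0, 2.0], "teacher_b": [3.0, 2.0, 1.0]}

loss_matched = distillation_loss(student_feature, heads, matched_targets)
loss_mismatched = distillation_loss(student_feature, heads, mismatched_targets)
```

In this toy setup the loss is zero (up to floating-point error) when every head reproduces its teacher exactly, and grows when any teacher's features are not matched. In practice the student and heads would be trained jointly to minimize it.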
Findings
Theia outperforms individual teacher models in robot tasks.
Theia uses less training data and smaller model sizes than the models it outperforms.
Higher entropy in feature norm distributions correlates with better robot learning performance.
Abstract
Vision-based robot policy learning, which maps visual inputs to actions, necessitates a holistic understanding of diverse visual tasks beyond single-task needs like classification or segmentation. Inspired by this, we introduce Theia, a vision foundation model for robot learning that distills multiple off-the-shelf vision foundation models trained on varied vision tasks. Theia's rich visual representations encode diverse visual knowledge, enhancing downstream robot learning. Extensive experiments demonstrate that Theia outperforms its teacher models and prior robot learning models using less training data and smaller model sizes. Additionally, we quantify the quality of pre-trained visual representations and hypothesize that higher entropy in feature norm distributions leads to improved robot learning performance. Code, models, and demo are available at https://theia.theaiinstitute.com.
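The entropy of a feature norm distribution can be illustrated with a short sketch. The exact definition used in the paper is not given in this summary, so the version below is one plausible reading and should be treated as an assumption: take the L2 norm of each patch token's feature vector, normalize the norms so they sum to one, and compute the Shannon entropy of the result. The function name and the toy inputs are hypothetical.

```python
import math

def feature_norm_entropy(patch_features):
    """Shannon entropy (nats) of per-patch L2 norms normalized to sum to 1.

    Assumed definition for illustration; not taken from the paper's code.
    """
    norms = [math.sqrt(sum(x * x for x in token)) for token in patch_features]
    total = sum(norms)
    if total == 0.0:
        raise ValueError("all feature norms are zero")
    probs = [n / total for n in norms]
    return -sum(p * math.log(p) for p in probs if p > 0.0)

# Four patch tokens with equal norms: the distribution is uniform,
# so the entropy reaches its maximum, log(4).
uniform_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]

# One token with a very large norm dominates: entropy drops sharply.
peaked_tokens = [[100.0, 0.0], [0.0, 0.1], [0.1, 0.0], [0.0, 0.1]]

entropy_uniform = feature_norm_entropy(uniform_tokens)
entropy_peaked = feature_norm_entropy(peaked_tokens)
```

Under this reading, a representation whose norm is concentrated in a few outlier tokens scores low, while one whose norm is spread evenly across patches scores high.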
Peer Reviews
Decision·CoRL 2024
Code & Models
- 🤗 theaiinstitute/theia-tiny-patch16-224-cdiv · model · 1.0k dl · ♡ 4
- 🤗 theaiinstitute/theia-small-patch16-224-cdiv · model · 304 dl · ♡ 3
- 🤗 theaiinstitute/theia-base-patch16-224-cdiv · model · 3.0k dl · ♡ 9
- 🤗 theaiinstitute/theia-tiny-patch16-224-cddsv · model · 4.7k dl · ♡ 4
- 🤗 theaiinstitute/theia-base-patch16-224-cddsv · model · 1.0k dl · ♡ 2
- 🤗 theaiinstitute/theia-small-patch16-224-cddsv · model · 95 dl
- 🤗 229nagibator229/theia-tiny-patch16-224-cdiv · model · 4 dl
- 🤗 229nagibator229/theia-small-patch16-224-cdiv · model · 1 dl
- 🤗 229nagibator229/theia-base-patch16-224-cdiv · model · 1 dl
Taxonomy
Topics: Robotic Path Planning Algorithms
