Plex: Towards Reliability using Pretrained Large Model Extensions
Dustin Tran, Jeremiah Liu, Michael W. Dusenberry, Du Phan, Mark Collier, Jie Ren, Kehang Han, Zi Wang, Zelda Mariet, Huiyi Hu, Neil Band, Tim G. J. Rudner, Karan Singhal, Zachary Nado, Joost van Amersfoort, Andreas Kirsch, Rodolphe Jenatton, Nithum Thain, Honglin Yuan

TL;DR
This paper introduces Plex, a set of pretrained large model extensions for vision and language that significantly enhance reliability across diverse decision-making tasks, outperforming previous methods without extensive tuning.
Contribution
The paper presents Plex, novel pretrained model extensions for vision and language that improve reliability across multiple tasks and simplify evaluation protocols.
Findings
Plex achieves state-of-the-art reliability performance.
Scaling model size and data improves reliability.
Plex is effective on zero-shot open set recognition and active learning.
Abstract
A recent trend in artificial intelligence is the use of pretrained models for language and vision tasks, which have achieved extraordinary performance but also puzzling failures. Probing these models' abilities in diverse ways is therefore critical to the field. In this paper, we explore the reliability of models, where we define a reliable model as one that not only achieves strong predictive performance but also performs well consistently over many decision-making tasks involving uncertainty (e.g., selective prediction, open set recognition), robust generalization (e.g., accuracy and proper scoring rules such as log-likelihood on in- and out-of-distribution datasets), and adaptation (e.g., active learning, few-shot uncertainty). We devise 10 types of tasks over 40 datasets in order to evaluate different aspects of reliability on both vision and language domains. To improve…
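To make two of the quantities named in the abstract concrete, here is a minimal sketch of selective prediction (accuracy when the model abstains on its least confident inputs) and of mean log-likelihood as a proper scoring rule. It is an illustration under stated assumptions, not the paper's evaluation code: the function names, the use of the maximum predicted probability as the confidence score, and the toy data are all hypothetical.

```python
import math


def selective_accuracy(probs, labels, coverage):
    """Accuracy on the `coverage` fraction of examples with highest confidence.

    Assumption: confidence is the maximum predicted class probability.
    """
    if not 0.0 < coverage <= 1.0:
        raise ValueError("coverage must be in (0, 1]")
    scored = []
    for p, y in zip(probs, labels):
        conf = max(p)            # confidence score for this example
        pred = p.index(conf)     # predicted class = argmax
        scored.append((conf, pred == y))
    # Most confident first; the model abstains on the tail.
    scored.sort(key=lambda t: t[0], reverse=True)
    keep = max(1, math.ceil(coverage * len(scored)))
    kept = scored[:keep]
    return sum(correct for _, correct in kept) / len(kept)


def mean_log_likelihood(probs, labels, eps=1e-12):
    """Average log-probability assigned to the true class (higher is better)."""
    return sum(math.log(max(p[y], eps)) for p, y in zip(probs, labels)) / len(labels)


# Hypothetical toy predictions for a two-class problem.
probs = [[0.9, 0.1], [0.6, 0.4], [0.55, 0.45], [0.2, 0.8]]
labels = [0, 1, 0, 1]

full = selective_accuracy(probs, labels, coverage=1.0)
half = selective_accuracy(probs, labels, coverage=0.5)
ll = mean_log_likelihood(probs, labels)
```

In this toy data the only wrong prediction is also a low-confidence one, so accuracy rises as coverage drops. That is the behavior selective prediction rewards: a model whose confidence tracks its correctness.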
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Domain Adaptation and Few-Shot Learning
