CAME: Contrastive Automated Model Evaluation

Ru Peng; Qiuyang Duan; Haobo Wang; Jiachen Ma; Yanbo Jiang; Yongjun Tu; Xiu Jiang; Junbo Zhao

arXiv:2308.11111 · cs.CV · August 23, 2023

TL;DR

CAME introduces a training-set-independent AutoEval framework that uses contrastive loss to predict model performance on unlabeled data, achieving state-of-the-art results.

Contribution

It proposes a novel AutoEval method that eliminates reliance on training data by leveraging contrastive loss and theoretical analysis.
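
The page does not say which contrastive loss CAME uses, so the following is only a minimal sketch of one common choice, the NT-Xent loss, computed over two augmented views of the same unlabeled batch. The function names `cosine` and `nt_xent_loss`, the `temperature` default, and the pure-Python list representation of embeddings are all assumptions made for illustration; they are not taken from the paper or its repository.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent_loss(view_a, view_b, temperature=0.5):
    """Mean NT-Xent loss for a batch of paired embeddings.

    view_a[i] and view_b[i] are assumed to be embeddings of two
    augmentations of the same unlabeled sample i (hypothetical setup).
    """
    embeddings = list(view_a) + list(view_b)
    n = len(view_a)
    total = 0.0
    for i in range(2 * n):
        positive = (i + n) % (2 * n)  # index of the other view of sample i
        # Similarities to every other embedding, including the positive one.
        logits = [
            cosine(embeddings[i], embeddings[j]) / temperature
            for j in range(2 * n)
            if j != i
        ]
        positive_logit = cosine(embeddings[i], embeddings[positive]) / temperature
        log_denominator = math.log(sum(math.exp(z) for z in logits))
        total += log_denominator - positive_logit
    return total / (2 * n)


# Toy embeddings: matched pairs should give a lower loss than mismatched pairs.
anchors = [[1.0, 0.0], [0.0, 1.0]]
matched = [[1.0, 0.1], [0.1, 1.0]]
mismatched = [[0.1, 1.0], [1.0, 0.1]]

loss_matched = nt_xent_loss(anchors, matched)
loss_mismatched = nt_xent_loss(anchors, mismatched)
```

No labels enter the computation, which is what makes a loss of this kind usable on an unlabeled test set.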

Findings

01 CAME outperforms previous AutoEval methods significantly.

02 It establishes a predictable relationship between contrastive loss and model performance.

03 CAME achieves new state-of-the-art results in AutoEval.
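
The "predictable relationship" in finding 02 can be pictured as a simple regression. The sketch below assumes the relationship is roughly linear and that a few labeled calibration sets are available to fit it; both are assumptions for illustration, and the numbers are invented toy values, not results from the paper. The helper names `fit_line` and `predict_accuracy` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_accuracy(contrastive_loss, slope, intercept):
    """Estimate accuracy from a contrastive loss, clipped to [0, 1]."""
    estimate = slope * contrastive_loss + intercept
    return min(1.0, max(0.0, estimate))


# Invented toy values: contrastive loss and measured accuracy on four
# hypothetical labeled calibration sets.
calibration_losses = [0.5, 1.0, 1.5, 2.0]
calibration_accuracies = [0.9, 0.8, 0.7, 0.6]

slope, intercept = fit_line(calibration_losses, calibration_accuracies)

# Contrastive loss measured on a new unlabeled test set (also invented).
estimated_accuracy = predict_accuracy(1.2, slope, intercept)
```

Once the line is fitted, estimating accuracy on a new test set needs only that set's contrastive loss, with no labels and no access to the training set.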

Abstract

The Automated Model Evaluation (AutoEval) framework entertains the possibility of evaluating a trained machine learning model without resorting to a labeled testing set. Despite the promise and some decent results, existing AutoEval methods rely heavily on computing distribution shifts between the unlabeled testing set and the training set. We believe this reliance on the training set becomes another obstacle in shipping this technology to real-world ML development. In this work, we propose Contrastive Automated Model Evaluation (CAME), a novel AutoEval framework that removes the training set from the loop. The core idea of CAME is based on a theoretical analysis that links model performance to a contrastive loss. Further, with extensive empirical validation, we establish a predictable relationship between the two, simply by deducing on the unlabeled/unseen testing…

Code & Models

Repositories

pengr/contrastive_autoeval (PyTorch, official)

Taxonomy

Topics: Machine Learning and Data Classification · Adversarial Robustness in Machine Learning · Machine Learning and Algorithms