A Generalization Theory for Zero-Shot Prediction

Ronak Mehta; Zaid Harchaoui

arXiv:2507.09128 · stat.ML · September 3, 2025

TL;DR

This paper introduces a theoretical framework for zero-shot prediction, in which representations from a pre-trained foundation model are used on a downstream task that has no labeled data. It focuses on the quantities these representations estimate and on the conditional independence conditions that allow the approach to generalize.

Contribution

The paper provides a formal analysis of zero-shot prediction, identifying the target quantities it learns and the conditional independence relationships that enable it to generalize.

Findings

01 Defines the target quantities for zero-shot prediction

02 Identifies key conditional independence relationships

03 Provides insights into the generalization ability of foundation models

Abstract

A modern paradigm for generalization in machine learning and AI consists of pre-training a task-agnostic foundation model, generally obtained using self-supervised and multimodal contrastive learning. The resulting representations can be used for prediction on a downstream task for which no labeled data is available. We present a theoretical framework to better understand this approach, called zero-shot prediction. We identify the target quantities that zero-shot prediction aims to learn, or learns in passing, and the key conditional independence relationships that enable its generalization ability.
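
To make the setting concrete, here is a minimal sketch of how zero-shot prediction is commonly carried out with contrastively pre-trained encoders: each input is assigned the class whose text-prompt embedding is most similar to the input's embedding. This illustrates the general procedure only and is not the paper's formal framework. The function names, the toy embeddings (which stand in for the outputs of pre-trained encoders), and the temperature value are all assumptions made for illustration.

```python
import numpy as np


def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale each row to unit Euclidean norm."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def zero_shot_predict(input_emb: np.ndarray,
                      prompt_emb: np.ndarray,
                      temperature: float = 0.07):
    """Hypothetical helper: classify inputs with no labeled downstream data.

    input_emb:  (n, d) embeddings of the inputs to classify.
    prompt_emb: (k, d) embeddings of one text prompt per class.
    temperature: assumed softmax temperature, not taken from the paper.
    """
    x = l2_normalize(input_emb)
    t = l2_normalize(prompt_emb)
    # Cosine similarity between every input and every class prompt.
    logits = x @ t.T / temperature
    # Numerically stable softmax over classes.
    logits = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    probs = probs / probs.sum(axis=1, keepdims=True)
    return probs.argmax(axis=1), probs


# Toy embeddings standing in for pre-trained encoder outputs (assumed values).
prompt_emb = np.array([[1.0, 0.0, 0.0],   # class 0 prompt
                       [0.0, 1.0, 0.0]])  # class 1 prompt
input_emb = np.array([[0.9, 0.1, 0.0],
                      [0.2, 0.8, 0.1],
                      [0.6, 0.5, 0.0]])

labels, probs = zero_shot_predict(input_emb, prompt_emb)
print(labels)
```

No labeled examples from the downstream task appear anywhere in this procedure; the class prompts alone define the classifier, which is the property the theoretical framework sets out to explain.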

Videos

A Generalization Theory for Zero-Shot Prediction · SlidesLive
