How Much Data Is Enough? Uniform Convergence Bounds for Generative & Vision-Language Models under Low-Dimensional Structure

Paul M. Thompson

arXiv:2512.23109·cs.LG·December 30, 2025

Open Access

TL;DR

This paper establishes finite-sample uniform convergence bounds for generative and vision-language models under low-dimensional structure, providing insights into data requirements for reliable predictions in biomedical applications.

Contribution

The paper introduces non-asymptotic uniform convergence bounds for VLMs under an assumed low-dimensional semantic structure, linking spectral properties to sample-size requirements.

Findings

01 Bounds depend on intrinsic dimension, not ambient space

02 Spectrum decay influences data requirements

03 Current datasets may suffice for uniform reliability in biomedicine

Abstract

Modern generative and vision-language models (VLMs) are increasingly used in scientific and medical decision support, where predicted probabilities must be both accurate and well calibrated. Despite strong empirical results with moderate data, it remains unclear when such predictions generalize uniformly across inputs, classes, or subpopulations, rather than only on average, a critical issue in biomedicine, where rare conditions and specific groups can exhibit large errors even when overall loss is low. We study this question from a finite-sample perspective and ask: under what structural assumptions can generative and VLM-based predictors achieve uniformly accurate and calibrated behavior with practical sample sizes? Rather than analyzing arbitrary parameterizations, we focus on induced families of classifiers obtained by varying prompts or semantic embeddings within a restricted…

Taxonomy

Topics: Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning