Probabilistic Embeddings for Frozen Vision-Language Models: Uncertainty Quantification with Gaussian Process Latent Variable Models
Aishwarya Venkataramanan, Paul Bodesheim, Joachim Denzler

TL;DR
This paper introduces GroVE, a post-hoc method that leverages Gaussian Process Latent Variable Models to generate uncertainty-aware probabilistic embeddings from frozen vision-language models, enhancing uncertainty quantification without retraining the models.
Contribution
GroVE provides a novel post-hoc approach to derive probabilistic embeddings from frozen VLMs using GPLVM, improving uncertainty calibration across multiple tasks.
Findings
Achieves state-of-the-art uncertainty calibration in downstream tasks
Effective in cross-modal retrieval, visual question answering, and active learning
Does not require retraining large-scale VLMs
Abstract
Vision-Language Models (VLMs) learn joint representations by mapping images and text into a shared latent space. However, recent research highlights that deterministic embeddings from standard VLMs often struggle to capture the uncertainties arising from the ambiguities in visual and textual descriptions and the multiple possible correspondences between images and texts. Existing approaches tackle this by learning probabilistic embeddings during VLM training, which demands large datasets and does not leverage the powerful representations already learned by large-scale VLMs like CLIP. In this paper, we propose GroVE, a post-hoc approach to obtaining probabilistic embeddings from frozen VLMs. GroVE builds on Gaussian Process Latent Variable Model (GPLVM) to learn a shared low-dimensional latent space where image and text inputs are mapped to a unified representation, optimized through…
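As a rough illustration of how a Gaussian process can turn a latent point into a distribution over embeddings rather than a single vector, the sketch below runs plain GP regression from a low-dimensional latent space to an embedding space. It is not the authors' implementation: the RBF kernel, the noise level, the 2-D latent space, the fixed (not learned) latent points, and the random stand-in embeddings are all assumptions made for the example, and the names `rbf_kernel` and `gp_predict` are hypothetical. A full GPLVM would additionally optimize the latent points.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    # Squared Euclidean distances between every row of a and every row of b.
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return variance * np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def gp_predict(z_train, y_train, z_test, noise=1e-2):
    """GP posterior mean and per-point variance of embeddings y given latents z."""
    k_tt = rbf_kernel(z_train, z_train) + noise * np.eye(len(z_train))
    k_st = rbf_kernel(z_test, z_train)
    chol = np.linalg.cholesky(k_tt)
    # Solve (K + noise * I) alpha = y using the Cholesky factor.
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_st @ alpha
    v = np.linalg.solve(chol, k_st.T)
    var = rbf_kernel(z_test, z_test).diagonal() - np.sum(v**2, axis=0)
    return mean, np.maximum(var, 0.0)

rng = np.random.default_rng(0)
z = rng.normal(size=(20, 2))      # assumed 2-D latent points (fixed here, not learned)
emb = rng.normal(size=(20, 8))    # random stand-in for frozen VLM embeddings
emb /= np.linalg.norm(emb, axis=1, keepdims=True)

# Query one latent point seen during fitting and one far from all of them.
z_query = np.vstack([z[:1], z[:1] + 10.0])
mean, var = gp_predict(z, emb, z_query)
```

Each query yields a Gaussian over the embedding: `mean` is its center and `var` its spread. In this toy setup the variance is small at the latent point the GP was fitted on and returns to the prior variance far from the data, which is the kind of signal an uncertainty-aware embedding can expose.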
Taxonomy
Topics: Semantic Web and Ontologies · Geographic Information Systems Studies · AI-based Problem Solving and Planning
Methods: Gaussian Process · Contrastive Language-Image Pre-training
