Leveraging Multi-Modal Information to Enhance Dataset Distillation

Zhe Li; Hadrien Reynaud; Bernhard Kainz

arXiv:2505.08605·cs.CV·December 10, 2025

TL;DR

This paper introduces a multi-modal dataset distillation method that combines visual and textual information with object-centric masking to produce compact, privacy-preserving synthetic datasets with improved utility.

Contribution

It proposes a novel multi-modal framework with caption-guided supervision and object-centric masking, enhancing dataset distillation beyond visual-only approaches.

Findings

01 Improves downstream task performance.

02 Enhances privacy by reducing real data exposure.

03 Achieves better object-focused data representation.

Abstract

Dataset distillation aims to create a small and highly representative synthetic dataset that preserves the essential information of a larger real dataset. Beyond reducing storage and computational costs, related approaches offer a promising avenue for privacy preservation in computer vision by eliminating the need to store or share sensitive real-world images. Existing methods focus solely on optimizing visual representations, overlooking the potential of multi-modal information. In this work, we propose a multi-modal dataset distillation framework that incorporates two key enhancements: caption-guided supervision and object-centric masking. To leverage textual information, we introduce two strategies: caption concatenation, which fuses caption embeddings with visual features during classification, and caption matching, which enforces semantic alignment between real and synthetic data…
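
The two caption strategies named in the abstract can be illustrated with a minimal sketch. The abstract is truncated and does not give the exact formulation, so everything below is an assumption made for illustration: the function names are hypothetical, the classifier is a plain linear layer, and the matching loss is taken to be the squared distance between the mean caption embeddings of a real batch and a synthetic batch. It is not the authors' implementation.

```python
from typing import List, Sequence

Vector = List[float]


def concat_features(visual: Sequence[float], caption: Sequence[float]) -> Vector:
    """Caption concatenation: append the caption embedding to the visual feature."""
    return list(visual) + list(caption)


def linear_logits(features: Sequence[float],
                  weights: Sequence[Sequence[float]],
                  bias: Sequence[float]) -> Vector:
    """Hypothetical linear classifier applied to the fused feature vector."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, bias)]


def mean_vector(batch: Sequence[Sequence[float]]) -> Vector:
    """Element-wise mean over a batch of equal-length embeddings."""
    n = len(batch)
    return [sum(column) / n for column in zip(*batch)]


def caption_matching_loss(real_captions: Sequence[Sequence[float]],
                          synthetic_captions: Sequence[Sequence[float]]) -> float:
    """Assumed matching loss: squared L2 distance between batch-mean embeddings."""
    mu_real = mean_vector(real_captions)
    mu_syn = mean_vector(synthetic_captions)
    return sum((r - s) ** 2 for r, s in zip(mu_real, mu_syn))


# Toy example: a 2-d visual feature fused with a 1-d caption embedding.
fused = concat_features([0.5, -1.0], [2.0])
logits = linear_logits(fused, weights=[[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]], bias=[0.0, 0.0])

# Toy example: the synthetic caption embedding is pulled toward the real batch mean.
loss = caption_matching_loss([[1.0, 0.0], [3.0, 2.0]], [[0.0, 1.0]])
```

In a real pipeline the embeddings would come from pretrained image and text encoders, and the loss would be minimized with respect to the synthetic data; both are omitted here to keep the sketch self-contained.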

Taxonomy

Topics: Data Stream Mining Techniques · Machine Learning and Data Classification · Neural Networks and Applications

MethodsFocus