Dataset Distillation via Committee Voting

Jiacheng Cui; Zhaoyi Li; Xiaochen Ma; Xinyue Bi; Yaxin Luo; Zhiqiang Shen

arXiv:2501.07575·cs.CV·February 17, 2026

TL;DR

This paper introduces CV-DD, a novel dataset distillation method that uses committee voting among multiple models to generate high-quality, diverse, and robust synthetic data, improving generalization and transfer performance.

Contribution

The paper proposes a new committee voting approach for dataset distillation that leverages multiple models to enhance data quality and robustness, outperforming existing methods.

Findings

01

CV-DD achieves state-of-the-art performance across multiple datasets.

02

The method improves generalization and transferability.

03

It reduces model bias and overfitting in dataset distillation.

Abstract

Dataset distillation aims to synthesize a compact yet representative dataset that preserves the essential characteristics of the original data for efficient model training. Existing methods mainly focus on improving data-synthetic alignment or scaling distillation to large datasets. In this work, we propose Committee Voting for Dataset Distillation (CV-DD), an orthogonal approach that leverages the collective knowledge of multiple models to produce higher-quality distilled data. We first establish a strong baseline that achieves state-of-the-art performance through modern architectural and optimization choices. By integrating distributions and predictions from multiple models and generating high-quality soft labels, our method captures a broader range of data characteristics, reduces model-specific bias and the impact of…
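To make the idea of combining predictions from several models concrete, here is a minimal sketch of performance-weighted voting over soft labels. It is not the paper's implementation: the softmax-over-accuracy weighting, the `temperature` parameter's exact role, and the function names `committee_weights` and `vote_soft_label` are all assumptions chosen for illustration.

```python
import math


def committee_weights(prior_accuracies, temperature=1.0):
    """Turn each member's prior accuracy into a voting weight.

    Assumption: a softmax over accuracy / temperature. A lower temperature
    concentrates weight on the strongest member; a higher one flattens it.
    """
    scaled = [acc / temperature for acc in prior_accuracies]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def vote_soft_label(member_probs, weights):
    """Weighted average of the members' class-probability vectors."""
    num_classes = len(member_probs[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, member_probs))
        for c in range(num_classes)
    ]


if __name__ == "__main__":
    # Hypothetical committee of three models with made-up validation accuracies.
    weights = committee_weights([0.70, 0.75, 0.80], temperature=0.1)
    # Hypothetical per-model predictions for one synthetic image, two classes.
    predictions = [[0.9, 0.1], [0.6, 0.4], [0.7, 0.3]]
    print(weights)
    print(vote_soft_label(predictions, weights))
```

Because the weights sum to one and each member outputs a probability vector, the voted label is itself a valid probability distribution, which is what allows it to be used as a soft training target.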

Peer Reviews

Decision·Submitted to ICLR 2026

Reviewer 01·Rating 4·Confidence 3

Strengths

1. Ensemble-based methods in dataset distillation remain an emerging area of research.
2. The performance improvements reported in this paper are significant.

Weaknesses

1. Although the proposed CV-DD framework enhances cross-model generalization by leveraging diverse architectures, its Batch-Specific Soft Labeling (BSSL) mechanism constrains the method to architectures that include Batch Normalization (BN) layers. This dependency limits the generalization and versatility of the approach when applied to models without BN components.
2. The visualization in Figure 6 suffers from poor distinguishability between curves due to the use of similar colors and marker s

Reviewer 02·Rating 6·Confidence 3

Strengths

1. Committee voting is a novel approach for dataset distillation that improves data representativeness through prior performance–based weighting.
2. CV-DD achieves state-of-the-art results across datasets, e.g., 59.5% on ImageNet-1K (IPC=50) versus 56.5% for RDED, with strong cross-architecture generalization.
3. Ablations confirm the effectiveness of voting temperature, prior-guided voting, and BSSL in reducing distribution shift.

Weaknesses

**Major:**
1. A large portion of the paper focuses on building a strong baseline SRe2L++, which already incorporates several optimizations from recent SOTA methods such as EDC and RDED. Since SRe2L++ itself performs at a very high level, the additional gain from the committee voting mechanism appears modest, reducing the overall sense of novelty.
2. Although the method reports better per-iteration efficiency than G-VBSM, the overall training pipeline is complex. It requires pretraining all comm

Reviewer 03·Rating 2·Confidence 4

Strengths

The overall presentation quality is good. The figures are visually clear and help the reader understand the methodology. The motivation of performance-guided voting is intuitively reasonable in principle.

Weaknesses

The proposed method appears to be a marginal modification over SRe²L. The core contribution, i.e., “Prior-based Voting”, is computationally expensive yet results in only marginal improvements, as shown in Table 4 (middle). Several enhanced training tricks are adopted only for the proposed method and SRe²L++, like "Smoothed Learning Rate & Smaller Batch Size", which is not applied to other competing methods, leading to potentially unfair comparisons. The comparison methods are limited. The main

Code & Models

Repositories

jiacheng8/cv-dd
PyTorch·Official