Fixed Anchors Are Not Enough: Dynamic Retrieval and Persistent Homology for Dataset Distillation

Muquan Li; Hang Gou; Yingyi Ma; Rongzheng Wang; Ke Qin; Tao He

arXiv:2602.24144 · cs.CV · March 18, 2026

TL;DR

This paper introduces RETA, a novel dataset distillation framework combining dynamic patch retrieval and persistent homology regularization to improve synthetic data quality and generalization across multiple datasets.

Contribution

RETA integrates dynamic patch retrieval and topology alignment to address the limitations of static real patches in decoupled dataset distillation, improving intra-class diversity and performance.

Findings

1. RETA outperforms baselines on CIFAR-100, Tiny-ImageNet, and ImageNet-1K.

2. RETA achieves 64.3% top-1 accuracy on ImageNet-1K with 50 images per class.

3. RETA consistently improves generalization and efficiency in dataset distillation.

Abstract

Decoupled dataset distillation (DD) compresses large corpora into a few synthetic images by matching a frozen teacher's statistics. However, current residual-matching pipelines rely on static real patches, creating a fit-complexity gap and a pull-to-anchor effect that reduce intra-class diversity and hurt generalization. To address these issues, we introduce RETA -- a Retrieval and Topology Alignment framework for decoupled DD. First, Dynamic Retrieval Connection (DRC) selects a real patch from a prebuilt pool by minimizing a fit-complexity score in teacher feature space; the chosen patch is injected via a residual connection to tighten feature fit while controlling injected complexity. Second, Persistent Topology Alignment (PTA) regularizes synthesis with persistent homology: we build a mutual k-NN feature graph, compute persistence images of components and loops, and penalize topology…
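The two mechanisms named in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not define the fit-complexity score, so the form used here (squared feature distance plus a weighted complexity term), the weight `lam`, and all function names are assumptions made for illustration. The second half builds a mutual k-NN graph, where two points are linked only if each is among the other's k nearest neighbors, and counts its connected components. That count is a single-scale stand-in for the "components" part of the topology; it is not persistent homology and produces no persistence image.

```python
from math import dist


def retrieve_patch(target_feat, pool_feats, pool_complexity, lam=0.1):
    """Return the index of the pool patch with the lowest score.

    Assumed score (hypothetical, not from the paper):
        squared distance to the target feature + lam * complexity.
    """
    scores = [
        dist(feat, target_feat) ** 2 + lam * cost
        for feat, cost in zip(pool_feats, pool_complexity)
    ]
    return min(range(len(scores)), key=scores.__getitem__)


def mutual_knn_edges(feats, k):
    """Edges (i, j), i < j, where i and j are in each other's k nearest neighbors."""
    n = len(feats)
    neighbors = []
    for i in range(n):
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: dist(feats[i], feats[j]),
        )
        neighbors.append(set(others[:k]))
    return {
        (i, j)
        for i in range(n)
        for j in neighbors[i]
        if i < j and i in neighbors[j]
    }


def count_components(n, edges):
    """Number of connected components, computed with union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in edges:
        parent[find(i)] = find(j)
    return len({find(x) for x in range(n)})


if __name__ == "__main__":
    # Toy teacher features: two well-separated clusters of two points each.
    feats = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
    edges = mutual_knn_edges(feats, k=1)
    print(sorted(edges), count_components(len(feats), edges))

    # A perfect-fit but costly patch versus a near-fit, cheap patch.
    pool = [(1.0, 0.0), (0.9, 0.0)]
    print(retrieve_patch((1.0, 0.0), pool, [5.0, 0.1], lam=0.1))
```

With `lam=0` the retrieval reduces to a plain nearest-neighbor lookup in feature space; raising `lam` trades fit for lower injected complexity, which is the balance the abstract attributes to DRC.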

Taxonomy

Topics: Topological and Geometric Data Analysis · Advanced Graph Neural Networks · Domain Adaptation and Few-Shot Learning