C2G-KD: PCA-Constrained Generator for Data-Free Knowledge Distillation

Magnus Bengtsson, Kenneth Östberg

arXiv:2507.18533·cs.LG·July 25, 2025

Open Access

TL;DR

C2G-KD is a data-free knowledge distillation method that uses a PCA-constrained generator to produce synthetic data guided by a teacher model, enabling effective training without real data.

Contribution

It introduces a PCA-based geometric constraint for synthetic data generation in data-free knowledge distillation, improving diversity and topological consistency.

Findings

01 Effective on MNIST with minimal class data

02 Preserves topological structure of classes

03 Generates useful synthetic samples for distillation

Abstract

We introduce C2G-KD, a data-free knowledge distillation framework where a class-conditional generator is trained to produce synthetic samples guided by a frozen teacher model and geometric constraints derived from PCA. The generator never observes real training data but instead learns to activate the teacher's output through a combination of semantic and structural losses. By constraining generated samples to lie within class-specific PCA subspaces estimated from as few as two real examples per class, we preserve topological consistency and diversity. Experiments on MNIST show that even minimal class structure is sufficient to bootstrap useful synthetic training pipelines.
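
The PCA constraint described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (`class_pca_subspace`, `project`, `subspace_residual`), the use of flattened 784-dimensional inputs, and the squared-distance penalty are all assumptions made for illustration. The sketch fits a class-specific affine PCA subspace to a handful of examples and measures how far a candidate sample lies from it, which is one plausible form a structural loss could take.

```python
import numpy as np


def class_pca_subspace(examples, n_components=1):
    """Fit an affine PCA subspace to a few flattened examples of one class.

    examples: array of shape (n_examples, n_features).
    Returns (mean, basis), where basis has shape (n_components, n_features)
    and orthonormal rows. With two examples, the centered data has rank at
    most 1, so n_components=1 is the only informative choice.
    """
    X = np.asarray(examples, dtype=float)
    mean = X.mean(axis=0)
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(sample, mean, basis):
    """Orthogonally project a sample onto the affine subspace mean + span(basis)."""
    centered = np.asarray(sample, dtype=float) - mean
    return mean + (centered @ basis.T) @ basis


def subspace_residual(sample, mean, basis):
    """Squared distance from the sample to the subspace (hypothetical structural penalty)."""
    diff = np.asarray(sample, dtype=float) - project(sample, mean, basis)
    return float(np.sum(diff ** 2))


# Usage with random stand-ins for two flattened 28x28 images of one class.
rng = np.random.default_rng(0)
two_examples = rng.random((2, 784))
mean, basis = class_pca_subspace(two_examples, n_components=1)
candidate = rng.random(784)          # stand-in for a generator output
penalty = subspace_residual(candidate, mean, basis)
constrained = project(candidate, mean, basis)
```

In a training loop, a penalty of this kind would presumably be combined with a term that rewards the frozen teacher for assigning the target class to the generated sample; the weighting between the two terms is not specified on this page.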

Taxonomy

Topics: Neural Networks and Applications