Generative Diffusion Prior Distillation for Long-Context Knowledge Transfer

Nilushika Udayangani, Kishor Nandakishor, and Marimuthu Palaniswami

arXiv:2605.11414 · cs.LG · May 13, 2026

TL;DR

This paper introduces GDPD (Generative Diffusion Prior Distillation), a knowledge distillation framework that uses diffusion models to transfer full-sequence knowledge to classifiers that see only partial time-series prefixes.

Contribution

GDPD leverages diffusion-based generative priors to enhance partial sequence classification, addressing generalization gaps from training-data differences.

Findings

01. GDPD improves partial classifier performance across multiple datasets.

02. The method effectively transfers long-range context knowledge.

03. Experiments show GDPD outperforms existing distillation approaches.

Abstract

While traditional time-series classifiers assume full sequences at inference, practical constraints (latency and cost) often limit inputs to partial prefixes. The absence of class-discriminative patterns in partial data can significantly hinder a classifier's ability to generalize. This work uses knowledge distillation (KD) to equip partial time-series classifiers with the generalization ability of their full-sequence counterparts. In KD, a high-capacity teacher transfers supervision to aid student learning on the target task. Matching the teacher's features has shown promise in closing the generalization gap caused by limited parameter capacity. However, when the generalization gap arises from training-data differences (full versus partial), the teacher's full-context features can be an overwhelming target signal for the student's short-context features. To provide progressive, diverse, and…
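The feature-matching setup that the abstract takes as its starting point can be illustrated with a minimal sketch. This is not GDPD's diffusion-based prior, only the plain baseline in which a student that sees a prefix is pulled toward the features a teacher computes from the full sequence. Everything named here is an assumption for illustration: the mean-pooling "encoder", the helper names (`take_prefix`, `mean_pool_features`, `feature_matching_loss`, `distillation_objective`), the 25% prefix fraction, and the loss weight.

```python
import math
import random


def take_prefix(sequence, fraction):
    """Keep the first `fraction` of a sequence (at least one time step)."""
    keep = max(1, math.ceil(len(sequence) * fraction))
    return sequence[:keep]


def mean_pool_features(sequence):
    """Toy stand-in for an encoder: average each channel over time."""
    n_steps = len(sequence)
    n_channels = len(sequence[0])
    return [sum(step[c] for step in sequence) / n_steps for c in range(n_channels)]


def feature_matching_loss(student_feat, teacher_feat):
    """Mean squared error between student and teacher feature vectors."""
    squared = [(s - t) ** 2 for s, t in zip(student_feat, teacher_feat)]
    return sum(squared) / len(teacher_feat)


def cross_entropy(logits, label):
    """Softmax cross-entropy for one example, using the log-sum-exp trick."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def distillation_objective(task_loss, match_loss, weight=0.5):
    """Task loss plus a weighted feature-matching term (weight is hypothetical)."""
    return task_loss + weight * match_loss


random.seed(0)
# A synthetic 20-step, 3-channel series; the teacher sees all of it.
full_sequence = [[random.gauss(0, 1) for _ in range(3)] for _ in range(20)]
# The student only sees the first quarter of the series.
partial_sequence = take_prefix(full_sequence, 0.25)

teacher_feat = mean_pool_features(full_sequence)
student_feat = mean_pool_features(partial_sequence)

match = feature_matching_loss(student_feat, teacher_feat)
task = cross_entropy([0.2, 1.1, -0.3], label=1)  # made-up student logits
total = distillation_objective(task, match, weight=0.5)
```

In this toy setting the matching term is nonzero simply because the prefix and the full sequence have different statistics, which is the data-driven gap between teacher and student features that the abstract points to.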
