Dynamic Distillation and Gradient Consistency for Robust Long-Tailed Incremental Learning

Taigo Sakai; Kazuhiro Hotta

arXiv:2605.03364·cs.CV·May 6, 2026

TL;DR

This paper introduces a novel approach for long-tailed class incremental learning that combines gradient consistency regularization with adaptive distillation loss weighting, improving accuracy and robustness in imbalanced, sequential learning scenarios.

Contribution

The paper proposes a new method that stabilizes training and balances knowledge retention and acquisition in long-tailed incremental learning, outperforming existing techniques.

Findings

01. Achieves up to 5.0% accuracy improvement on benchmarks.

02. Demonstrates robustness in 'In-ordered' task setting.

03. Maintains low computational overhead.

Abstract

The task of Long-tailed Class Incremental Learning (LT-CIL) addresses the sequential learning of new classes from datasets with imbalanced class distributions. This scenario intensifies the fundamental problem of catastrophic forgetting, inherent to continual learning, with the dual challenges of under-learning minority classes and overfitting majority classes. To tackle these combined issues, this paper proposes two main techniques. First, we introduce gradient consistency regularization, which leverages the moving average of gradients to suppress abrupt fluctuations and stabilize the training process. Second, we dynamically adjust the weight of the distillation loss by measuring the degree of class imbalance with normalized entropy. This adaptive weighting establishes an optimal balance between retaining old knowledge and acquiring new information. Experiments on the CIFAR-100-LT,…
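The two techniques named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything below is an assumption made for illustration: the function names, the squared-distance form of the consistency penalty, the smoothing factor `beta`, and in particular the direction of the mapping from normalized entropy to distillation weight (here, more imbalance yields a larger weight; the paper may define it differently).

```python
import math


def normalized_entropy(class_counts):
    """Entropy of the empirical class distribution divided by log(K).

    Returns a value in [0, 1]: 1 for a perfectly balanced task,
    0 when all samples belong to a single class.
    """
    total = sum(class_counts)
    num_classes = len(class_counts)
    if num_classes < 2 or total == 0:
        return 1.0  # degenerate case: treat as balanced (assumption)
    probs = [c / total for c in class_counts if c > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return max(0.0, entropy / math.log(num_classes))


def distillation_weight(class_counts, base_weight=1.0):
    """Hypothetical mapping from imbalance to distillation-loss weight.

    Assumption: the weight grows linearly from base_weight (balanced)
    to 2 * base_weight (fully imbalanced). Not taken from the paper.
    """
    return base_weight * (2.0 - normalized_entropy(class_counts))


def ema_update(ema_grad, grad, beta=0.9):
    """Exponential moving average of the gradient, element by element."""
    return [beta * e + (1.0 - beta) * g for e, g in zip(ema_grad, grad)]


def consistency_penalty(ema_grad, grad):
    """Squared L2 distance between the current gradient and its moving
    average; one possible (assumed) form of a gradient consistency term."""
    return sum((g - e) ** 2 for e, g in zip(ema_grad, grad))


if __name__ == "__main__":
    balanced = [50, 50, 50, 50]
    long_tailed = [500, 50, 10, 5]
    print(distillation_weight(balanced))
    print(distillation_weight(long_tailed))

    ema = [0.0, 0.0]
    for grad in ([1.0, 2.0], [1.1, 1.9], [5.0, -3.0]):
        print(consistency_penalty(ema, grad))
        ema = ema_update(ema, grad)
```

In this sketch a sudden jump in the gradient, such as the third step in the loop, produces a large penalty, which is the kind of abrupt fluctuation the regularizer is meant to suppress.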
