Rethinking Continual Learning with Progressive Neural Collapse
Zheng Wang, Wanhao Yu, Li Yang, Sen Lin

TL;DR
This paper introduces Progressive Neural Collapse (ProNC), a novel continual learning framework that dynamically expands class prototypes to improve knowledge retention and separation without relying on a fixed global ETF, leading to superior performance.
Contribution
ProNC removes the need for a fixed global ETF in continual learning by progressively expanding class prototypes, enhancing flexibility and effectiveness.
Findings
ProNC significantly outperforms existing methods in continual learning tasks.
ProNC maintains high class separability with minimal shifts from previous prototypes.
The framework is simple, efficient, and adaptable to various CL algorithms.
Abstract
Continual Learning (CL) seeks to build an agent that can continuously learn a sequence of tasks, where a key challenge, namely Catastrophic Forgetting, persists due to the potential knowledge interference among different tasks. On the other hand, deep neural networks (DNNs) are shown to converge to a terminal state termed Neural Collapse during training, where all class prototypes geometrically form a static simplex equiangular tight frame (ETF). These maximally and equally separated class prototypes make the ETF an ideal target for model learning in CL to mitigate knowledge interference. Thus inspired, several studies have emerged very recently to leverage a fixed global ETF in CL, which however suffers from key drawbacks, such as impracticability and limited performance. To address these challenges and fully unlock the potential of ETF in CL, we propose Progressive Neural Collapse…
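To make the geometry in the abstract concrete, the following is a minimal sketch of the standard simplex ETF construction, M = sqrt(K/(K-1)) · U (I - (1/K) 1 1ᵀ), where U has orthonormal columns. It is not the paper's ProNC procedure. The function name `simplex_etf`, the random choice of U, and the requirement `dim >= num_classes` are assumptions made for illustration.

```python
import numpy as np


def simplex_etf(num_classes: int, dim: int, seed: int = 0) -> np.ndarray:
    """Return a (dim, num_classes) matrix whose columns form a simplex ETF.

    Illustrative helper (not from the paper). This particular construction
    needs dim >= num_classes so that U can have orthonormal columns.
    """
    if num_classes < 2:
        raise ValueError("need at least two classes")
    if dim < num_classes:
        raise ValueError("this construction assumes dim >= num_classes")

    rng = np.random.default_rng(seed)
    # Orthonormal basis U (dim x K) from a reduced QR of a random matrix.
    u, _ = np.linalg.qr(rng.standard_normal((dim, num_classes)))

    k = num_classes
    centering = np.eye(k) - np.ones((k, k)) / k  # removes the global mean
    return np.sqrt(k / (k - 1)) * u @ centering


if __name__ == "__main__":
    k, d = 10, 512
    prototypes = simplex_etf(k, d)
    gram = prototypes.T @ prototypes
    off_diag = gram[~np.eye(k, dtype=bool)]
    # Every prototype has unit norm; every pair has cosine -1/(K-1).
    print(np.allclose(np.diag(gram), 1.0))
    print(np.allclose(off_diag, -1.0 / (k - 1)))
```

Each column is a unit-norm class prototype, and every pair of prototypes has the same cosine similarity of -1/(K-1), which is the "maximally and equally separated" property the abstract refers to.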
Peer Reviews
Decision: ICLR 2026 Poster
1. The paper provides a clear exposition of the background NC theory and explains its own contributions in a well-structured manner.
2. The proposed method is well-motivated by the identified limitations of prior NC-based CL works, and the use of Theorem 1 introduces a moderately novel and theoretically grounded component.
3. Comprehensive experiments demonstrate consistent and noticeable gains over a range of baselines, supporting the empirical validity of the approach.
1. The technical novelty remains limited compared with the preliminary works (Yang et al., 2023a,b). The paper reads largely as a continuation of this prior line of research, where NC-based CL formulations have already been thoroughly explored.
2. The second main contribution, the ProNC-based CL framework, largely mirrors the loss formulation of NCT (Yang et al., 2023b). While Section 3.1 introduces a genuinely new idea, Section 3.2 appears nearly identical to the corresponding part in NCT.
3. Som
1. Grounded in Neural Collapse geometry, offering an interpretable view of feature alignment in continual learning. Achieves strong results without complex contrastive or generative modules.
2. Works as a plug-in regularizer across different CL frameworks (e.g., ER, iCaRL, DER++).
1. The method assumes clear task segmentation (task-aware setting); its applicability to task-free or online CL remains untested.
2. As the ETF expands over many tasks, orthogonality may gradually degrade; this possible effect is not analyzed experimentally.
3. Gram–Schmidt expansion could become unstable when the number of classes approaches the embedding dimension; only small-scale datasets and ResNet-18 (d ≤ 512) were tested.
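As background for the concern about the number of classes approaching the embedding dimension, the sketch below checks a general property of simplex ETFs: the Gram matrix of K prototypes has rank K-1, so an exact simplex ETF needs an embedding dimension of at least K-1. This illustrates only that geometric limit. It does not reproduce the paper's expansion step or test the numerical stability of Gram–Schmidt. The helper names are hypothetical.

```python
import numpy as np


def simplex_etf_gram(num_classes: int) -> np.ndarray:
    """Gram matrix of a K-class simplex ETF: 1 on the diagonal, -1/(K-1) elsewhere."""
    k = num_classes
    return (k / (k - 1)) * (np.eye(k) - np.ones((k, k)) / k)


def min_embedding_dim(num_classes: int) -> int:
    """Smallest dimension that can hold an exact K-class simplex ETF.

    Illustrative helper: equals the rank of the Gram matrix, i.e. K - 1.
    """
    return int(np.linalg.matrix_rank(simplex_etf_gram(num_classes)))


if __name__ == "__main__":
    for k in (10, 100, 200):
        print(k, min_embedding_dim(k))
```

Under this bound, a feature dimension d can hold an exact simplex ETF for at most d + 1 classes. How an expansion procedure behaves numerically near that limit is a separate question that this check does not address.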
The idea of progressively adapting the ETF target during continual learning without knowing the number of total classes in advance is novel and addresses the shortcomings of fixed ETF methods for CL. The paper is built on a convincing motivation. The reasoning is coherent and carefully developed, making the overall argument both logical and easy to follow.
### **Major Weaknesses**
1. **Questionable baseline performance values**: in Table 1, several baseline results (Co$^2$L, CILA, MNC$^3$L, STAR) are notably lower than those reported in their original papers (where they surpass the results from the proposed ProNC). This discrepancy indicates possible reproduction or configuration issues, undermining the fairness and credibility of the comparison and invalidating the paper’s main “state-of-the-art” claim.
2. **Missing baselines**: some important
Taxonomy
Topics: Neural Networks and Applications · Cognitive Science and Education Research
