Accurate Forgetting for Heterogeneous Federated Continual Learning
Abudukelimu Wuerkaixi, Sen Cui, Jingfeng Zhang, Kunda Yan, Bo Han, Gang Niu, Lei Fang, Changshui Zhang, Masashi Sugiyama

TL;DR
This paper introduces a novel federated continual learning method that leverages accurate forgetting to selectively utilize previous knowledge, addressing heterogeneity and antagonistic data across clients.
Contribution
It proposes the concept of accurate forgetting and a generative-replay method using normalizing flows to improve federated continual learning in heterogeneous settings.
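To make the role of the normalizing flow concrete, the following is a minimal sketch of how a flow assigns an exact log-density to a feature vector through the change-of-variables formula. It assumes a single diagonal affine layer with a standard normal base distribution, which is a deliberate simplification: practical flows stack many learned invertible layers, and this is not the architecture used in the paper. The names `affine_flow_log_density`, `shift`, and `log_scale` are hypothetical.

```python
import math
from typing import Sequence


def standard_normal_logpdf(z: float) -> float:
    """Log-density of the standard normal base distribution at z."""
    return -0.5 * (z * z + math.log(2.0 * math.pi))


def affine_flow_log_density(
    x: Sequence[float],
    shift: Sequence[float],
    log_scale: Sequence[float],
) -> float:
    """Log-density of x under a one-layer diagonal affine flow (illustrative only).

    Each dimension is mapped to the base space by z = (x - shift) / exp(log_scale).
    The change-of-variables formula adds log|dz/dx| = -log_scale per dimension.
    """
    if not (len(x) == len(shift) == len(log_scale)):
        raise ValueError("x, shift and log_scale must have the same length")
    total = 0.0
    for xi, bi, si in zip(x, shift, log_scale):
        z = (xi - bi) * math.exp(-si)
        total += standard_normal_logpdf(z) - si
    return total
```

With identity parameters (`shift` and `log_scale` all zero), a feature close to the origin receives a higher log-density than one far from it, which is the kind of score a credibility estimate can build on.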
Findings
The method outperforms existing baselines in the reported experiments.
Selective forgetting helps mitigate bias from spurious correlations.
Normalizing flow models effectively quantify knowledge credibility.
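The selective-forgetting idea can be illustrated with a small sketch in which replayed (generated) features are down-weighted when their credibility score is low. It assumes each replayed feature already comes with a log-density score and that weights are obtained by a softmax over those scores, rescaled to average 1. Both choices, along with the names `credibility_weights`, `weighted_replay_loss`, and the `temperature` parameter, are hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence


def credibility_weights(
    log_densities: Sequence[float], temperature: float = 1.0
) -> List[float]:
    """Turn log-density scores into per-sample weights that average to 1.

    Assumption: a softmax over the scores, so low-density (less credible)
    replayed features receive smaller weights.
    """
    if not log_densities:
        raise ValueError("log_densities must not be empty")
    if temperature <= 0.0:
        raise ValueError("temperature must be positive")
    top = max(log_densities)  # subtract the maximum for numerical stability
    exps = [math.exp((ld - top) / temperature) for ld in log_densities]
    total = sum(exps)
    n = len(exps)
    return [n * e / total for e in exps]


def weighted_replay_loss(
    per_sample_losses: Sequence[float], weights: Sequence[float]
) -> float:
    """Mean of per-sample replay losses after applying credibility weights."""
    if len(per_sample_losses) != len(weights) or not weights:
        raise ValueError("losses and weights must be non-empty and equal in length")
    weighted = sum(w * l for w, l in zip(weights, per_sample_losses))
    return weighted / len(weights)
```

For example, `credibility_weights([-1.0, -10.0])` gives the first feature a much larger weight than the second, so a loss computed with `weighted_replay_loss` is dominated by the more credible sample. Rescaling the weights to average 1 keeps the replay loss on the same scale as an unweighted mean.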
Abstract
Recent years have witnessed a burgeoning interest in federated learning (FL). However, the contexts in which clients engage in sequential learning remain under-explored. Bridging FL and continual learning (CL) gives rise to a challenging practical problem: federated continual learning (FCL). Existing research in FCL primarily focuses on mitigating the catastrophic forgetting issue of continual learning while collaborating with other clients. We argue that the forgetting phenomena are not invariably detrimental. In this paper, we consider a more practical and challenging FCL setting characterized by potentially unrelated or even antagonistic data/tasks across different clients. In the FL scenario, statistical heterogeneity and data noise among clients may exhibit spurious correlations which result in biased feature learning. While existing CL strategies focus on a complete utilization of…
Peer Reviews
Decision·ICLR 2024 poster
- The FCL definition used here allows for a broad set of continual learning problems in FL scenarios, including scenarios with unrelated or even contradictory tasks (termed Limitless Task Pool (LTP)).
- The problem is relevant, interesting, and well motivated (especially in Sec. 4).
- The paper is written clearly in general. The problem definition as well as the methodological contribution is clearly elaborated on.
- It is straightforward to follow through the paper.
- Experiments show the effectiveness
# Section 4
While the problem shown in Sec. 4 sufficiently supports parts of the motivation of the paper, it does not clearly support the claim that biases in the data have a severe impact on model performance in FCL. However, related work has already indicated that this is the case, making an explicit empirical validation somewhat unnecessary. Nevertheless, it would be great to have some references to fully cover the claims in this section as well.
# Section 5
- "Besides, learning in feature spa
1. Rather than preventing forgetting in continual learning settings, the authors introduce an interesting concept that forgetting is crucial even in these settings. They propose a method that accurately forgets heterogeneous or malign information by assigning lower weights to certain generated feature vectors.
2. They employ a normalizing flow model to retain previous knowledge through distribution in feature space.
3. The results, supported by ablation studies, highlight the effectiveness and s
1. In Section 4.2's experiment, noise interference is limited to the initial three tasks. This doesn't adequately assess the impact on each method when the noise occurs at random or during intermediate stages, which would be a more general scenario.
2. The problems in Section 4.2 appear to overlap with those discussed in studies on Concept Drifts in Federated Learning (FL). To emphasize the importance and influence of this research, it would be beneficial to distinguish it from Concept Drifts i
1. The methodology is solid, successfully excluding biased features from the memory bank.
2. The experiments are sufficient.
1. The problem formulation is confusing. In Section 3.1 and most parts of Section 3.2, the authors explain federated continual learning and a limitless task pool in detail, which is irrelevant to the methodology section. If I understand correctly, the main contribution of this work is disentangling and removing biased harmful features, with FCL merely serving as a relevant scenario. It would be better if the authors introduced the FCL formulation briefly and focused on the biased features in th
Taxonomy
Topics: Geophysical Methods and Applications · Domain Adaptation and Few-Shot Learning
Methods: Focus
