DeepAFL: Deep Analytic Federated Learning
Jianheng Tang, Yajiang Huang, Kejia Fan, Feijiang Han, Jiaxu Li, Jinfeng Xu, Run He, Anfeng Liu, Houbing Herbert Song, Huiping Zhuang, Yunhuai Liu

TL;DR
DeepAFL introduces a novel federated learning method that combines analytic solutions with deep residual structures, enhancing heterogeneity invariance and representation learning, and outperforming existing methods.
Contribution
This paper presents DeepAFL, a deep analytic federated learning approach that integrates residual blocks with analytical solutions, enabling deep representation learning while maintaining heterogeneity invariance.
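To make "analytical solutions" and "heterogeneity invariance" concrete, here is a minimal sketch of the general idea behind analytic federated learning. It assumes a ridge-regression head on fixed features, with each client contributing only its Gram matrix and feature-label correlation. The helper names (`client_stats`, `solve_ridge`), the regularizer `lam`, and the random features standing in for a frozen backbone are all illustrative assumptions, not the paper's code or its exact aggregation protocol.

```python
import numpy as np

def client_stats(X, Y):
    """Local sufficient statistics: Gram matrix X^T X and correlation X^T Y."""
    return X.T @ X, X.T @ Y

def solve_ridge(G, C, lam=1.0):
    """Closed-form ridge solution W = (G + lam * I)^{-1} C."""
    return np.linalg.solve(G + lam * np.eye(G.shape[0]), C)

rng = np.random.default_rng(0)
n, d, k = 600, 16, 4
X = rng.normal(size=(n, d))            # stand-in for frozen-backbone features (assumption)
labels = rng.integers(0, k, size=n)
Y = np.eye(k)[labels]                  # one-hot targets

# Centralized reference: solve once on all the data.
W_central = solve_ridge(*client_stats(X, Y))

# Strongly non-IID split: sort by label so each client holds mostly its own classes.
order = np.argsort(labels, kind="stable")
shards = np.array_split(order, 3)

# "Server": sum the per-client statistics, then solve once.
G = np.zeros((d, d))
C = np.zeros((d, k))
for idx in shards:
    G_i, C_i = client_stats(X[idx], Y[idx])
    G += G_i
    C += C_i
W_fed = solve_ridge(G, C)

# Largest elementwise difference between the federated and centralized weights.
max_gap = float(np.max(np.abs(W_fed - W_central)))
print(max_gap)
```

Because the statistics are plain sums over samples, the result does not depend on how the samples are split across clients, up to floating-point rounding. That is the sense in which a closed-form solution of this kind is invariant to data heterogeneity.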
Findings
Outperforms state-of-the-art baselines by up to 8.42% on benchmark datasets.
Demonstrates superior heterogeneity invariance and representation learning capabilities.
Validates effectiveness through theoretical analysis and empirical evaluation.
Abstract
Federated Learning (FL) is a popular distributed learning paradigm to break down data silos. Traditional FL approaches largely rely on gradient-based updates and face significant issues with heterogeneity, scalability, convergence, and overhead. Recently, some analytic-learning-based work has attempted to handle these issues by eliminating gradient-based updates via analytical (i.e., closed-form) solutions. Despite achieving superior invariance to data heterogeneity, these approaches are fundamentally limited by their single-layer linear model with a frozen pre-trained backbone. As a result, they can only achieve suboptimal performance due to their lack of representation learning capabilities. In this paper, to enable representable analytic models while preserving the ideal invariance to data heterogeneity for FL, we propose our Deep Analytic Federated Learning approach, named…
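The abstract's contrast between a single-layer linear model and a deeper analytic model can be illustrated with a generic residual-fitting sketch. This is not DeepAFL's block design, which this page does not specify. It assumes random ReLU projections as a stand-in for each block's feature transform and fits every block in closed form to the residual left by the layers before it. All names (`ridge`, `fit_residual_stack`, `width`, `lam`) are hypothetical.

```python
import numpy as np

def ridge(H, R, lam):
    """Closed-form ridge fit of targets R on features H."""
    d = H.shape[1]
    return np.linalg.solve(H.T @ H + lam * np.eye(d), H.T @ R)

def fit_residual_stack(X, Y, n_blocks=3, width=64, lam=1.0, seed=0):
    """Fit a linear head, then closed-form blocks that each model the remaining residual."""
    rng = np.random.default_rng(seed)
    W0 = ridge(X, Y, lam)                      # single-layer analytic baseline
    pred = X @ W0
    errors = [float(np.sum((Y - pred) ** 2))]  # training squared error per stage
    blocks = []
    for _ in range(n_blocks):
        # Random projection + ReLU: an assumed feature transform, not the paper's.
        A = rng.normal(size=(X.shape[1], width)) / np.sqrt(X.shape[1])
        H = np.maximum(X @ A, 0.0)
        W = ridge(H, Y - pred, lam)            # fit the current residual, no gradients
        pred = pred + H @ W                    # skip connection: add the correction
        blocks.append((A, W))
        errors.append(float(np.sum((Y - pred) ** 2)))
    return W0, blocks, errors

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 8))
labels = (X[:, 0] * X[:, 1] > 0).astype(int)   # XOR-like target a linear head cannot fit
Y = np.eye(2)[labels]

W0, blocks, errors = fit_residual_stack(X, Y)
print(errors)
```

In this sketch the training squared error cannot increase from one stage to the next, because the zero correction is always a feasible ridge solution. Each block's fit also depends only on sums over samples (`H.T @ H` and `H.T @ R`), so in principle those sums could be accumulated per client if all clients share the same projection `A`; whether DeepAFL does this is not stated here.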
Peer Reviews
Decision · ICLR 2026 Poster
1. The idea of adding skip connections to the multi-layer version of analytic federated learning is great. 2. The design of the residual block is nice. 3. The framework addresses the representation learning issue of analytic federated learning well while keeping all of its advantages, such as closed-form solutions and invariance to data heterogeneity. 4. The algorithm is guaranteed to converge. 5. Extensive experiments are provided to demonstrate the potential of the algorithm.
The privacy argument seems weak. Once data encryption techniques are introduced, the computational cost will increase as well. Minor: In Lines 373-374, "and AFL (He et al., 2025). Furthermore, we include the analytic learning-based method AFL (He et al., 2025) as a baseline to further highlight our advantages", where "AFL (He et al., 2025)" is repeated.
1. The paper introduces the first federated learning framework that achieves deep representation learning without gradients while maintaining closed-form analytical updates. 2. The idea of stacking analytic layers in a residual manner is a meaningful and elegant extension of AFL. 3. The reported efficiency gains (around a 99% reduction in computation and a 50% reduction in communication compared to gradient-based methods) are impressive.
1. While theoretical invariance is well-supported, the practical scalability to very large models such as ViTs or LLMs has not been evaluated. 2. The related work section omits recent multimodal and representation-based FL methods like FedRep [1] and FedU² [2]. 3. No wall-clock runtime, GPU-hour, or FLOP comparisons are provided to substantiate the efficiency claims.
1. The paper is well-written and clearly structured, with equations and derivations easy to follow. The figures and notations are well explained. 2. Extending analytic learning to a deeper residual form and formulating closed-form residual updates is technically neat and easily understood. 3. This work provides sound theoretical analyses to support the heterogeneous invariance and representation learning capabilities. 4. Due to no backpropagation, this work achieves quite optimization cost re
### **1. The experimental setup is methodologically unfair.** DeepAFL aggregates global feature–label statistics and computes a closed-form model that the authors explicitly state is identical to centralized analytic learning, hence it is inherently immune to data heterogeneity. In contrast, gradient-based baselines (FedAvg, FedProx, MOON, etc.) naturally degrade under Non-IID settings. Evaluating all methods on strongly heterogeneous partitions, therefore, gives DeepAFL an artificial advantage
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Advanced Graph Neural Networks · Data Quality and Management
