Taming Preconditioner Drift: Unlocking the Potential of Second-Order Optimizers for Federated Learning on Non-IID Data

Junkang Liu; Fanhua Shang; Hongying Liu; Jin Liu; Weixin An; Yuanyuan Liu

arXiv:2602.19271 · cs.LG · February 24, 2026

Open Access

TL;DR

This paper introduces FedPAC, a framework that aligns and corrects preconditioners in federated second-order optimization, enabling stable and accurate training on non-IID data.

Contribution

FedPAC is the first method to explicitly address preconditioner drift in federated second-order optimization, improving stability and convergence on heterogeneous data.

Findings

01

Achieves up to 5.8% accuracy gain on CIFAR-100 with ViTs.

02

Provides convergence guarantees with linear speedup under partial participation.

03

Enhances stability and accuracy across vision and language tasks.

Abstract

Second-order optimizers can significantly accelerate large-scale training, yet their naive federated variants are often unstable or even diverge on non-IID data. We show that a key culprit is preconditioner drift: client-side second-order training induces heterogeneous curvature-defined geometries (i.e., preconditioner coordinate systems), and server-side model averaging combines updates computed under incompatible metrics, corrupting the global descent direction. To address this geometric mismatch, we propose FedPAC, a preconditioner alignment and correction framework for reliable federated second-order optimization. FedPAC explicitly decouples parameter aggregation from geometry synchronization by: (i) Alignment (i.e., aggregating local preconditioners into a global reference and warm-starting clients via the global preconditioner); and (ii)…
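As a rough illustration of the Alignment step described in the abstract, the sketch below averages per-client preconditioner states into a global reference and warm-starts a client from it. It is a toy under stated assumptions, not the FedPAC algorithm: the preconditioner is taken to be a diagonal RMSProp-style second-moment estimate, the aggregation is a plain weighted mean, and the names `aggregate_preconditioners` and `local_update` are hypothetical. The second component of the framework is not shown because the abstract is truncated at that point.

```python
from typing import List, Optional, Sequence, Tuple

Vector = List[float]


def aggregate_preconditioners(
    local_preconds: Sequence[Vector],
    weights: Optional[Sequence[float]] = None,
) -> Vector:
    """Weighted mean of client preconditioner diagonals.

    Assumption: a plain weighted average stands in for the paper's
    aggregation rule, which this page does not spell out.
    """
    n = len(local_preconds)
    if n == 0:
        raise ValueError("need at least one client preconditioner")
    if weights is None:
        weights = [1.0 / n] * n
    total = sum(weights)
    dim = len(local_preconds[0])
    return [
        sum(w * p[i] for w, p in zip(weights, local_preconds)) / total
        for i in range(dim)
    ]


def local_update(
    params: Vector,
    grad: Vector,
    precond: Vector,
    lr: float = 0.1,
    beta: float = 0.9,
    eps: float = 1e-8,
) -> Tuple[Vector, Vector]:
    """One RMSProp-style step starting from the given preconditioner state.

    Assumption: a diagonal second-moment estimate is used here only as a
    simple stand-in for a second-order preconditioner.
    """
    new_precond = [beta * v + (1.0 - beta) * g * g for v, g in zip(precond, grad)]
    new_params = [
        p - lr * g / (v ** 0.5 + eps)
        for p, g, v in zip(params, grad, new_precond)
    ]
    return new_params, new_precond


# Two clients whose local data produced very different curvature estimates.
client_preconds = [[1.0, 4.0], [3.0, 0.0]]

# Server side: build the global reference preconditioner.
global_precond = aggregate_preconditioners(client_preconds)  # [2.0, 2.0]

# Client side: the next round starts from the global preconditioner
# (warm start) rather than from the client's own drifted state.
params, precond = local_update([0.0, 0.0], [2.0, -2.0], global_precond)
```

With the warm start, both clients begin the round under the same metric, so their parameter updates are expressed in a shared geometry before the server averages them.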

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning