Mechanistic Analysis of Circuit Preservation in Federated Learning

Muhammad Haseeb; Salaar Masood; Muhammad Abdullah Sohail

arXiv:2512.23043 · cs.LG · December 30, 2025

Open Access

TL;DR

This paper uses mechanistic interpretability to analyze how non-IID data causes circuit degradation in federated learning, revealing that conflicting client updates lead to the collapse of class-specific sub-networks.

Contribution

It introduces a mechanistic interpretability approach to diagnose circuit collapse in federated learning under non-IID data conditions, providing the first concrete evidence of circuit divergence.

Findings

01 Non-IID data causes local circuits to diverge.

02 Circuit collapse correlates with performance degradation.

03 Mechanistic analysis offers new insights into FL failures.

Abstract

Federated Learning (FL) enables collaborative training of models on decentralized data, but its performance degrades significantly under Non-IID (non-independent and identically distributed) data conditions. While this accuracy loss is well-documented, the internal mechanistic causes remain a black box. This paper investigates the canonical FedAvg algorithm through the lens of Mechanistic Interpretability (MI) to diagnose this failure mode. We hypothesize that the aggregation of conflicting client updates leads to circuit collapse, the destructive interference of functional, sparse sub-networks responsible for specific class predictions. By training inherently interpretable, weight-sparse neural networks within an FL framework, we identify and track these circuits across clients and communication rounds. Using Intersection-over-Union (IoU) to quantify circuit preservation, we provide…
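The two ingredients named in the abstract, FedAvg aggregation and IoU between circuits, can be illustrated with a toy sketch. This is not the authors' implementation. It assumes, purely for illustration, that a "circuit" is the set of edges whose weight magnitude exceeds a small threshold, and that a layer is stored as a hypothetical dictionary mapping `(input_unit, output_unit)` to a weight. The names `circuit`, `iou`, `fedavg`, `client_a`, and `client_b` are made up for this example.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[int, int]        # hypothetical edge id: (input_unit, output_unit)
Weights = Dict[Edge, float]   # sparse layer: only stored edges carry weight


def circuit(weights: Weights, threshold: float = 1e-6) -> Set[Edge]:
    """Edges whose magnitude exceeds the threshold (assumed circuit definition)."""
    return {edge for edge, w in weights.items() if abs(w) > threshold}


def iou(a: Set[Edge], b: Set[Edge]) -> float:
    """Intersection-over-Union of two edge sets; 1.0 when both are empty."""
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


def fedavg(client_weights: List[Weights], num_samples: List[int]) -> Weights:
    """Sample-weighted average of client weights (the FedAvg aggregation step)."""
    total = sum(num_samples)
    edges = set().union(*(w.keys() for w in client_weights))
    return {
        e: sum(w.get(e, 0.0) * n for w, n in zip(client_weights, num_samples)) / total
        for e in edges
    }


# Two toy clients that disagree in sign on the one edge they share.
client_a: Weights = {(0, 0): 1.0, (0, 1): 0.5}
client_b: Weights = {(0, 1): -0.5, (1, 1): 2.0}

global_weights = fedavg([client_a, client_b], num_samples=[10, 10])

# Overlap between the two local circuits, and between each local
# circuit and the circuit that survives aggregation.
local_overlap = iou(circuit(client_a), circuit(client_b))
a_preserved = iou(circuit(client_a), circuit(global_weights))
b_preserved = iou(circuit(client_b), circuit(global_weights))
```

In this toy case the only edge both clients share, `(0, 1)`, averages to exactly zero and drops out of the global circuit, which gives a small picture of the destructive interference the abstract hypothesizes. The threshold, the edge-set definition of a circuit, and the equal sample counts are all assumptions of the sketch.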

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Privacy-Preserving Technologies in Data · Adversarial Robustness in Machine Learning