FAIR-Pruner: A Flexible Framework for Automatic Layer-Wise Pruning via Tolerance of Difference

Chenqing Lin; Mostafa Hussien; Chengyao Yu; Bingyi Jing; Ruixing Ming; Kim Khoa Nguyen; Mohamed Cheriet

arXiv:2508.02291 · cs.LG · May 21, 2026

TL;DR

FAIR-Pruner is a flexible, search-free framework for adaptive layer-wise structured pruning of neural networks. It sets non-uniform pruning depths across layers by measuring, within each layer, the overlap between a removal-oriented ranking and a protection-oriented ranking.

Contribution

The paper proposes Tolerance of Difference (ToD), an allocation rule for non-uniform layer-wise pruning that pairs a removal signal with a protection signal under a shared tolerance level. The method is supported by theoretical analysis and extensive empirical validation.

Findings

1. Achieves strong accuracy-compression trade-offs on multiple datasets and architectures.
2. Demonstrates architectural extensibility with routed-expert models.
3. Provides an open-source implementation for practical use.

Abstract

Structured pruning is a standard tool for compressing deep neural networks, but its practical performance depends on how sparsity is allocated across layers. We propose FAIR-Pruner, a search-free framework for adaptive layer-wise structured pruning. FAIR-Pruner uses two within-layer rankings: a removal-oriented signal that proposes candidate units and a protection-oriented signal that identifies task-sensitive units. Its core component, Tolerance of Difference (ToD), measures the overlap between the removal prefix and the protected tail, and uses a shared tolerance level to induce non-uniform pruning depths across layers. As a default vision instantiation, FAIR-Pruner combines a Wasserstein-based U-Score for class-conditional unit separability with a Taylor-based R-Score for task-level sensitivity; the same ToD allocation rule can also be paired with alternative removal signals.…
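The allocation idea in the abstract can be illustrated with a minimal Python sketch. It is one plausible reading, not the authors' implementation: the names `tod_overlap`, `pruning_depth`, and `allocate` are hypothetical, and the sketch assumes that the overlap at depth k is the fraction of the k most-removable units that also fall among the k most-protected units, and that a layer's depth is the largest k whose overlap stays within the shared tolerance. The paper's actual ToD definition and score computations (U-Score, R-Score) may differ.

```python
from typing import Dict, Sequence


def tod_overlap(removal: Sequence[float], protection: Sequence[float], k: int) -> float:
    """Overlap between the removal prefix and the protected tail at depth k.

    Assumptions (not from the paper): a lower removal score means the unit is
    proposed for removal earlier, and a higher protection score means the unit
    is more task-sensitive.
    """
    if k == 0:
        return 0.0
    n = len(removal)
    # The k units proposed for removal first.
    prefix = sorted(range(n), key=lambda i: removal[i])[:k]
    # The k units the protection signal most wants to keep.
    tail = sorted(range(n), key=lambda i: protection[i], reverse=True)[:k]
    return len(set(prefix) & set(tail)) / k


def pruning_depth(removal: Sequence[float], protection: Sequence[float], tolerance: float) -> int:
    """Largest depth k whose overlap does not exceed the tolerance (assumed rule)."""
    depth = 0
    for k in range(1, len(removal) + 1):
        if tod_overlap(removal, protection, k) <= tolerance:
            depth = k
    return depth


def allocate(layers: Dict[str, tuple], tolerance: float) -> Dict[str, int]:
    """Apply one shared tolerance to every layer; depths may differ per layer."""
    return {
        name: pruning_depth(removal, protection, tolerance)
        for name, (removal, protection) in layers.items()
    }


# Toy scores, invented purely for illustration.
layers = {
    # The two signals agree: removable units are also unprotected.
    "layer_a": ([0.1, 0.2, 0.3, 0.4], [0.1, 0.2, 0.3, 0.4]),
    # The two signals conflict: the most removable unit is the most protected.
    "layer_b": ([0.1, 0.2, 0.3, 0.4], [0.4, 0.3, 0.2, 0.1]),
    # A wider layer where the signals agree.
    "layer_c": ([0.1, 0.2, 0.3, 0.4, 0.5, 0.6], [1, 2, 3, 4, 5, 6]),
}

depths = allocate(layers, tolerance=0.25)
print(depths)  # {'layer_a': 2, 'layer_b': 0, 'layer_c': 3}
```

With a single tolerance of 0.25, the toy layers receive different depths: layers where the two rankings agree are pruned more deeply, while the layer where they conflict is left untouched. This is the sense in which a shared tolerance level can induce non-uniform pruning without a per-layer search.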
