EGGS-PTP: An Expander-Graph Guided Structured Post-training Pruning Method for Large Language Models

Omar Bazarbachi; Zijun Sun; Yanning Shen

arXiv:2508.09471·cs.LG·August 14, 2025

3 Reviews

TL;DR

EGGS-PTP is a novel structured post-training pruning method for large language models that uses expander graph theory to maintain information flow, resulting in efficient models with preserved accuracy.

Contribution

The paper introduces a graph-theory-based structured pruning approach that enhances model efficiency while maintaining performance in large language models.

Findings

01. Significant reduction in model size and computation

02. Outperforms existing pruning methods in accuracy

03. Effective preservation of model functionality

Abstract

As Large Language Models (LLMs) become more widely adopted and scale up in size, the computational and memory challenges involved in deploying these massive foundation models have grown increasingly severe. This underscores the urgent need to develop more efficient model variants. Faced with this challenge, the present work introduces EGGS-PTP: an Expander-Graph Guided Structured Post-training Pruning method. The proposed approach leverages graph theory to guide the design of N:M structured pruning, effectively reducing model size and computational demands. By incorporating concepts from expander graphs, EGGS-PTP ensures information flow within the pruned network, preserving essential model functionality. Extensive numerical experiments demonstrate that EGGS-PTP not only achieves significant acceleration and memory savings due to structured sparsity but also outperforms existing…
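To make the N:M structured sparsity mentioned in the abstract concrete, here is a minimal sketch, assuming the common convention that at most N weights stay nonzero in every group of M consecutive weights along a row (for example 2:4). It uses plain magnitude as the importance score purely for illustration; this is not the EGGS-PTP selection rule, which the abstract describes as guided by expander graphs. The names `nm_prune_row` and `nm_prune` are hypothetical.

```python
from typing import List


def nm_prune_row(row: List[float], n: int, m: int) -> List[float]:
    """Keep the n largest-magnitude weights in each group of m consecutive weights.

    Magnitude is an illustrative stand-in for an importance score (an assumption,
    not the criterion used by EGGS-PTP).
    """
    if not 0 < n <= m:
        raise ValueError("need 0 < n <= m")
    if len(row) % m != 0:
        raise ValueError("row length must be a multiple of m")

    pruned: List[float] = []
    for start in range(0, len(row), m):
        group = row[start:start + m]
        # Positions of the n largest |w| in this group; sorted() is stable,
        # so ties are broken in favor of the earlier position.
        keep = set(sorted(range(m), key=lambda i: -abs(group[i]))[:n])
        pruned.extend(w if i in keep else 0.0 for i, w in enumerate(group))
    return pruned


def nm_prune(weights: List[List[float]], n: int, m: int) -> List[List[float]]:
    """Apply N:M pruning independently to every row of a weight matrix."""
    return [nm_prune_row(row, n, m) for row in weights]


# Toy 2 x 8 weight matrix, pruned to 2:4 sparsity.
weights = [
    [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.4, 0.01],
    [0.0, 0.5, -0.6, 0.1, -0.8, 0.05, 0.02, 0.3],
]
pruned = nm_prune(weights, n=2, m=4)
for row in pruned:
    print(row)
```

Because every group of M weights holds the same number of nonzeros, the pattern is regular enough for hardware that supports it to skip the zeros, which is where the acceleration and memory savings of structured sparsity come from. A method such as EGGS-PTP differs in how it chooses which weights survive inside each group, not in the N:M constraint itself.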

Peer Reviews

Decision·Submitted to ICLR 2026

Reviewer 01 · Rating 6 · Confidence 3

Strengths

1. EGGS-PTP introduces expander-graph theory into post-training pruning, presenting the first framework that applies expander graph concepts to large language model pruning. It innovatively leverages graph-theoretic properties such as connectivity and expansion to maintain robust information flow in pruned models.
2. It combines importance-aware and connectivity-aware pruning to balance compression efficiency and model accuracy.
3. The method enforces N:M structured sparsity compatible with GPU

Weaknesses

1. EGGS-PTP mainly integrates expander-graph theory with existing pruning frameworks rather than introducing a fundamentally new learning mechanism, relying on heuristic rules instead of adaptive structures.
2. The method incurs higher pruning overhead than baselines like RIA, and its scalability beyond 34B-parameter models remains untested.
3. It depends on manual tuning of the hyperparameter (B), limiting automation and generalization across different architectures.

Reviewer 02 · Rating 4 · Confidence 4

Strengths

The additional diagonal selection leads to improvements over the RIA metric. Results are positive across perplexity and zero-shot task results, across several models.

Weaknesses

Results are quite close to RIA; confidence intervals would help strengthen the claims of improved performance. The theory seems a bit disjointed. Why does it matter that we produce a two-sided expander? It is unclear what contribution this theory adds aside from some inspiration for the method. It would be nice to see either an improved explanation of why the expander graph theory is useful, or some further connections to claims made in the paper. For instance, if this framework improves infor

Reviewer 03 · Rating 4 · Confidence 2

Strengths

- The paper introduces an interesting perspective by connecting structured sparsity with expander graph theory, aiming to preserve information flow in pruned LLMs.
- The authors evaluate on multiple LLMs and datasets, providing a broad empirical view.
- The paper provides sufficient implementation details for reproduction.

Weaknesses

- Overstated theoretical claims.
- Insufficient ablation on graph hyperparameters.
