AgenticPruner: MAC-Constrained Neural Network Compression via LLM-Driven Strategy Search
Shahrzad Esmat, Mahdi Banisharif, Ali Jannesari

TL;DR
AgenticPruner leverages large language models to perform MAC-constrained neural network pruning, achieving predictable inference latency and maintaining accuracy across diverse architectures by learning optimal strategies through iterative, context-aware analysis.
Contribution
This work introduces an LLM-driven framework for MAC-constrained pruning that automatically adapts its pruning strategy and keeps the pruned model within a specified computational budget, improving convergence and deployment reliability.
Findings
Achieves MAC-targeted pruning with maintained or improved accuracy on ImageNet-1K.
Demonstrates GPU and CPU speedups with parameter reduction on ConvNeXt-Small.
Ensures MAC-budget compliance within strict tolerance bands for Vision Transformers.
Abstract
Neural network pruning remains essential for deploying deep learning models on resource-constrained devices, yet existing approaches primarily target parameter reduction without directly controlling computational cost. This yields unpredictable inference latency in deployment scenarios where strict Multiply-Accumulate (MAC) operation budgets must be met. We propose AgenticPruner, a framework utilizing large language models to achieve MAC-constrained optimization through iterative strategy learning. Our approach coordinates three specialized agents: a Profiling Agent that analyzes model architecture and MAC distributions, a Master Agent that orchestrates the workflow with divergence monitoring, and an Analysis Agent powered by Claude 3.5 Sonnet that learns optimal strategies from historical attempts. Through in-context learning, the Analysis Agent improves convergence success rate from…
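To make the idea of a MAC budget concrete, here is a minimal sketch in Python of a feedback loop that searches for a pruning ratio whose MAC count lands within a tolerance band around a target. It is not the authors' method: the strategy proposer is a plain bisection standing in for the LLM-powered Analysis Agent, pruning is uniform across layers, and the names `ConvLayer`, `conv_macs`, `pruned_macs`, and `search_ratio`, the toy layer shapes, and the 2% tolerance are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ConvLayer:
    """Hypothetical description of one square-kernel convolution layer."""
    in_ch: int
    out_ch: int
    kernel: int
    out_hw: int  # output spatial size (height == width)


def conv_macs(layer):
    """MACs of a standard convolution: k*k*C_in*C_out*H_out*W_out."""
    return (layer.kernel ** 2) * layer.in_ch * layer.out_ch * (layer.out_hw ** 2)


def pruned_macs(layers, ratio):
    """MACs after removing a fraction `ratio` of output channels from every
    layer except the last. Assumes the layers are chained, so pruning one
    layer's outputs also shrinks the next layer's inputs."""
    total = 0
    in_ch = layers[0].in_ch
    for i, layer in enumerate(layers):
        is_last = i == len(layers) - 1
        out_ch = layer.out_ch if is_last else max(1, round(layer.out_ch * (1 - ratio)))
        total += conv_macs(ConvLayer(in_ch, out_ch, layer.kernel, layer.out_hw))
        in_ch = out_ch
    return total


def search_ratio(layers, target_macs, tol=0.02, max_iters=30):
    """Bisection over the pruning ratio (a stand-in for an LLM proposing
    strategies). Returns (ratio, macs, history) on success, or
    (None, None, history) if no attempt lands within the tolerance band."""
    lo, hi = 0.0, 0.95
    history = []  # record of (ratio, macs) attempts
    for _ in range(max_iters):
        ratio = (lo + hi) / 2
        macs = pruned_macs(layers, ratio)
        history.append((ratio, macs))
        if abs(macs - target_macs) <= tol * target_macs:
            return ratio, macs, history
        if macs > target_macs:
            lo = ratio  # still over budget: prune more
        else:
            hi = ratio  # under budget: prune less
    return None, None, history


# Toy three-layer network (shapes are illustrative only).
layers = [
    ConvLayer(3, 64, 3, 56),
    ConvLayer(64, 128, 3, 28),
    ConvLayer(128, 256, 3, 14),
]
baseline = pruned_macs(layers, 0.0)
ratio, macs, history = search_ratio(layers, target_macs=baseline // 2)
print(f"baseline={baseline} pruned={macs} ratio={ratio} attempts={len(history)}")
```

The `history` list plays the role of the record of past attempts that a learned proposer could condition on. A real system would also have to evaluate accuracy after each attempt, which this sketch omits.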
Taxonomy
Topics: Advanced Neural Network Applications · IoT and Edge/Fog Computing · Software-Defined Networks and 5G
