AERO: Entropy-Guided Framework for Private LLM Inference

Nandan Kumar Jha; Brandon Reagen

arXiv:2410.13060 · cs.LG · November 3, 2025

TL;DR

AERO is a framework that removes nonlinear operations from transformer-based language models to make privacy-preserving inference more efficient, regulating attention-head entropy to preserve training stability and head diversity.

Contribution

It introduces an entropy-guided approach with adaptive regularization to strategically eliminate nonlinearities without performance loss.

Findings

01 Achieves a 3.4× reduction in communication overhead

02 Reduces latency by 1.4×

03 Maintains model performance during nonlinearity elimination

Abstract

Privacy-preserving computation enables language model inference directly on encrypted data, yet it suffers from prohibitive latency and communication overheads, primarily due to nonlinear functions. Removing nonlinearities, however, can trigger one of two failure modes that limit how far removal can go: entropy collapse in deeper layers, which destabilizes training, and entropic overload in early layers, which causes under-utilization of attention heads. To address these challenges, we introduce AERO, an entropy-guided framework that strategically eliminates costly nonlinear operations from transformer architectures. It employs an adaptive recalibration through a head-wise entropy regularizer with learnable per-head strengths, enabling each head to adjust its entropy level while penalizing extreme entropies and fostering functional diversity through a tolerance margin.…
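
To make the idea of a head-wise entropy regularizer with a tolerance margin more concrete, here is a minimal NumPy sketch. It is not the authors' formulation: the function names `head_entropies` and `entropy_penalty`, the squared-hinge penalty form, and the `target` parameter are all assumptions chosen for illustration, and the per-head `strengths` are plain arrays here, whereas the abstract describes them as learnable.

```python
import numpy as np

def head_entropies(attn, eps=1e-12):
    """Mean Shannon entropy (in nats) of each head's attention rows.

    attn: array of shape (heads, queries, keys); each row sums to 1.
    """
    row_entropy = -np.sum(attn * np.log(attn + eps), axis=-1)  # (heads, queries)
    return row_entropy.mean(axis=-1)                           # (heads,)

def entropy_penalty(attn, strengths, target, margin):
    """Hypothetical penalty: per-head squared hinge on |entropy - target|.

    Deviations inside the tolerance margin cost nothing; only entropies
    that are too low or too high relative to the target are penalized.
    """
    ent = head_entropies(attn)
    excess = np.maximum(np.abs(ent - target) - margin, 0.0)
    return float(np.sum(strengths * excess ** 2))

# Two toy heads: one uniform (maximum entropy), one one-hot (near-zero entropy).
queries, keys = 3, 4
uniform = np.full((queries, keys), 1.0 / keys)
peaked = np.tile(np.eye(keys)[0], (queries, 1))
attn = np.stack([uniform, peaked])

ent = head_entropies(attn)
strengths = np.ones(2)            # assumed fixed here; learnable in training
target = np.log(keys) / 2.0       # illustrative midpoint target
loose = entropy_penalty(attn, strengths, target, margin=1.0)
tight = entropy_penalty(attn, strengths, target, margin=0.1)
```

In this toy setup both heads sit equally far from the midpoint target, so a wide margin leaves them unpenalized while a narrow one penalizes both the overloaded (uniform) head and the collapsed (one-hot) head.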

Taxonomy

Topics: Cryptography and Data Security · Advanced Data Storage Technologies · Security and Verification in Computing

Methods: Entropy Regularization