Entropy-Guided Attention for Private LLMs

Nandan Kumar Jha; Brandon Reagen

arXiv:2501.03489·cs.LG·January 10, 2025

Open Access · 1 Repo

TL;DR

This paper introduces an information-theoretic framework using entropy to optimize transformer architectures for private inference in language models, addressing communication and latency challenges.

Contribution

It proposes an entropy-guided attention mechanism and regularization techniques to improve privacy-preserving language models by controlling entropy dynamics.
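To make "controlling entropy dynamics" concrete, here is a minimal sketch of one way an entropy regularizer could look. It is an illustration under stated assumptions, not the paper's formulation: the function name `entropy_penalty`, the `threshold_frac` and `tolerance` parameters, and the choice to penalize only heads whose entropy exceeds a threshold are all hypothetical.

```python
import math

def entropy_penalty(head_entropies, max_entropy, threshold_frac=0.8, tolerance=0.0):
    """Hypothetical hinge-style penalty on per-head attention entropy.

    head_entropies: average Shannon entropy (in nats) of each attention head.
    max_entropy:    log(context_length), the entropy of uniform attention.
    threshold_frac: assumed fraction of max_entropy above which a head is penalized.
    tolerance:      assumed slack around the threshold that goes unpenalized.
    """
    threshold = threshold_frac * max_entropy
    # Only the amount by which a head exceeds the threshold contributes.
    excess = [max(0.0, h - threshold - tolerance) for h in head_entropies]
    # Mean squared excess over heads; zero when every head is under the threshold.
    return sum(e * e for e in excess) / len(excess)

# Example: context length 4, one near-uniform head and one focused head.
max_h = math.log(4)
penalty = entropy_penalty([max_h, 0.1], max_h, threshold_frac=0.5)
no_penalty = entropy_penalty([0.1, 0.2], max_h)
```

In a training loop, a term like this would typically be scaled by a small coefficient and added to the language-modeling loss; the coefficient and threshold would need tuning.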

Findings

01. Removing nonlinearities causes entropy collapse and entropic overload.

02. Entropy-guided attention improves model stability and efficiency.

03. Proposed methods enable more practical private inference with LLMs.

Abstract

The pervasiveness of proprietary language models has raised critical privacy concerns, necessitating advancements in private inference (PI), where computations are performed directly on encrypted data without revealing users' sensitive information. While PI offers a promising solution, its practical deployment is hindered by substantial communication and latency overheads, primarily stemming from nonlinear operations. To address this, we introduce an information-theoretic framework to characterize the role of nonlinearities in decoder-only language models, laying a principled foundation for optimizing transformer architectures tailored to the demands of PI. By leveraging Shannon's entropy as a quantitative measure, we uncover the previously unexplored dual significance of nonlinearities: beyond ensuring training stability, they are crucial for maintaining attention head diversity.…
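The abstract's use of Shannon entropy as a measure of attention behavior can be illustrated with a small sketch. The helper names (`softmax`, `shannon_entropy`, `attention_entropy`) and the toy score matrices are assumptions made for this example and are not taken from the paper or its repository. Each row of attention scores is turned into a probability distribution, and its entropy H = -Σ p log p is averaged over query positions.

```python
import math

def softmax(scores):
    """Numerically stable softmax over one row of attention scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def shannon_entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def attention_entropy(score_rows):
    """Mean entropy over query positions for a single attention head."""
    entropies = [shannon_entropy(softmax(row)) for row in score_rows]
    return sum(entropies) / len(entropies)

# Toy heads over a context of 4 tokens (2 query positions each).
uniform_head = [[0.0, 0.0, 0.0, 0.0]] * 2   # attends equally to every token
peaked_head = [[50.0, 0.0, 0.0, 0.0]] * 2   # attends almost entirely to one token

h_uniform = attention_entropy(uniform_head)  # equals log(4), the maximum
h_peaked = attention_entropy(peaked_head)    # close to 0
```

The two extremes bracket the range of the measure: entropy near log(n) means a head spreads attention almost uniformly, while entropy near zero means it has collapsed onto a single token. Tracking this value per head and per layer gives one way to compare heads for diversity.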

Code & Models

Repositories

nandan91/entropy-guided-attention-llm (PyTorch, official)

Taxonomy

Topics: Digital Rights Management and Security

Methods: Softmax · Attention Is All You Need · Layer Normalization · Entropy Regularization