Entropy-UID: A Method for Optimizing Information Density

Xinpeng Shou

arXiv:2502.14366·cs.CL·February 21, 2025

Open Access

TL;DR

Entropy-UID introduces an adaptive token selection method that balances entropy and Uniform Information Density principles, improving the efficiency and naturalness of language generation models.

Contribution

The paper proposes Entropy-UID, a novel token selection approach that jointly minimizes entropy and surprisal for better information distribution in text generation.

Findings

01 Achieves lower surprisal and entropy variance than standard GPT-2 and alternative heuristics.

02 Produces more balanced and human-like generated text.

03 Evaluated on multiple benchmark datasets (WikiText-2, OpenWebText, and WMT) with consistent improvements.

Abstract

Balanced and efficient information flow is essential for optimizing language generation models. In this work, we propose Entropy-UID, a new token selection method that balances entropy and Uniform Information Density (UID) principles for enhanced efficiency of text generation. Our approach adaptively adjusts token selection by jointly minimizing entropy and surprisal, promoting more even information distribution across generated sequences. Theoretical validation demonstrates that Entropy-UID optimally reduces information spikes while maintaining fluency and coherence. The method has been evaluated using information-theoretic metrics on multiple benchmark datasets, including WikiText-2, OpenWebText, and WMT. Experimental results show that Entropy-UID achieves lower surprisal and entropy variance compared to standard GPT-2 and alternative heuristics, leading to more balanced and human-like…
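
The information-theoretic quantities named in the abstract can be made concrete with a small sketch. It assumes the standard definitions of surprisal (negative log-probability of the emitted token) and Shannon entropy (of the next-token distribution), both in bits, and reports their variance across a sequence. The function names and the toy distributions are illustrative and not taken from the paper. The sketch covers only the metrics, not the Entropy-UID selection rule itself.

```python
import math
from statistics import pvariance


def surprisal(p: float) -> float:
    """Surprisal in bits of an outcome with probability p (assumes p > 0)."""
    return -math.log2(p)


def entropy(dist: list[float]) -> float:
    """Shannon entropy in bits of a probability distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


def uid_metrics(step_dists: list[list[float]], chosen: list[int]) -> tuple[float, float]:
    """Return (surprisal variance, entropy variance) over a generated sequence.

    step_dists: one next-token distribution per generation step (toy data here).
    chosen: index of the token actually emitted at each step.
    """
    surprisals = [surprisal(d[i]) for d, i in zip(step_dists, chosen)]
    entropies = [entropy(d) for d in step_dists]
    return pvariance(surprisals), pvariance(entropies)


# Hypothetical two-step sequence over a three-token vocabulary.
toy_dists = [[0.5, 0.25, 0.25], [0.5, 0.25, 0.25]]
even_choice = uid_metrics(toy_dists, [0, 0])    # both tokens equally surprising
uneven_choice = uid_metrics(toy_dists, [0, 1])  # second token is more surprising
```

Under the UID view, a lower surprisal variance indicates that information is spread more evenly across tokens, which is the sense in which the abstract speaks of reducing information spikes.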

Taxonomy

Topics: Neural Networks and Applications

Methods: Attention Is All You Need · Cosine Annealing · Linear Layer · Dense Connections · Attention Dropout · Discriminative Fine-Tuning · Multi-Head Attention · Adam · Softmax