Quantifying Semantic Emergence in Language Models

Hang Chen; Xinyu Yang; Jiaying Zhu; Wenya Wang

arXiv:2405.12617·cs.CL·December 19, 2024

TL;DR

This paper introduces a new metric called Information Emergence (IE) to quantify how well large language models extract meaningful semantics from input tokens, providing insights into their semantic understanding capabilities.

Contribution

The paper proposes a novel, task- and architecture-agnostic metric, Information Emergence, for measuring semantic extraction in language models, along with a lightweight estimator for mutual information.

Findings

01 IE reveals patterns consistent with linguistic knowledge

02 Some IE patterns are unexpected, offering new insights

03 Experiments validate IE's informativeness in different contexts

Abstract

Large language models (LLMs) are widely recognized for their exceptional capacity to capture semantic meaning. Yet, there remains no established metric to quantify this capability. In this work, we introduce a quantitative metric, Information Emergence (IE), designed to measure LLMs' ability to extract semantics from input tokens. We formalize "semantics" as the meaningful information abstracted from a sequence of tokens and quantify this by comparing the entropy reduction observed for a sequence of tokens (macro-level) and individual tokens (micro-level). To achieve this, we design a lightweight estimator to compute the mutual information at each transformer layer, which is agnostic to different tasks and language model architectures. We apply IE in both synthetic in-context learning (ICL) scenarios and natural sentence contexts. Experiments demonstrate informativeness and patterns…
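
The macro-versus-micro comparison described in the abstract can be illustrated with a toy calculation. The sketch below is not the paper's estimator: it computes exact mutual information from small discrete joint-probability tables, whereas the paper estimates it on transformer-layer representations. The function names `mutual_information` and `emergence_gap`, the toy tables, and the choice to subtract the average micro-level value from the macro-level value are all assumptions made for illustration.

```python
import math


def mutual_information(joint):
    """Mutual information I(X; Y) in bits from a discrete joint table.

    `joint` is a list of rows; entry [i][j] is proportional to P(X=i, Y=j).
    """
    total = sum(sum(row) for row in joint)
    p = [[v / total for v in row] for row in joint]
    px = [sum(row) for row in p]            # marginal over rows
    py = [sum(col) for col in zip(*p)]      # marginal over columns
    mi = 0.0
    for i, row in enumerate(p):
        for j, v in enumerate(row):
            if v > 0:                       # 0 * log(0) is treated as 0
                mi += v * math.log2(v / (px[i] * py[j]))
    return mi


def emergence_gap(macro_joint, micro_joints):
    """Hypothetical helper: macro-level MI minus the mean micro-level MI.

    This mirrors the abstract's comparison only loosely; the exact
    definition of IE in the paper may differ.
    """
    macro = mutual_information(macro_joint)
    micro = sum(mutual_information(j) for j in micro_joints) / len(micro_joints)
    return macro - micro


# Toy tables (assumed, not from the paper): the sequence-level variable is
# perfectly predictable, while each token-level variable is independent.
macro = [[0.5, 0.0], [0.0, 0.5]]
micro = [[0.25, 0.25], [0.25, 0.25]]

gap = emergence_gap(macro, [micro, micro])
print(f"macro MI = {mutual_information(macro):.3f} bits, gap = {gap:.3f} bits")
```

In this toy setting a positive gap means the sequence as a whole reduces more uncertainty than its individual tokens do on average, which is the intuition the abstract attaches to "emergence".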

Code & Models

Repositories

zodiark-ch/emergence-of-llms
PyTorch · Official

Taxonomy

Topics: Language and cultural evolution · Topic Modeling · Natural Language Processing Techniques

Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Discriminative Fine-Tuning · Multi-Head Attention · Dense Connections · Cosine Annealing · Attention Dropout · Weight Decay