Am I Blue or Is My Hobby Counting Teardrops? Expression Leakage in Large Language Models as a Symptom of Irrelevancy Disruption

Berkay Köprü; Mehrzad Mashal; Yigit Gurses; Akos Kadar; Maximilian Schmitt; Ditty Mathew; Felix Burkhardt; Florian Eyben; Björn W. Schuller

arXiv:2508.01708 · cs.CL · August 5, 2025

TL;DR

This paper introduces 'expression leakage,' a new form of irrelevant output in large language models, analyzes its causes, and proposes a benchmark and methods to evaluate and mitigate it, showing that leakage varies with model scale and prompt sentiment.

Contribution

It defines and measures expression leakage, provides a benchmark dataset and evaluation pipeline, and shows how model scale and prompt sentiment influence leakage.

Findings

01. Expression leakage decreases as models scale within the same family.

02. Mitigating expression leakage requires specific training considerations, not just prompting.

03. Negative sentiment prompts increase expression leakage more than positive ones.

Abstract

Large language models (LLMs) have advanced natural language processing (NLP) through techniques such as next-token prediction and self-attention, but their ability to integrate broad context also makes them prone to incorporating irrelevant information. Prior work has focused on semantic leakage, a bias introduced by semantically irrelevant context. In this paper, we introduce expression leakage, a novel phenomenon in which LLMs systematically generate sentimentally charged expressions that are semantically unrelated to the input context. To analyse expression leakage, we collect a benchmark dataset and provide a scheme for automatically generating a dataset from free-form Common Crawl text. In addition, we propose an automatic evaluation pipeline that correlates well with human judgment, which accelerates benchmarking by removing the need to annotate each analysed model.…
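As a rough illustration of the idea only, the sketch below compares how many sentimentally charged words appear in a generation for a neutral prompt and in a generation for a sentiment-primed prompt. It is not the paper's evaluation pipeline, whose details are not given here: the lexicon `CHARGED_WORDS`, the functions `charged_rate` and `leakage_score`, and the example sentences are all hypothetical and chosen purely for demonstration.

```python
import re

# Hypothetical toy lexicon of sentimentally charged words (an assumption,
# not the resource used in the paper).
CHARGED_WORDS = {
    "sad", "tears", "lonely", "miserable", "terrible",
    "happy", "joyful", "wonderful",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def charged_rate(text, lexicon=CHARGED_WORDS):
    """Fraction of tokens in `text` that appear in the lexicon."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(token in lexicon for token in tokens) / len(tokens)


def leakage_score(baseline_generation, primed_generation):
    """Difference in charged-word rate between two generations.

    A positive value means the generation produced under the
    sentiment-primed prompt contains more charged words than the baseline.
    """
    return charged_rate(primed_generation) - charged_rate(baseline_generation)


if __name__ == "__main__":
    # Hypothetical model outputs for the same task with and without
    # an unrelated negative sentence in the prompt.
    baseline = "My hobby is collecting stamps from old letters"
    primed = "My hobby is counting tears on lonely nights"
    print(leakage_score(baseline, primed))
```

A word-list proxy like this ignores negation, context, and semantic relatedness to the prompt, so it should be read only as a way to make the definition concrete, not as a substitute for a validated evaluation.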

Taxonomy

Topics: Topic Modeling · Artificial Intelligence in Healthcare and Education · Explainable Artificial Intelligence (XAI)