Scaling Down Semantic Leakage: Investigating Associative Bias in Smaller Language Models

Veronika Smilga

arXiv:2501.06638·cs.CL·January 14, 2025

TL;DR

This study investigates whether smaller language models exhibit less semantic leakage than larger ones by systematically evaluating models from 500M to 7B parameters using a new color-focused dataset.

Contribution

The paper introduces a new dataset of color-focused prompts and an evaluation framework for comparing semantic leakage across smaller and larger language models, and it shows that leakage does not shrink uniformly with model size.

Findings

01

Smaller models generally show less semantic leakage.

02

Leakage does not decrease monotonically with model size.

03

Medium-sized models can sometimes leak more than larger models.

Abstract

Semantic leakage is a phenomenon recently introduced by Gonen et al. (2024). It refers to a situation in which associations learnt from the training data emerge in language model generations in an unexpected and sometimes undesired way. Prior work has focused on leakage in large language models (7B+ parameters). In this study, I use the Qwen2.5 model family to explore whether smaller models, ranging from 500M to 7B parameters, demonstrate less semantic leakage due to their limited capacity for capturing complex associations. Building on the previous dataset from Gonen et al. (2024), I introduce a new dataset of color-focused prompts, categorized into specific types of semantic associations, to systematically evaluate the models' performance. Results indicate that smaller models exhibit less semantic leakage overall, although this trend is not strictly linear, with medium-sized models…
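
The abstract does not spell out how leakage is scored, so the following is only a minimal sketch of one plausible setup, not the paper's implementation. It assumes each example pairs a control generation (prompt without the concept) with a test generation (prompt containing the concept, e.g. a color) and counts how often the test generation is closer to the concept. The function names `token_jaccard` and `leak_rate` are hypothetical, and the token-overlap similarity is a toy stand-in for whatever embedding-based similarity a real evaluation would use.

```python
from typing import Callable, Iterable, Tuple

Example = Tuple[str, str, str]  # (concept, control_generation, test_generation)


def token_jaccard(a: str, b: str) -> float:
    """Toy similarity (assumption): Jaccard overlap of lowercase tokens."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def leak_rate(
    examples: Iterable[Example],
    sim: Callable[[str, str], float] = token_jaccard,
) -> float:
    """Percentage of examples where the test generation is closer to the
    concept than the control generation; ties count as half.
    Under this scoring, 50 means no measurable leakage."""
    scores = []
    for concept, control_gen, test_gen in examples:
        s_control = sim(concept, control_gen)
        s_test = sim(concept, test_gen)
        if s_test > s_control:
            scores.append(1.0)
        elif s_test == s_control:
            scores.append(0.5)
        else:
            scores.append(0.0)
    if not scores:
        raise ValueError("leak_rate needs at least one example")
    return 100.0 * sum(scores) / len(scores)


# Invented illustrative data, not taken from the paper's dataset.
examples = [
    ("yellow", "apples", "yellow bananas"),  # test closer to concept
    ("blue", "pizza", "pizza"),              # tie
    ("red", "red roses", "tulips"),          # control closer to concept
]
rate = leak_rate(examples)
```

Running the same scoring over generations from each model size would give one number per model, which is the kind of comparison the non-monotonic trend above refers to.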

Code & Models

Repositories

smilni/semantic_leakage_project
Official