Can Small Language Models Learn, Unlearn, and Retain Noise Patterns?

Nicy Scaria; Silvester John Joseph Kennedy; Deepak Subramani

arXiv:2407.00996·cs.CL·May 28, 2025

TL;DR

This paper investigates how small language models with 1-3 billion parameters learn, unlearn, and retain various noise patterns, revealing differences based on model size, training data quality, and adaptation strategies.

Contribution

It provides the first comprehensive analysis of noise handling in small language models, highlighting factors influencing their robustness and offering practical training strategies.

Findings

01

Smaller models like Olmo adapt quickly to noise patterns.

02

High-quality pretraining data in Phi2 enhances noise resistance.

03

Training on clean data mitigates noise effects effectively.

Abstract

With the growing need for efficient language models in resource-constrained environments, Small Language Models (SLMs) have emerged as compact and practical alternatives to Large Language Models (LLMs). While studies have explored noise handling in LLMs, little is known about how SLMs handle noise, a critical factor for their reliable real-world deployment. This study investigates the ability of SLMs with parameters between 1 and 3 billion to learn, retain, and subsequently eliminate different types of noise (word flip, character flip, transliteration, irrelevant content, and contradictory information). Four pretrained SLMs (Olmo 1B, Qwen1.5 1.8B, Gemma1.1 2B, and Phi2 2.7B) were instruction-tuned on noise-free data and tested with in-context examples to assess noise learning. Subsequently, noise patterns were introduced in instruction tuning to assess their adaptability. The results…
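Two of the noise types named in the abstract, word flip and character flip, can be illustrated with a minimal sketch. The abstract does not define these transformations, so the readings below are assumptions: word flip is taken to mean reversing the order of words in a sentence, and character flip to mean reversing the characters inside each word. The function names, the record layout (`instruction` / `response`), and the choice to noise only the response are hypothetical and are not taken from the authors' repository.

```python
from typing import Callable, Dict


def word_flip(text: str) -> str:
    """Assumed 'word flip': reverse the order of whitespace-separated words."""
    return " ".join(reversed(text.split()))


def char_flip(text: str) -> str:
    """Assumed 'character flip': reverse characters in each word, keep word order."""
    return " ".join(word[::-1] for word in text.split())


def add_noise(example: Dict[str, str], noise_fn: Callable[[str], str]) -> Dict[str, str]:
    """Return a copy of an instruction-tuning record with a noised response.

    The record layout is hypothetical; the original record is left unchanged.
    """
    noisy = dict(example)
    noisy["response"] = noise_fn(example["response"])
    return noisy


sample = {
    "instruction": "Name the capital of France.",
    "response": "The capital is Paris.",
}

word_noised = add_noise(sample, word_flip)
char_noised = add_noise(sample, char_flip)

print(word_noised["response"])  # Paris. is capital The
print(char_noised["response"])  # ehT latipac si .siraP
```

Under these assumptions both transformations are their own inverse on single-spaced text, so applying the same function twice recovers the clean response, which makes it easy to build matched clean and noisy versions of a dataset.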

Code & Models

Repositories

quest-lab-iisc/Learn-Unlearn-Relearn-Noise-SLMs (PyTorch, official)

Taxonomy

Topics: Speech Recognition and Synthesis