LLM Ghostbusters: Surgical Hallucination Suppression via Adaptive Unlearning

Joseph Spracklen; Pedram Aghazadeh; Farinaz Koushanfar; Murtuza Jadliwala

arXiv:2605.01047 · cs.CR · May 5, 2026

TL;DR

This paper introduces Adaptive Unlearning, a post-deployment method for large language models that suppresses hallucinations such as recommendations of non-existent software packages, reducing supply-chain risk without degrading performance on standard coding benchmarks.

Contribution

The paper presents a hybrid token-level objective and an adaptive discovery loop for surgically unlearning hallucinations in deployed LLMs, with the aim of improving their security and reliability.

Findings

01. Reduces package hallucination rates by 81%

02. Maintains performance on standard coding benchmarks

03. Isolates hallucination suppression to the targeted distributions

Abstract

Hallucinations, outputs that sound plausible but are factually incorrect, remain an open challenge for deployed LLMs. In code generation, models frequently hallucinate non-existent software packages, recommending imports and installation commands for fictional libraries. This creates a critical supply-chain vulnerability: an attacker can proactively register such packages on public registries with malicious payloads that are subsequently installed and executed by developers or autonomous agents, a class of package confusion attack known as slopsquatting. Once a model is deployed, mitigating this failure mode is difficult: full retraining is costly, and existing approaches either cause severe degradation of model utility or rely on a pre-specified forget-set, an assumption that does not apply to the unbounded space of hallucinations. To address this problem, we present Adaptive…
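To make the failure mode concrete, the sketch below shows one simple way a package hallucination rate could be measured on generated text: extract the names passed to `pip install` and count those missing from a set of known packages. This is an illustration only, not the paper's evaluation pipeline. The `KNOWN_PACKAGES` set, the function names, and the sample text (including the made-up package `fastjsonx`) are all hypothetical; a real check would compare against a snapshot of the registry index.

```python
import re

# Hypothetical stand-in for a snapshot of a public registry index.
# A real measurement would load the full list of registered package names.
KNOWN_PACKAGES = {"requests", "numpy", "pandas"}

# Captures everything after "pip install" up to the end of the line.
PIP_INSTALL = re.compile(r"pip install\s+([^\n]+)")


def extract_pip_packages(generated_text):
    """Collect package names that appear in `pip install` commands."""
    names = []
    for match in PIP_INSTALL.finditer(generated_text):
        for token in match.group(1).split():
            if token.startswith("-"):
                continue  # skip flags such as --upgrade
            # Drop version specifiers and extras, e.g. "pkg==1.2" or "pkg[extra]".
            name = re.split(r"[=<>!~\[]", token, maxsplit=1)[0].lower()
            if name:
                names.append(name)
    return names


def hallucination_rate(generated_text, known=KNOWN_PACKAGES):
    """Fraction of recommended packages that are not in the known set."""
    names = extract_pip_packages(generated_text)
    if not names:
        return 0.0
    unknown = [n for n in names if n not in known]
    return len(unknown) / len(names)


# Hypothetical model output; "fastjsonx" is an invented name for illustration.
sample = "pip install requests fastjsonx==1.2\npip install --upgrade numpy"
print(extract_pip_packages(sample))
print(hallucination_rate(sample))
```

The sketch deliberately ignores flags that take an argument (such as `-r requirements.txt`) and does not inspect import statements, so it would need extending before being used on real model output. Any name it flags as unknown is exactly the kind of name an attacker could register in a slopsquatting attack.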
