Adaptive Activation Cancellation for Hallucination Mitigation in Large Language Models
Eric Yocam, Varghese Vaidyan, Gurcan Comert, Paris Kalathas, Yong Wang, Judith L. Mwakalonge

TL;DR
This paper introduces Adaptive Activation Cancellation (AAC), a real-time inference framework that suppresses hallucination-related activations in large language models, significantly improving factual accuracy without degrading language fluency or reasoning ability.
Contribution
AAC is a novel, inference-time method that identifies and suppresses hallucination-associated neurons in transformers without fine-tuning or external data, enhancing factual correctness.
Findings
Consistently improves accuracy on TruthfulQA and HaluEval across OPT-125M, Phi-3-mini, and LLaMA 3-8B.
Preserves perplexity and reasoning accuracy with zero degradation.
Yields gains in generation quality and neuron selectivity.
Abstract
Large Language Models frequently generate fluent but factually incorrect text. We propose Adaptive Activation Cancellation (AAC), a real-time inference-time framework that treats hallucination-associated neural activations as structured interference within the transformer residual stream, drawing an explicit analogy to classical adaptive noise cancellation from signal processing. The framework identifies Hallucination Nodes (H-Nodes) via layer-wise linear probing and suppresses them using a confidence-weighted forward hook during auto-regressive generation -- requiring no external knowledge, no fine-tuning, and no additional inference passes. Evaluated across OPT-125M, Phi-3-mini, and LLaMA 3-8B on TruthfulQA and HaluEval, the real-time hook is the only intervention that consistently improves downstream accuracy on all three scales. Critically, the method is strictly surgical:…
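The pipeline the abstract describes (probe a layer, pick H-Nodes, then damp them in proportion to the probe's confidence) can be illustrated with a minimal sketch in plain Python. Everything specific here is an assumption made for illustration, not the authors' implementation: the logistic probe, the top-k selection by absolute probe weight, the `alpha` strength parameter, and the function names `probe_confidence`, `select_h_nodes`, and `cancel` are all hypothetical. The summary does not state the paper's actual selection rule or suppression formula.

```python
import math


def probe_confidence(hidden, weights, bias=0.0):
    """Logistic linear probe (assumed form): estimated probability that
    this hidden state is hallucination-like."""
    z = sum(w * h for w, h in zip(weights, hidden)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def select_h_nodes(weights, k):
    """Assumed selection rule: the k dimensions with the largest
    absolute probe weight, returned in ascending index order."""
    ranked = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    return sorted(ranked[:k])


def cancel(hidden, h_nodes, confidence, alpha=1.0):
    """Scale only the selected dimensions by (1 - alpha * confidence);
    every other dimension is passed through unchanged."""
    scale = 1.0 - alpha * confidence
    out = list(hidden)
    for i in h_nodes:
        out[i] = hidden[i] * scale
    return out


# Toy example: a 4-dimensional "residual stream" vector and probe weights.
hidden = [1.0, 2.0, 3.0, 4.0]
weights = [0.0, 1.5, 0.1, -2.0]

h_nodes = select_h_nodes(weights, k=2)      # dimensions 1 and 3
conf = probe_confidence(hidden, weights)    # probe's hallucination estimate
suppressed = cancel(hidden, h_nodes, conf)  # dimensions 0 and 2 untouched
```

In a real transformer the `cancel` step would run inside a forward hook on the chosen layer (for example via PyTorch's `torch.nn.Module.register_forward_hook`), so that suppression happens during generation without an extra inference pass. Scaling only the selected dimensions is one simple way to keep the intervention localized.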
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Ferroelectric and Negative Capacitance Devices · Advanced Memory and Neural Computing
