Hallucination Detox: Sensitivity Dropout (SenD) for Large Language Model Training
Shahrad Mohammadzadeh, Juan David Guerra, Marco Bonizzato, Reihaneh Rabbany, Golnoosh Farnadi

TL;DR
This paper introduces Sensitivity Dropout (SenD), a new training method for large language models that reduces hallucination variance and improves factual accuracy without harming task performance.
Contribution
We propose SenD, a novel training protocol that deterministically drops high-variance embedding indices, and Efficient EigenScore (EES), a fast unsupervised hallucination detection metric. Together they aim to improve LLM reliability.
Findings
SenD reduces hallucination variance during training.
EES provides fast, unsupervised hallucination detection.
SenD improves factual accuracy by up to 17% in tested models.
Abstract
As large language models (LLMs) become increasingly prevalent, concerns about their reliability, particularly due to hallucinations - factually inaccurate or irrelevant outputs - have grown. Our research investigates the relationship between the uncertainty in training dynamics and the emergence of hallucinations. Using models from the Pythia suite and several hallucination detection metrics, we analyze hallucination trends and identify significant variance during training. To address this, we propose Sensitivity Dropout (SenD), a novel training protocol designed to reduce hallucination variance during training by deterministically dropping embedding indices with significant variability. In addition, we develop an unsupervised hallucination detection metric, Efficient EigenScore (EES), which approximates the traditional EigenScore at 2x speed. This metric is integrated into our training…
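The core idea of dropping embedding indices with significant variability can be illustrated with a minimal sketch. This is not the authors' implementation: the summary above does not state which layer's embeddings are tracked, how variability is measured, or what fraction is dropped. The sketch assumes per-index variance across a few recorded snapshots as the variability measure, and the names `sensitive_indices`, `drop_indices`, and `drop_fraction` are hypothetical.

```python
from statistics import pvariance


def sensitive_indices(snapshots, drop_fraction=0.2):
    """Return the embedding indices that vary most across snapshots.

    snapshots: list of equal-length embedding vectors (lists of floats),
    one per recorded training step. This input format is an assumption
    made for illustration, not something specified by the paper summary.
    """
    if not snapshots:
        raise ValueError("need at least one snapshot")
    dim = len(snapshots[0])
    # Variance of each embedding index across the recorded snapshots.
    variances = [pvariance([snap[i] for snap in snapshots]) for i in range(dim)]
    # Deterministic selection: keep the k indices with the largest variance.
    k = int(dim * drop_fraction)
    ranked = sorted(range(dim), key=lambda i: variances[i], reverse=True)
    return sorted(ranked[:k])


def drop_indices(embedding, indices):
    """Zero out the selected indices of one embedding vector."""
    dropped = set(indices)
    return [0.0 if i in dropped else v for i, v in enumerate(embedding)]


# Three snapshots of a 4-dimensional embedding; index 1 fluctuates the most.
snapshots = [
    [1.0, 0.0, 5.0, 2.0],
    [1.0, 3.0, 5.1, 2.0],
    [1.0, -3.0, 4.9, 2.0],
]
to_drop = sensitive_indices(snapshots, drop_fraction=0.25)
masked = drop_indices([0.5, 0.7, 0.9, 1.1], to_drop)
```

Unlike standard dropout, which samples a random mask at each step, the selection here is a deterministic function of the observed variability, which is the property the abstract emphasizes.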
Taxonomy
Topics: Brain Tumor Detection and Classification · Anomaly Detection Techniques and Applications · Seismology and Earthquake Studies
Methods: High-Order Consensuses · Dropout · Pythia
