Self-Knowledge Distillation for Learning Ambiguity
Hancheol Park, Soyeong Jeong, Sukmin Cho, Jong C. Park

TL;DR
This paper introduces a self-knowledge distillation technique that improves language models' handling of ambiguous samples by better calibrating confidence levels, leading to more accurate label distributions and enhanced performance on NLU tasks.
Contribution
The proposed method distills knowledge from the model's own lower layers and, in a further learning phase, re-calibrates over-strengthened confidence on training samples judged extremely ambiguous, advancing label distribution learning in NLU.
Findings
Improves label distribution quality on NLU benchmarks.
Reduces over-confidence in predictions for ambiguous samples.
More efficient training compared to existing methods.
Abstract
Recent language models have shown remarkable performance on natural language understanding (NLU) tasks. However, they are often sub-optimal when faced with ambiguous samples that can be interpreted in multiple ways, over-confidently predicting a single label without consideration for its correctness. To address this issue, we propose a novel self-knowledge distillation method that enables models to learn label distributions more accurately by leveraging knowledge distilled from their lower layers. This approach also includes a learning phase that re-calibrates the unnecessarily strengthened confidence for training samples judged as extremely ambiguous based on the distilled distribution knowledge. We validate our method on diverse NLU benchmark datasets and the experimental results demonstrate its effectiveness in producing better label distributions. Particularly, through the process…
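The abstract gives no loss formulation, so the following is only a minimal sketch of the general idea: a softened distribution from a lower layer acts as a teacher for the final layer, and samples whose distilled distribution is close to uniform are flagged as extremely ambiguous. All function names, the alpha weighting, the temperature, and the entropy threshold are assumptions made for illustration and are not taken from the paper.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(p):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(q * math.log(q) for q in p if q > 0.0)

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two distributions over the same labels."""
    return sum(pi * math.log(pi / max(qi, eps))
               for pi, qi in zip(p, q) if pi > 0.0)

def self_distillation_loss(final_logits, lower_logits, gold_index,
                           alpha=0.5, temperature=2.0):
    """Hypothetical loss: cross-entropy on the gold label plus a KL term
    pulling the final layer toward the lower layer's softened distribution.
    alpha and temperature are illustrative values, not the paper's."""
    student = softmax(final_logits)
    cross_entropy = -math.log(max(student[gold_index], 1e-12))
    teacher = softmax(lower_logits, temperature)
    student_soft = softmax(final_logits, temperature)
    distill = kl_divergence(teacher, student_soft)
    return (1.0 - alpha) * cross_entropy + alpha * distill

def is_extremely_ambiguous(lower_logits, threshold=0.9, temperature=2.0):
    """Assumed criterion: flag a sample when the normalized entropy of the
    distilled distribution reaches the threshold (1.0 means uniform)."""
    teacher = softmax(lower_logits, temperature)
    return entropy(teacher) / math.log(len(teacher)) >= threshold

def recalibration_loss(final_logits, lower_logits, temperature=2.0):
    """Assumed re-calibration step: for flagged samples, train only toward
    the distilled distribution so confidence on one label is not reinforced."""
    teacher = softmax(lower_logits, temperature)
    student_soft = softmax(final_logits, temperature)
    return kl_divergence(teacher, student_soft)

if __name__ == "__main__":
    lower = [0.2, 0.1, 0.0]   # lower layer: nearly undecided between labels
    final = [8.0, 0.0, 0.0]   # final layer: over-confident in label 0
    print(self_distillation_loss(final, lower, gold_index=0))
    if is_extremely_ambiguous(lower):
        print(recalibration_loss(final, lower))
```

In this sketch an over-confident final layer is penalized whenever the lower layer spreads its probability mass across labels, which is the behavior the abstract describes in words. How the paper actually selects the lower layer and defines "extremely ambiguous" is not stated in this summary.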
Taxonomy
Topics: Intelligent Tutoring Systems and Adaptive Learning · Online Learning and Analytics
