Multi-Level Knowledge Distillation for Out-of-Distribution Detection in Text
Qianhui Wu, Huiqiang Jiang, Haonan Yin, Börje F. Karlsson, Chin-Yew Lin

TL;DR
This paper introduces a multi-level knowledge distillation method that combines fine-tuned and pre-trained models to improve out-of-distribution detection in text, achieving state-of-the-art results across multiple benchmarks.
Contribution
It proposes a novel multi-level knowledge distillation approach that integrates prediction and intermediate layer distillation to enhance OoD detection capabilities.
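To make the two distillation levels concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes that prediction-layer distillation is a KL divergence between the teacher's and student's next-token distributions, and that the similarity-based intermediate-layer term compares pairwise cosine similarities between token hidden states. The function names, the mean-squared-error form, and the weight `alpha` are all hypothetical choices made for illustration.

```python
import math

def kl_divergence(p_teacher, q_student, eps=1e-12):
    """KL(teacher || student) for one next-token distribution."""
    return sum(p * math.log((p + eps) / (q + eps))
               for p, q in zip(p_teacher, q_student))

def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def similarity_matrix(hidden):
    """Pairwise cosine similarity between the token vectors of one layer."""
    return [[cosine(h_i, h_j) for h_j in hidden] for h_i in hidden]

def similarity_loss(teacher_hidden, student_hidden):
    """Mean squared difference between the two similarity matrices.

    Assumption: comparing token-to-token similarities rather than raw
    hidden states, so teacher and student may use different hidden sizes.
    """
    s_t = similarity_matrix(teacher_hidden)
    s_s = similarity_matrix(student_hidden)
    n = len(s_t)
    return sum((s_t[i][j] - s_s[i][j]) ** 2
               for i in range(n) for j in range(n)) / (n * n)

def multi_level_loss(p_teacher, q_student, teacher_hidden, student_hidden,
                     alpha=1.0):
    """Prediction-layer term plus a weighted intermediate-layer term.

    alpha is a hypothetical weighting factor, not a value from the paper.
    """
    return (kl_divergence(p_teacher, q_student)
            + alpha * similarity_loss(teacher_hidden, student_hidden))

# Toy inputs: a 3-word vocabulary and two tokens per sequence.
teacher_probs = [0.7, 0.2, 0.1]
student_probs = [0.5, 0.3, 0.2]
teacher_states = [[1.0, 0.0], [0.0, 1.0]]            # hidden size 2
student_states = [[2.0, 0.0, 0.0], [0.0, 3.0, 0.0]]  # hidden size 3

loss = multi_level_loss(teacher_probs, student_probs,
                        teacher_states, student_states)
```

In the toy inputs the student's token vectors have the same pairwise similarities as the teacher's, so the intermediate-layer term is zero and the loss comes entirely from the prediction-layer mismatch.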
Findings
Achieves new state-of-the-art performance on multiple benchmark datasets.
Effectively distinguishes between in-distribution and out-of-distribution texts.
Outperforms human evaluators in detecting ChatGPT-generated answers.
Abstract
Self-supervised representation learning has proved to be a valuable component for out-of-distribution (OoD) detection with only the texts of in-distribution (ID) examples. These approaches either train a language model from scratch or fine-tune a pre-trained language model using ID examples, and then take the perplexity output by the language model as OoD scores. In this paper, we analyze the complementary characteristics of both OoD detection methods and propose a multi-level knowledge distillation approach that integrates their strengths while mitigating their limitations. Specifically, we use a fine-tuned model as the teacher to teach a randomly initialized student model on the ID examples. Besides the prediction layer distillation, we present a similarity-based intermediate layer distillation method to thoroughly explore the representation space of the teacher model. In this way,…
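The perplexity-based scoring that the abstract describes can be sketched as follows. This is a minimal illustration, assuming a language model has already produced a natural-log probability for each token of a text. The helper names, the example probabilities, and the threshold are hypothetical and do not come from the paper.

```python
import math

def perplexity(token_log_probs):
    """Perplexity of one text from its per-token natural-log probabilities."""
    if not token_log_probs:
        raise ValueError("need at least one token")
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)

def is_ood(token_log_probs, threshold):
    """Flag a text as out-of-distribution when its perplexity exceeds
    a threshold (hypothetical decision rule; the threshold would be tuned
    on held-out data)."""
    return perplexity(token_log_probs) > threshold

# Made-up log-probabilities: the model finds the first text far more
# predictable than the second.
id_text = [math.log(0.5)] * 4    # each token predicted with p = 0.5
ood_text = [math.log(0.05)] * 4  # each token predicted with p = 0.05

id_score = perplexity(id_text)
ood_score = perplexity(ood_text)
```

A higher perplexity means the model found the text harder to predict, which is why it serves as an OoD score: texts unlike the ID training examples should receive higher scores.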
Taxonomy
Topics: Topic Modeling · Sentiment Analysis and Opinion Mining · Text and Document Classification Technologies
Methods: Knowledge Distillation
