How Out-of-Distribution Detection Learning Theory Enhances Transformer: Learnability and Reliability
Yijin Zhou, Yutang Ge, Wenyuan Xie, Linqian Zeng, Xiaowen Dong, Yuguang Wang

TL;DR
This paper develops a theoretical framework for Out-of-Distribution detection in transformers, demonstrating how auxiliary outliers and loss function strategies can improve model reliability and achieve state-of-the-art results.
Contribution
It introduces the OOD detection PAC theory for transformers, linking data conditions with model learnability and proposing a novel algorithm leveraging synthetic outliers.
Findings
Theoretical conditions for OOD detection learnability in transformers.
Enhanced OOD detection performance with auxiliary outliers.
State-of-the-art results across multiple data formats.
Abstract
Transformers excel in natural language processing and computer vision tasks. However, they still struggle to generalize to Out-of-Distribution (OOD) datasets, i.e., data whose distribution differs from that seen during training. OOD detection aims to distinguish outliers while preserving in-distribution (ID) data performance. This paper introduces the OOD detection Probably Approximately Correct (PAC) Theory for transformers, which establishes the conditions on the data distribution and model configuration under which OOD detection is learnable for transformers. It shows that, under these conditions, outliers can be accurately represented and distinguished given sufficient data. The theoretical implications highlight the trade-off between theoretical principles and practical training paradigms. By examining this trade-off, we naturally derived the rationale for leveraging auxiliary outliers to…
Taxonomy
Topics: Non-Destructive Testing Techniques · Image and Signal Denoising Methods · High voltage insulation and dielectric phenomena
