Embedding Trajectory for Out-of-Distribution Detection in Mathematical Reasoning
Yiming Wang, Pei Zhang, Baosong Yang, Derek F. Wong, Zhuosheng Zhang, Rui Wang

TL;DR
This paper introduces a trajectory-based out-of-distribution detection method for mathematical reasoning tasks in generative language models, addressing challenges posed by high-density output spaces and outperforming traditional algorithms.
Contribution
The paper proposes the TV score, a novel trajectory volatility measure, for effective OOD detection in complex mathematical reasoning scenarios.
Findings
TV score outperforms traditional OOD detection methods in mathematical reasoning tasks.
Method extends to applications with high-density output spaces like multiple-choice questions.
Demonstrates robustness of trajectory-based detection in complex generative scenarios.
Abstract
Real-world data deviating from the independent and identically distributed (i.i.d.) assumption of in-distribution training data poses security threats to deep networks, thus advancing out-of-distribution (OOD) detection algorithms. Detection methods in generative language models (GLMs) mainly focus on uncertainty estimation and embedding distance measurement, with the latter proven to be most effective in traditional linguistic tasks like summarization and translation. However, another complex generative scenario, mathematical reasoning, poses significant challenges to embedding-based methods because its output space is high-density; at the same time, this feature causes larger discrepancies in the embedding shift trajectory between different samples in latent spaces. Hence, we propose a trajectory-based method, TV score, which uses trajectory volatility for OOD detection in mathematical…
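The abstract is truncated before it defines the TV score, so the following is only a minimal sketch of what a trajectory-volatility score could look like, not the authors' formulation. It assumes that each sample yields one embedding per layer, that a Gaussian is fitted to in-distribution embeddings at every layer, and that volatility is the mean absolute change in Mahalanobis distance between adjacent layers. The function names `fit_layer_gaussians` and `trajectory_volatility`, the `ridge` parameter, and the array shapes are all hypothetical.

```python
import numpy as np


def fit_layer_gaussians(id_embeddings, ridge=1e-3):
    """Fit one Gaussian per layer to in-distribution embeddings.

    id_embeddings: array of shape (n_samples, n_layers, dim).
    Returns a list of (mean, precision) pairs, one per layer.
    The ridge term is an assumption added to keep the covariance invertible.
    """
    _, n_layers, dim = id_embeddings.shape
    stats = []
    for layer in range(n_layers):
        x = id_embeddings[:, layer, :]
        mean = x.mean(axis=0)
        cov = np.atleast_2d(np.cov(x, rowvar=False)) + ridge * np.eye(dim)
        stats.append((mean, np.linalg.inv(cov)))
    return stats


def trajectory_volatility(sample_embeddings, stats):
    """Assumed volatility score for one sample.

    sample_embeddings: array of shape (n_layers, dim).
    Computes the Mahalanobis distance to the in-distribution Gaussian at
    each layer, then averages the absolute change between adjacent layers.
    """
    dists = []
    for h, (mean, precision) in zip(sample_embeddings, stats):
        diff = h - mean
        dists.append(np.sqrt(diff @ precision @ diff))
    return float(np.mean(np.abs(np.diff(dists))))
```

Under these assumptions, a sample whose per-layer distance to the in-distribution statistics stays steady scores low, while one whose distance swings from layer to layer scores high. A detector would then threshold this score, with the threshold chosen on held-out data.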
Taxonomy
Topics: Bayesian Modeling and Causal Inference
Methods: Focus
