LLMs are Frequency Pattern Learners in Natural Language Inference
Liang Cheng, Zhaowei Wang, Mark Steedman

TL;DR
This paper investigates how LLMs fine-tuned on NLI data rely on frequency bias. The models exploit these patterns for inference and perform poorly on bias-adversarial cases, suggesting that the improvement from fine-tuning stems at least in part from learned frequency patterns.
Contribution
The study identifies the role of frequency bias in LLM inference on NLI, showing that models learn and exploit these patterns during fine-tuning, with consequences for their robustness and interpretability.
Findings
LLMs exploit frequency bias in NLI datasets.
Fine-tuned LLMs perform poorly on bias-adversarial cases.
Frequency bias correlates with textual entailment patterns.
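The comparison behind the second finding can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function name `accuracy_by_bias_group`, the toy frequency counts, and the stand-in `predict` function are all hypothetical. Labeling an example "consistent" when a frequency-only heuristic agrees with the gold label is also an assumption made here for illustration.

```python
def accuracy_by_bias_group(examples, freq, predict):
    """Report accuracy separately on bias-consistent and bias-adversarial examples.

    examples: iterable of (premise_predicate, hypothesis_predicate, gold_entailment)
    freq:     mapping from predicate to a corpus frequency count (assumed given)
    predict:  callable (premise_predicate, hypothesis_predicate) -> bool
    """
    groups = {"consistent": [], "adversarial": []}
    for premise_pred, hypothesis_pred, gold in examples:
        # Assumed heuristic: guess "entailment" when the hypothesis predicate is more frequent.
        heuristic = freq[hypothesis_pred] > freq[premise_pred]
        key = "consistent" if heuristic == gold else "adversarial"
        groups[key].append(predict(premise_pred, hypothesis_pred) == gold)
    # None marks an empty group, so it is not confused with 0% accuracy.
    return {k: (sum(v) / len(v) if v else None) for k, v in groups.items()}


# Invented toy counts and examples, for illustration only.
freq = {"buy": 900, "purchase": 120, "sell": 500, "kill": 700, "assassinate": 40}
examples = [
    ("purchase", "buy", True),       # hypothesis more frequent, entailed
    ("kill", "assassinate", False),  # hypothesis less frequent, not entailed
    ("buy", "purchase", True),       # hypothesis less frequent, yet entailed
    ("sell", "buy", False),          # hypothesis more frequent, yet not entailed
]

# Stand-in "model" that relies on frequency alone (hypothetical).
def frequency_only(premise_pred, hypothesis_pred):
    return freq[hypothesis_pred] > freq[premise_pred]

result = accuracy_by_bias_group(examples, freq, frequency_only)
print(result)
```

A model that relied on frequency alone would be perfect on the consistent group and fail on the adversarial group, so a large gap between the two accuracies is the signature to look for in a real model.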
Abstract
While fine-tuning LLMs on NLI corpora improves their inferential performance, the underlying mechanisms driving this improvement remain largely opaque. In this work, we conduct a series of experiments to investigate what LLMs actually learn during fine-tuning. We begin by analyzing predicate frequencies in premises and hypotheses across NLI datasets and identify a consistent frequency bias, where predicates in hypotheses occur more frequently than those in premises for positive instances. To assess the impact of this bias, we evaluate both standard and NLI fine-tuned LLMs on bias-consistent and bias-adversarial cases. We find that LLMs exploit frequency bias for inference and perform poorly on adversarial instances. Furthermore, fine-tuned LLMs exhibit significantly increased reliance on this bias, suggesting that they are learning these frequency patterns from datasets. Finally, we…
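The frequency analysis described in the abstract can be sketched roughly as below. This is a hedged illustration under stated assumptions: it takes predicates as already extracted and frequencies as already counted, and the function name `hypothesis_more_frequent_rate` and all toy numbers are hypothetical. The source does not specify how predicates are extracted or which corpus supplies the counts.

```python
from collections import Counter


def hypothesis_more_frequent_rate(pairs, freq):
    """Fraction of positive pairs whose hypothesis predicate is more frequent.

    pairs: iterable of (premise_predicate, hypothesis_predicate, is_entailment)
    freq:  mapping from predicate to corpus frequency (assumed precomputed)
    """
    positives = [(p, h) for p, h, is_entailment in pairs if is_entailment]
    if not positives:
        return 0.0
    hits = sum(1 for p, h in positives if freq[h] > freq[p])
    return hits / len(positives)


# Invented counts and pairs, for illustration only.
freq = Counter({"buy": 900, "purchase": 120, "own": 1500, "kill": 700, "assassinate": 40})
pairs = [
    ("purchase", "buy", True),
    ("buy", "own", True),
    ("assassinate", "kill", True),
    ("buy", "purchase", True),       # positive pair that goes against the bias
    ("kill", "assassinate", False),  # negative pair, ignored by this statistic
]

rate = hypothesis_more_frequent_rate(pairs, freq)
print(f"{rate:.2f}")
```

A rate well above 0.5 on a real dataset would indicate the kind of bias the abstract describes; the toy data here is constructed only to exercise the function.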
Taxonomy
Topics: Natural Language Processing Techniques · Speech Recognition and Synthesis
