Filling the Gap: Is Commonsense Knowledge Generation useful for Natural Language Inference?
Chathuri Jayaweera, Brianna Yanqui, Bonnie Dorr

TL;DR
This paper investigates whether generating and selectively using commonsense knowledge from Large Language Models can improve Natural Language Inference performance, showing modest accuracy gains and bias mitigation.
Contribution
It demonstrates that a hybrid approach with selective commonsense axioms enhances NLI accuracy and reduces bias, highlighting the value of targeted knowledge access.
Findings
Selectively provided axioms improve NLI accuracy by 1.99% to 6.88% across tested configurations
Targeted commonsense knowledge reduces Neutral class bias
Hybrid approach outperforms baseline models
Abstract
Natural Language Inference (NLI) is the task of determining whether a premise entails, contradicts, or is neutral with respect to a given hypothesis. The task is often framed as emulating human inferential processes, in which commonsense knowledge plays a major role. This study examines whether Large Language Models (LLMs) can generate useful commonsense axioms for Natural Language Inference, and evaluates their impact on performance using the SNLI and ANLI benchmarks with the Llama-3.1-70B and gpt-oss-120b models. We show that a hybrid approach, which selectively provides highly factual axioms based on judged helpfulness, yields consistent accuracy improvements of 1.99% to 6.88% across tested configurations, demonstrating the effectiveness of selective knowledge access for NLI. We also find that this targeted use of commonsense knowledge helps models overcome a bias toward the Neutral…
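The selective step described in the abstract can be pictured with a small sketch. This is an illustration only, not the authors' implementation: the `Axiom` fields, the 0.8 factuality threshold, the prompt wording, and the `classify` callable (a stand-in for a call to an LLM) are all assumptions made for the example. The idea shown is that axioms are passed to the classifier only when they are judged helpful and sufficiently factual, and that the prompt otherwise falls back to the plain premise and hypothesis.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Axiom:
    text: str
    factuality: float  # hypothetical factuality score in [0, 1]
    helpful: bool      # hypothetical helpfulness judgment


def select_axioms(axioms: List[Axiom], min_factuality: float = 0.8) -> List[Axiom]:
    """Keep only axioms judged helpful and scored at or above the threshold."""
    return [a for a in axioms if a.helpful and a.factuality >= min_factuality]


def build_prompt(premise: str, hypothesis: str, axioms: List[Axiom]) -> str:
    """Build an NLI prompt; the axiom section is added only if any axioms remain."""
    lines = [f"Premise: {premise}", f"Hypothesis: {hypothesis}"]
    if axioms:
        lines.append("Commonsense axioms:")
        lines.extend(f"- {a.text}" for a in axioms)
    lines.append("Answer with Entailment, Contradiction, or Neutral.")
    return "\n".join(lines)


def hybrid_predict(
    premise: str,
    hypothesis: str,
    axioms: List[Axiom],
    classify: Callable[[str], str],
) -> str:
    """Filter the axioms, then hand the resulting prompt to the classifier.

    `classify` is a placeholder for a model call; it is not a real API.
    """
    kept = select_axioms(axioms)
    return classify(build_prompt(premise, hypothesis, kept))
```

In this sketch, an example with no axiom passing the filter is classified from the premise and hypothesis alone, which is one way to read "selective knowledge access"; the paper may define the selection criteria differently.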
Taxonomy
Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Multimodal Machine Learning Applications
