Estimating Commonsense Plausibility through Semantic Shifts

Wanqing Cui; Wei Huang; Keping Bi; Jiafeng Guo; Xueqi Cheng

arXiv:2502.13464·cs.CL·April 21, 2026

TL;DR

ComPaSS is a discriminative framework that estimates commonsense plausibility by measuring how much an augmentation shifts a sentence's semantics. It outperforms generative methods on two types of fine-grained plausibility tasks across LLM and VLM backbones.

Contribution

Introduces ComPaSS, a discriminative approach to fine-grained commonsense plausibility estimation that scores an augmentation by the semantic shift it induces, rather than by likelihoods or verbalized judgments as existing generative methods do.

Findings

1. ComPaSS outperforms baselines on multiple plausibility tasks.

2. VLMs outperform LMs when used with ComPaSS.

3. Contrastive pre-training enhances semantic nuance detection.

Abstract

Commonsense plausibility estimation is critical for evaluating language models (LMs), yet existing generative approaches, which rely on likelihoods or verbalized judgments, struggle with fine-grained discrimination. In this paper, we propose ComPaSS, a novel discriminative framework that quantifies commonsense plausibility by measuring semantic shifts when augmenting sentences with commonsense-related information. Plausible augmentations induce minimal shifts in semantics, while implausible ones result in substantial deviations. Evaluations on two types of fine-grained commonsense plausibility estimation tasks across different backbones, including LLMs and vision-language models (VLMs), show that ComPaSS consistently outperforms baselines. This demonstrates the advantage of discriminative approaches over generative methods in fine-grained commonsense plausibility evaluation. Experiments also…
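
The scoring idea in the abstract can be illustrated with a minimal sketch. It assumes that a semantic shift is measured as the cosine similarity between an embedding of the base sentence and an embedding of its augmented version, so that a higher similarity means a smaller shift and a more plausible augmentation. The abstract does not specify the similarity measure or the encoder, so the function names (`plausibility_score`, `rank_augmentations`), the `embed` callable, and the hand-written toy vectors below are all hypothetical stand-ins, not the authors' implementation.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(u: Vector, v: Vector) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def plausibility_score(
    base: str, augmented: str, embed: Callable[[str], Vector]
) -> float:
    """Assumed scoring rule: a smaller semantic shift (higher similarity
    between base and augmented sentence) means a more plausible augmentation."""
    return cosine_similarity(embed(base), embed(augmented))


def rank_augmentations(
    base: str, candidates: Sequence[str], embed: Callable[[str], Vector]
) -> List[Tuple[str, float]]:
    """Sort candidate augmentations from most to least plausible."""
    scored = [(c, plausibility_score(base, c, embed)) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy embedder: hand-written 2-d vectors chosen for illustration
# only. A real setup would call a sentence encoder (LM or VLM) here instead.
TOY_VECTORS = {
    "a banana": (1.0, 0.0),
    "a yellow banana": (0.9, 0.1),
    "a blue banana": (0.2, 0.8),
}


def toy_embed(sentence: str) -> Vector:
    return TOY_VECTORS[sentence]


ranking = rank_augmentations(
    "a banana", ["a blue banana", "a yellow banana"], toy_embed
)
```

With these toy vectors, `ranking` lists "a yellow banana" ahead of "a blue banana", because the former moves the representation less from the base sentence. Passing the encoder in as a callable keeps the scoring logic independent of the backbone, which mirrors the abstract's comparison across LLM and VLM backbones.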
