$\rm{C {\small IS}}^2$: A Simplified Commonsense Inference Evaluation for Story Prose
Bryan Li, Lara J. Martin, and Chris Callison-Burch

TL;DR
This paper introduces $\rm{C {\small IS}}^2$, a simplified evaluation task for commonsense inference in story prose that isolates inference ability from language generation, highlighting the need to separate these aspects in NLP models.
Contribution
The paper proposes a new, simplified task, $\rm{C {\small IS}}^2$, that avoids conflating language generation with inference evaluation in commonsense reasoning within story prose.
Findings
Highlights that existing tasks such as GLUCOSE conflate language generation with commonsense inference
Demonstrates the importance of disentangling language generation from inference when evaluating models
Proposes a simplified, non-generative evaluation framework for contextual commonsense inference (CCI)
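To illustrate why a non-generative framing removes the conflation, consider scoring a model that only has to select a sentence from the story rather than write one. Under that assumption, evaluation reduces to exact-match accuracy over sentence indices, so fluent but wrong output earns no credit. The following is a minimal sketch of that idea, not the paper's evaluation code; the function name `selection_accuracy` and the toy indices are hypothetical.

```python
from typing import Sequence


def selection_accuracy(predicted: Sequence[int], gold: Sequence[int]) -> float:
    """Fraction of examples where the selected sentence index equals the gold index."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("need at least one example")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


# Hypothetical data: for each story, the index of the sentence chosen as the
# inference for a given target sentence. These values are made up.
gold_indices = [0, 2, 1, 3]
predicted_indices = [0, 2, 4, 3]

accuracy = selection_accuracy(predicted_indices, gold_indices)
print(f"selection accuracy: {accuracy:.2f}")
```

Because the score depends only on which sentence is picked, it cannot be raised by producing more grammatical text, which is the separation the task is meant to provide.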
Abstract
Transformers have been showing near-human performance on a variety of tasks, but they are not without their limitations. We discuss the issue of conflating results of transformers that are instructed to do multiple tasks simultaneously. In particular, we focus on the domain of commonsense reasoning within story prose, which we call contextual commonsense inference (CCI). We look at the GLUCOSE (Mostafazadeh et al. 2020) dataset and task for predicting implicit commonsense inferences between story sentences. Since the GLUCOSE task simultaneously generates sentences and predicts the CCI relation, there is a conflation in the results. Is the model really measuring CCI or is its ability to generate grammatical text carrying the results? In this paper, we introduce the task contextual commonsense inference in sentence selection ($\rm{C {\small IS}}^2$), a simplified task that avoids…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
