ComFact: A Benchmark for Linking Contextual Commonsense Knowledge

Silin Gao; Jena D. Hwang; Saya Kanno; Hiromi Wakaki; Yuki Mitsufuji; Antoine Bosselut

arXiv:2210.12678 · cs.CL · October 25, 2022 · 1 citation

TL;DR

This paper introduces ComFact, a benchmark for linking relevant commonsense knowledge to contexts in dialogues and stories, highlighting the limitations of heuristic methods and demonstrating the potential of learned models to improve knowledge retrieval and downstream tasks.

Contribution

We propose the new task of commonsense fact linking and create the ComFact benchmark with extensive annotations, enabling evaluation of models in identifying situationally-relevant knowledge.

Findings

01

Heuristic fact linking approaches are imprecise.

02

Learned fact linking models improve F1 by approximately 34.6% over heuristic approaches.

03

Enhanced knowledge retrieval boosts dialogue response quality by 9.8%.

Abstract

Understanding rich narratives, such as dialogues and stories, often requires natural language processing systems to access relevant knowledge from commonsense knowledge graphs (KGs). However, these systems typically retrieve facts from KGs using simple heuristics that disregard the complex challenges of identifying situationally-relevant commonsense knowledge (e.g., contextualization, implicitness, ambiguity). In this work, we propose the new task of commonsense fact linking, where models are given contexts and trained to identify situationally-relevant commonsense knowledge from KGs. Our novel benchmark, ComFact, contains ~293k in-context relevance annotations for commonsense triplets across four stylistically diverse dialogue and storytelling datasets. Experimental results confirm that heuristic fact linking approaches are imprecise knowledge extractors. Learned fact linking models…
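To make the task concrete, here is a minimal Python sketch of the kind of simple heuristic the abstract criticizes: a fact is linked to a context whenever its head shares a content word with that context. Everything in it is an assumption made for illustration and is not taken from the paper or its repository: the `Fact` class, the `heuristic_link` function, the stopword list, the relation names, and the four hand-written facts with their relevance labels.

```python
import re
from dataclasses import dataclass

# Tiny illustrative stopword list (an assumption, not the paper's).
STOPWORDS = {"i", "the", "this", "so", "was", "for", "a", "to", "is"}


@dataclass(frozen=True)
class Fact:
    """A commonsense triplet: (head, relation, tail)."""
    head: str
    relation: str
    tail: str


def tokens(text: str) -> set[str]:
    """Lowercase the text and return its set of non-stopword tokens."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def heuristic_link(context: str, fact: Fact, min_overlap: int = 1) -> bool:
    """Link a fact if its head shares at least `min_overlap` tokens with the context."""
    return len(tokens(context) & tokens(fact.head)) >= min_overlap


def precision_recall_f1(predicted: list[bool], gold: list[bool]) -> tuple[float, float, float]:
    """Compute precision, recall, and F1 for binary relevance decisions."""
    tp = sum(p and g for p, g in zip(predicted, gold))
    fp = sum(p and not g for p, g in zip(predicted, gold))
    fn = sum(g and not p for p, g in zip(predicted, gold))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical context and candidate facts with hand-assigned relevance labels.
context = "I missed the bus this morning, so I was late for work."
candidates = [
    (Fact("PersonX misses the bus", "xEffect", "is late"), True),
    (Fact("bus", "AtLocation", "bus stop"), False),            # lexical match, not relevant here
    (Fact("PersonX works hard", "xWant", "to get a promotion"), False),
    (Fact("PersonX oversleeps", "xEffect", "is late"), True),  # relevant but implicit
]

predictions = [heuristic_link(context, fact) for fact, _ in candidates]
gold = [label for _, label in candidates]
precision, recall, f1 = precision_recall_f1(predictions, gold)
print(predictions, precision, recall, f1)
```

In this toy setup the heuristic links the "bus" fact only because of a surface match and misses the implicit "oversleeps" fact, which illustrates the ambiguity and implicitness challenges the abstract names. A learned fact linker would replace `heuristic_link` with a classifier trained on in-context relevance annotations.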

Code & Models

Repositories

silin159/comfact
PyTorch · Official

Datasets

Silin1590/ComFact
dataset · 8 downloads

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems