WeCheck: Strong Factual Consistency Checker via Weakly Supervised Learning

Wenhao Wu; Wei Li; Xinyan Xiao; Jiachen Liu; Sujian Li; Yajuan Lv

arXiv:2212.10057 · cs.CL · May 30, 2023 · 1 citation

TL;DR

WeCheck is a weakly supervised factual consistency metric that aggregates weak labels from multiple resources through generative labeling, so that generated text can be checked against its input more accurately.

Contribution

The paper introduces WeCheck, a novel weakly supervised framework that effectively aggregates resources and handles noise to assess factual consistency in text generation.

Findings

01 Achieves a 3.4% improvement on the TRUE benchmark

02 Outperforms previous state-of-the-art methods

03 Demonstrates strong performance across various tasks

Abstract

A crucial issue of current text generation models is that they often uncontrollably generate text that is factually inconsistent with respect to their inputs. Limited by the lack of annotated data, existing works in evaluating factual consistency directly transfer the reasoning ability of models trained on other data-rich upstream tasks like question answering (QA) and natural language inference (NLI) without any further adaptation. As a result, they perform poorly on real generated text and are heavily biased by their single-source upstream tasks. To alleviate this problem, we propose a weakly supervised framework that aggregates multiple resources to train a precise and efficient factual metric, namely WeCheck. WeCheck first utilizes a generative model to accurately label a real generated sample by aggregating its weak labels, which are inferred from multiple resources. Then, we train…
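As a rough illustration of the first step described above (combining several weak labels for one generated sample into a single training label), here is a minimal sketch. It is not the paper's generative labeling model: it swaps in a plain weighted average as a simple stand-in, and the function names, the weights, and the 0.5 threshold are all hypothetical choices made for this example.

```python
from typing import Optional, Sequence


def aggregate_weak_labels(
    scores: Sequence[Optional[float]],
    weights: Optional[Sequence[float]] = None,
) -> Optional[float]:
    """Combine weak consistency scores in [0, 1] into one soft label.

    ASSUMPTION: a weighted average is used here purely for illustration.
    The paper describes a generative labeling model instead.
    A score of None means that weak source abstained on this sample.
    """
    if weights is None:
        weights = [1.0] * len(scores)
    if len(weights) != len(scores):
        raise ValueError("scores and weights must have the same length")

    total = 0.0
    norm = 0.0
    for score, weight in zip(scores, weights):
        if score is None:
            continue  # skip sources that gave no label
        total += weight * score
        norm += weight

    # If every source abstained, there is no label to return.
    return total / norm if norm > 0 else None


def to_binary_label(soft_label: float, threshold: float = 0.5) -> int:
    """Map a soft label to 1 (consistent) or 0 (inconsistent).

    ASSUMPTION: the 0.5 threshold is a hypothetical default.
    """
    return 1 if soft_label >= threshold else 0


if __name__ == "__main__":
    # Hypothetical weak scores from, e.g., an NLI-based and a QA-based metric,
    # plus a third source that abstained.
    soft = aggregate_weak_labels([0.9, 0.7, None])
    print(soft, to_binary_label(soft))
```

The soft label produced this way could then serve as a noisy training target for a downstream classifier, which is the general pattern the abstract outlines.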

Code & Models

Repositories

nightdessert/WeCheck/blob/main/README.md
pytorch

Models

🤗 nightdessert/WeCheck · model · 29 downloads · 2 likes

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications