MetaSumPerceiver: Multimodal Multi-Document Evidence Summarization for Fact-Checking
Ting-Chih Chen, Chia-Wei Tang, Chris Thomas

TL;DR
MetaSumPerceiver is a multimodal, multi-document summarization model designed to generate claim-specific summaries to aid fact-checking, utilizing a dynamic perceiver architecture and reinforcement learning for evidence extraction.
Contribution
The paper introduces a novel multimodal, multi-document summarization model with a dynamic perceiver architecture and reinforcement learning-based training for fact-checking.
Findings
Outperforms the state-of-the-art approach by 4.6% on the MOCHEG dataset
Performs well on the new Multi-News-Fact-Checking dataset
Handles arbitrary-length multimodal inputs
Abstract
Fact-checking real-world claims often requires reviewing multiple multimodal documents to assess a claim's truthfulness, which is a highly laborious and time-consuming task. In this paper, we present a summarization model designed to generate claim-specific summaries useful for fact-checking from multimodal, multi-document datasets. The model takes inputs in the form of documents, images, and a claim, with the objective of assisting in fact-checking tasks. We introduce a dynamic perceiver-based model that can handle inputs from multiple modalities of arbitrary lengths. To train our model, we leverage a novel reinforcement learning-based entailment objective to generate summaries that provide evidence distinguishing between different truthfulness labels. To assess the efficacy of our approach, we conduct experiments on both an existing benchmark and a new dataset of multi-document claims…
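The abstract's claim that a perceiver-based model can handle inputs of arbitrary length can be illustrated with a minimal sketch of perceiver-style cross-attention. This is not the authors' implementation: the latent count, embedding size, random stand-in embeddings, and the function names `cross_attend` and `random_tokens` are all assumptions made for illustration, and the learned query/key/value projections of a real perceiver are omitted. The point is only that a fixed set of latent vectors attends over the input sequence, so the output size does not depend on how many text or image tokens come in.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(latents, tokens):
    """Each latent vector queries every input token (scaled dot-product attention).

    Returns one output vector per latent, so the result has len(latents) rows
    regardless of len(tokens). Learned projections are omitted (assumption).
    """
    dim = len(latents[0])
    scale = 1.0 / math.sqrt(dim)
    out = []
    for q in latents:
        scores = [scale * sum(a * b for a, b in zip(q, t)) for t in tokens]
        weights = softmax(scores)
        out.append([sum(w * t[d] for w, t in zip(weights, tokens)) for d in range(dim)])
    return out


def random_tokens(n, dim, rng):
    """Hypothetical stand-in for encoder outputs: n random embeddings of size dim."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
DIM, NUM_LATENTS = 8, 4  # illustrative sizes, not taken from the paper
latents = random_tokens(NUM_LATENTS, DIM, rng)  # would be learned in practice

# Text and image embeddings concatenated into a single input sequence.
short_input = random_tokens(5, DIM, rng) + random_tokens(2, DIM, rng)
long_input = random_tokens(300, DIM, rng) + random_tokens(40, DIM, rng)

short_out = cross_attend(latents, short_input)
long_out = cross_attend(latents, long_input)

# Both outputs have NUM_LATENTS rows of size DIM, whatever the input length.
print(len(short_out), len(short_out[0]))
print(len(long_out), len(long_out[0]))
```

Because the latent array has a fixed size, attention cost grows linearly with the number of input tokens, and any downstream summarization component sees a constant-size representation.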
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
