Modeling Information Change in Science Communication with Semantically Matched Paraphrases
Dustin Wright, Jiaxin Pei, David Jurgens, Isabelle Augenstein

TL;DR
This paper introduces SPICED, a dataset of paired scientific findings drawn from news stories, social media discussions, and original papers and annotated for degree of information change, to support tracking how faithfully science is communicated across media.
Contribution
The paper creates SPICED, the first paraphrase dataset of scientific findings annotated for degree of information change, and demonstrates its utility for evidence retrieval in fact checking and for analyzing trends in science communication.
Findings
Models trained on SPICED improve evidence retrieval for fact checking.
The SPICED dataset reveals large-scale trends in the fidelity of science communication.
SPICED poses a challenging task for paraphrase detection in scientific texts.
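To make the matching task concrete, here is a minimal sketch of a purely lexical baseline for pairing a news-style finding with candidate findings from a paper. It is not the authors' method: the function names, the Jaccard token-overlap score, and the example sentences are all assumptions made for illustration. Paraphrased findings often share few surface words, which is one plausible reason a baseline like this would struggle on such data.

```python
import re


def tokens(text):
    """Lowercase a sentence and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Token-overlap (Jaccard) similarity between two sentences, in [0, 1]."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0  # two empty sentences: define similarity as 0
    return len(ta & tb) / len(ta | tb)


def rank_candidates(query, candidates):
    """Return (score, candidate) pairs sorted from most to least similar."""
    scored = [(jaccard(query, c), c) for c in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    # Hypothetical sentences, invented for this example (not taken from SPICED).
    news_finding = "Coffee drinkers were found to sleep less"
    paper_findings = [
        "Caffeine intake was associated with reduced sleep duration",
        "Coffee drinkers were found to exercise more",
    ]
    for score, finding in rank_candidates(news_finding, paper_findings):
        print(f"{score:.2f}  {finding}")
```

In this invented example the lexical score would favor the second candidate, which shares most of its words with the query but reports a different finding, over the first, which paraphrases it.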
Abstract
Whether the media faithfully communicate scientific information has long been a core issue for the science community. Automatically identifying paraphrased scientific findings could enable large-scale tracking and analysis of information changes in the science communication process, but this requires systems to understand the similarity between scientific information across multiple domains. To this end, we present the Scientific Paraphrase and Information Change Dataset (SPICED), the first paraphrase dataset of scientific findings annotated for degree of information change. SPICED contains 6,000 scientific finding pairs extracted from news stories, social media discussions, and full texts of original papers. We demonstrate that SPICED poses a challenging task and that models trained on SPICED improve downstream performance on evidence retrieval for fact checking of real-world scientific…
Taxonomy
Topics: Topic Modeling · Advanced Text Analysis Techniques · Misinformation and Its Impacts
