# Message Distortion in Information Cascades

**Authors:** Manoel Horta Ribeiro, Kristina Gligorić, Robert West

arXiv: 1902.09197 · 2019-06-11

## TL;DR

This study examines how information changes as it spreads through cascades. Iterative summarization accumulates distortion, with the conclusion of a medical abstract distorted most, while extractive summaries are less prone to semantic distortion than abstractive ones. The findings bear on how science can be misreported even in the absence of malevolent actors.

## Contribution

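The paper introduces a controlled, crowdsourced experiment for measuring message distortion in information cascades. By comparing cascades against direct-to-target-length control summaries, it separates the effect of accumulated distortion from that of the length constraint, and it compares abstractive with extractive summaries.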

## Key findings

- Iterative summarization has a negative impact because errors accumulate from step to step.
- High-quality intermediate summaries result in less distorted messages than in the control case.
- Extractive summaries are less prone to semantic distortion.

## Abstract

Information diffusion is usually modeled as a process in which immutable pieces of information propagate over a network. In reality, however, messages are not immutable, but may be morphed with every step, potentially entailing large cumulative distortions. This process may lead to misinformation even in the absence of malevolent actors, and understanding it is crucial for modeling and improving online information systems. Here, we perform a controlled, crowdsourced experiment in which we simulate the propagation of information from medical research papers. Starting from the original abstracts, crowd workers iteratively shorten previously produced summaries to increasingly smaller lengths. We also collect control summaries where the original abstract is compressed directly to the final target length. Comparing cascades to controls allows us to separate the effect of the length constraint from that of accumulated distortion. Via careful manual coding, we annotate lexical and semantic units in the medical abstracts and track them along cascades. We find that iterative summarization has a negative impact due to the accumulation of error, but that high-quality intermediate summaries result in less distorted messages than in the control case. Different types of information behave differently; in particular, the conclusion of a medical abstract (i.e., its key message) is distorted most. Finally, we compare abstractive with extractive summaries, finding that the latter are less prone to semantic distortion. Overall, this work is a first step in studying information cascades without the assumption that disseminated content is immutable, with implications on our understanding of the role of word-of-mouth effects on the misreporting of science.
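The cascade-versus-control comparison described in the abstract can be illustrated with a small sketch. It is not the authors' code or their coding scheme: the unit labels, the toy data, and the `unit_survival` helper are all hypothetical, and each summary is simplified to the set of annotated units it still contains.

```python
from typing import List, Set


def unit_survival(original: Set[str], summaries: List[Set[str]]) -> List[float]:
    """Fraction of the original abstract's units still present in each summary.

    Hypothetical helper for illustration; the paper's manual annotation
    is far richer than set membership.
    """
    if not original:
        raise ValueError("original must contain at least one unit")
    return [len(original & summary) / len(original) for summary in summaries]


# Made-up annotated units of one abstract (assumption, not data from the paper).
original = {"population", "intervention", "outcome", "conclusion"}

# A cascade: each summary is produced by shortening the previous one.
cascade = [
    {"population", "intervention", "outcome", "conclusion"},
    {"intervention", "outcome", "conclusion"},
    {"intervention", "outcome"},
]

# A control: the original abstract compressed directly to the final length.
control = {"intervention", "conclusion"}

cascade_survival = unit_survival(original, cascade)
control_survival = unit_survival(original, [control])[0]

# Units kept by the control but lost along the cascade at the same target length.
lost_only_in_cascade = (original & control) - cascade[-1]

print(cascade_survival, control_survival, sorted(lost_only_in_cascade))
```

In this toy example the final cascade summary and the control retain the same number of units, so the length constraint alone does not explain why the cascade dropped the conclusion while the control kept it. That kind of difference is what comparing cascades to controls is meant to expose.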

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1902.09197/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/1902.09197/full.md

## References

58 references — full list in the complete paper: https://tomesphere.com/paper/1902.09197/full.md

---
Source: https://tomesphere.com/paper/1902.09197