Measuring News Similarity Across Ten U.S. News Sites
Grant C. Atkins, Alexander Nwala, Michele C. Weigle, Michael L. Nelson

TL;DR
This paper presents a method to identify top news stories and measure their similarity across ten U.S. news websites over three months, revealing how news coverage converges around major events like elections.
Contribution
The paper introduces a novel approach combining headline extraction and cosine similarity to quantify news similarity and identify top stories across multiple news sites.
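As a rough illustration of the similarity step, the sketch below computes cosine similarity between bag-of-words vectors built from each site's extracted headlines, then averages over all pairs of sites. The summary does not specify tokenization, term weighting, or how scores are aggregated, so the raw term-frequency vectors, the regex tokenizer, the mean-pairwise aggregation, and all function and site names here are assumptions made for illustration, not the authors' implementation.

```python
import math
import re
from collections import Counter
from itertools import combinations


def term_vector(headlines):
    """Build a raw term-frequency vector from a list of headline strings.

    Assumption: lowercase alphanumeric tokens, no stop-word removal or stemming.
    """
    tokens = []
    for headline in headlines:
        tokens.extend(re.findall(r"[a-z0-9]+", headline.lower()))
    return Counter(tokens)


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


def mean_pairwise_similarity(site_headlines):
    """Average cosine similarity over all unordered pairs of sites.

    Assumption: a single daily score is the mean over site pairs.
    """
    vectors = {site: term_vector(h) for site, h in site_headlines.items()}
    pairs = list(combinations(vectors, 2))
    if not pairs:
        return 0.0
    total = sum(cosine_similarity(vectors[x], vectors[y]) for x, y in pairs)
    return total / len(pairs)


# Hypothetical headlines for three made-up sites on one day.
sample = {
    "site_a": ["Election results announced tonight"],
    "site_b": ["Election results expected tonight"],
    "site_c": ["Local team wins championship"],
}
score = mean_pairwise_similarity(sample)
```

Running this once per day over archived homepages would give a time series of scores, which is the kind of signal the findings describe around Election Day.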
Findings
Similarity increased both before and after Election Day, indicating convergence in news coverage.
The method effectively identifies top stories and measures their similarity over time.
News coverage shows notable shifts during major events like elections.
Abstract
News websites make editorial decisions about what stories to include on their website homepages and what stories to emphasize (e.g., large font size for main story). The emphasized stories on a news website are often highly similar to those on many other news websites (e.g., a terrorist event story). The selective emphasis of a top news story and the similarity of news across different news organizations are well-known phenomena but not well-measured. We provide a method for identifying the top news story for a select set of U.S.-based news websites and then quantify the similarity across them. To achieve this, we first developed a headline and link extractor that parses select websites, and then examined ten United States-based news website homepages during a three-month period, November 2016 to January 2017. Using archived copies, retrieved from the Internet Archive (IA), we discuss the methods…
