3DLNews: A Three-decade Dataset of US Local News Articles
Gangani Ariyarathne, Alexander C. Nwala

TL;DR
3DLNews is a comprehensive dataset of nearly one million US local news articles from 1996 to 2024, covering diverse media sources and enriched with metadata, enabling various research applications.
Contribution
The paper introduces 3DLNews, a large-scale, multi-decade dataset of US local news articles with detailed metadata, created through systematic scraping and filtering methods.
Findings
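The dataset covers 1996 to 2024 and contains almost 1 million article URLs (with HTML text) from over 14,000 local newspapers, TV, and radio stations across all 50 states.
It includes metadata such as the names and geo-coordinates of the source news media organizations and article publication dates.
The authors demonstrate its utility by outlining four example applications.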
Abstract
We present 3DLNews, a novel dataset with local news articles from the United States spanning the period from 1996 to 2024. It contains almost 1 million URLs (with HTML text) from over 14,000 local newspapers, TV, and radio stations across all 50 states, and provides a broad snapshot of the US local news landscape. The dataset was collected by scraping Google and Twitter search results. We employed a multi-step filtering process to remove non-news article links and enriched the dataset with metadata such as the names and geo-coordinates of the source news media organizations, article publication dates, etc. Furthermore, we demonstrated the utility of 3DLNews by outlining four applications.
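The abstract mentions a multi-step filtering process for removing non-news article links but does not spell out the steps. The following is a minimal sketch of what such a URL filter might look like, not the authors' actual pipeline. The extension list, the path-segment list, and the function names `looks_like_article` and `filter_article_urls` are all hypothetical and chosen only for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical heuristics for illustration; NOT the rules used to build 3DLNews.
NON_ARTICLE_EXTENSIONS = (".pdf", ".jpg", ".jpeg", ".png", ".gif", ".mp3", ".mp4")
NON_ARTICLE_SEGMENTS = {"tag", "tags", "category", "author", "search", "login", "subscribe"}


def looks_like_article(url: str) -> bool:
    """Return True if the URL passes a few simple 'is this an article?' checks."""
    parts = urlsplit(url)
    # Step 1: keep only web links with a host.
    if parts.scheme not in ("http", "https") or not parts.netloc:
        return False
    path = parts.path.lower().rstrip("/")
    # Step 2: drop bare homepages (empty path).
    if not path:
        return False
    # Step 3: drop links to media or document files.
    if path.endswith(NON_ARTICLE_EXTENSIONS):
        return False
    # Step 4: drop index-style pages such as tag or author listings.
    segments = [s for s in path.split("/") if s]
    if any(s in NON_ARTICLE_SEGMENTS for s in segments):
        return False
    return True


def filter_article_urls(urls):
    """Apply the checks above and drop duplicates that differ only by query string or trailing slash."""
    seen = set()
    kept = []
    for url in urls:
        parts = urlsplit(url)
        key = (parts.netloc.lower(), parts.path.rstrip("/"))
        if key in seen or not looks_like_article(url):
            continue
        seen.add(key)
        kept.append(url)
    return kept


if __name__ == "__main__":
    candidates = [
        "https://example-news.com/",
        "https://example-news.com/2020/05/01/city-council-vote",
        "https://example-news.com/2020/05/01/city-council-vote/?utm=x",
        "https://example-news.com/tag/sports",
        "https://example-news.com/files/report.pdf",
    ]
    for url in filter_article_urls(candidates):
        print(url)
```

Each check is deliberately cheap and works on the URL alone. A real pipeline would likely need source-specific rules and content-level checks as well.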
Taxonomy
Topics: Computational and Text Analysis Methods
