Topic Modeling and Progression of American Digital News Media During the Onset of the COVID-19 Pandemic
Xiangpeng Wan, Michael C. Lucic, Hakim Ghazzai, Yehia Massoud

TL;DR
This paper presents an NLP pipeline that analyzes and summarizes COVID-19 related digital news articles in the US, tracking how pandemic topics evolved over time to help readers understand the information landscape.
Contribution
It introduces a novel NLP framework combining unsupervised and semi-supervised learning, clustering, and topic modeling to analyze digital news media progression during COVID-19.
Findings
Identified key COVID-19 topics over time
Clustered articles based on similarity using community detection
Mapped the evolution of pandemic-related discussions
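As a rough illustration of clustering articles by similarity with community detection, the sketch below builds a similarity graph over a few short texts and groups them with label propagation. It is a minimal sketch under stated assumptions, not the authors' implementation: this summary does not say which similarity measure or community detection algorithm the paper uses, so bag-of-words cosine similarity and label propagation are stand-ins, and all function names and the `threshold` parameter are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def bag_of_words(text):
    """Lowercase word counts; a stand-in for whatever embedding the paper uses."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_graph(docs, threshold=0.2):
    """Connect two documents when their similarity reaches the (assumed) threshold."""
    vecs = [bag_of_words(d) for d in docs]
    graph = {i: {} for i in range(len(docs))}
    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            sim = cosine(vecs[i], vecs[j])
            if sim >= threshold:
                graph[i][j] = sim
                graph[j][i] = sim
    return graph


def label_propagation(graph, max_iter=20):
    """Weighted label propagation with deterministic tie-breaking."""
    labels = {node: node for node in graph}
    for _ in range(max_iter):
        changed = False
        for node in sorted(graph):
            if not graph[node]:
                continue  # isolated nodes keep their own label
            weights = defaultdict(float)
            for neighbor, w in graph[node].items():
                weights[labels[neighbor]] += w
            best = max(weights.values())
            candidates = [lab for lab, w in weights.items() if w == best]
            # Keep the current label on ties; otherwise take the smallest candidate.
            new_label = labels[node] if labels[node] in candidates else min(candidates)
            if new_label != labels[node]:
                labels[node] = new_label
                changed = True
        if not changed:
            break
    return labels


def detect_communities(docs, threshold=0.2):
    """Return groups of document indices, ordered by their first member."""
    labels = label_propagation(similarity_graph(docs, threshold))
    groups = defaultdict(list)
    for node, lab in labels.items():
        groups[lab].append(node)
    return sorted(sorted(members) for members in groups.values())
```

Calling `detect_communities(articles)` on a list of article texts returns lists of indices, one per detected group. On a real corpus a modularity-based method from a graph library would likely be a better fit; label propagation is used here only because it fits in a few lines.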
Abstract
Currently, the world is in the midst of a severe global pandemic, which has affected all aspects of people's lives. As a result, there is a deluge of COVID-related digital media articles published in the United States, due to the disparate effects of the pandemic. This large volume of information is difficult for the audience to consume in a reasonable amount of time. In this paper, we develop a Natural Language Processing (NLP) pipeline that is capable of automatically distilling various digital articles into manageable pieces of information, while also modelling the progression of topics discussed over time in order to aid readers in rapidly gaining holistic perspectives on pressing issues (i.e., the COVID-19 pandemic) from a diverse array of sources. We achieve these goals by first collecting a large corpus of COVID-related articles during the onset of the pandemic. After, we apply…
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Advanced Text Analysis Techniques · Topic Modeling
Methods: Attention Is All You Need · Linear Layer · Dense Connections · Softmax · Byte Pair Encoding · Dropout · Adam · Layer Normalization · Multi-Head Attention
