Research on Predicting Public Opinion Event Heat Levels Based on Large Language Models
Yi Ren, Tianyi Zhang, Weibin Li, DuoMu Zhou, Chenhao Qin, FangCheng Dong

TL;DR
This study explores using large language models, especially GPT-4o, to predict the heat levels of public opinion events based on Chinese social media data, highlighting current accuracy limitations and future potential.
Contribution
It introduces a novel method leveraging large language models for public opinion event heat level prediction and evaluates their performance on a large Chinese dataset.
Findings
GPT-4o and DeepseekV2 achieved around 41% accuracy in predicting heat levels.
Prediction accuracy is higher for low-heat events, reaching over 70%.
Accuracy decreases as heat level increases, indicating data imbalance issues.
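The findings above compare overall accuracy with accuracy at each heat level. A minimal sketch of how such a breakdown could be computed is shown here; the function name `accuracy_by_level`, the intermediate level names ("medium", "high"), and the toy labels are illustrative assumptions and are not data or code from the paper.

```python
from collections import defaultdict


def accuracy_by_level(true_levels, predicted_levels):
    """Return (overall accuracy, {level: accuracy}) for paired label lists."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(true_levels, predicted_levels):
        total[truth] += 1
        correct[truth] += int(truth == pred)
    per_level = {level: correct[level] / total[level] for level in total}
    overall = sum(correct.values()) / sum(total.values())
    return overall, per_level


# Hypothetical toy labels, NOT results from the study.
y_true = ["low", "low", "low", "low", "medium", "medium",
          "high", "high", "very high", "very high"]
y_pred = ["low", "low", "low", "medium", "medium", "low",
          "medium", "high", "high", "high"]

overall, per_level = accuracy_by_level(y_true, y_pred)
for level, acc in per_level.items():
    print(f"{level}: {acc:.2f}")
print(f"overall: {overall:.2f}")
```

Reporting accuracy per level alongside the overall figure makes a pattern like the one described (strong on low-heat events, weaker on hotter ones) visible, where a single aggregate number would hide it.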
Abstract
In recent years, with the rapid development of large language models, several models such as GPT-4o have demonstrated extraordinary capabilities, surpassing human performance in various language tasks. As a result, many researchers have begun exploring their potential applications in the field of public opinion analysis. This study proposes a novel method, based on large language models, for predicting the heat level of public opinion events. First, we preprocessed and classified 62,836 Chinese hot event data collected between July 2022 and December 2023. Then, based on each event's online dissemination heat index, we used the MiniBatchKMeans algorithm to automatically cluster the events and categorize them into four heat levels (ranging from low heat to very high heat). Next, we randomly selected 250 events from each heat level, totalling 1,000 events, to build the evaluation dataset. During the…
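The clustering and sampling steps described in the abstract might look roughly like the following sketch, which uses scikit-learn's `MiniBatchKMeans`. The synthetic heat values, the log scaling, the helper names `assign_heat_levels` and `sample_balanced`, and all hyperparameters are assumptions made for illustration; the paper's actual preprocessing and settings are not given in this excerpt.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans


def assign_heat_levels(heat_index, n_levels=4, seed=0):
    """Cluster 1-D heat values and relabel clusters 0..n_levels-1 by centroid."""
    # Log scaling is an assumption, chosen here because heat values are skewed.
    x = np.log1p(np.asarray(heat_index, dtype=float)).reshape(-1, 1)
    km = MiniBatchKMeans(n_clusters=n_levels, random_state=seed, n_init=3)
    cluster_ids = km.fit_predict(x)
    # Cluster ids are arbitrary, so rank them by centroid: 0 = lowest heat.
    order = np.argsort(km.cluster_centers_.ravel())
    rank_of_cluster = np.empty(n_levels, dtype=int)
    rank_of_cluster[order] = np.arange(n_levels)
    return rank_of_cluster[cluster_ids]


def sample_balanced(levels, per_level, seed=0):
    """Draw the same number of event indices from every heat level."""
    rng = np.random.default_rng(seed)
    picked = [
        rng.choice(np.flatnonzero(levels == lvl), size=per_level, replace=False)
        for lvl in np.unique(levels)
    ]
    return np.concatenate(picked)


# Synthetic stand-in for the heat index (hypothetical, not the study's data).
rng = np.random.default_rng(42)
heat = np.concatenate(
    [np.exp(rng.normal(mu, 0.1, size=1000)) for mu in (2.0, 4.0, 6.0, 8.0)]
)

levels = assign_heat_levels(heat)
sample_idx = sample_balanced(levels, per_level=250)  # 250 per level, as in the abstract
```

Ranking clusters by centroid is needed because k-means cluster ids carry no inherent order, whereas heat levels do. Drawing 250 events per level yields a balanced 1,000-event evaluation set even when the underlying clusters differ greatly in size.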
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Misinformation and Its Impacts · Computational and Text Analysis Methods
