Large Language Models' Accuracy in Emulating Human Experts' Evaluation of Public Sentiments about Heated Tobacco Products on Social Media

Kwanho Kim; Soojong Kim

arXiv:2502.01658·cs.CL·March 6, 2025

TL;DR

This study evaluates the accuracy of GPT-3.5 and GPT-4 Turbo in replicating human sentiment analysis of social media messages about heated tobacco products, highlighting GPT-4 Turbo's superior performance.

Contribution

It provides empirical evidence on the effectiveness of LLMs, especially GPT-4 Turbo, in automating sentiment analysis for tobacco control research.

Findings

01

GPT-4 Turbo achieved up to 81.7% accuracy on Facebook messages.

02

With only three responses per message, GPT-4 Turbo reached 99% of the accuracy it achieved with twenty responses.

03

LLMs showed higher accuracy on anti- and pro-HTPs messages than neutral ones.

Abstract

Sentiment analysis of alternative tobacco products on social media is important for tobacco control research. Large Language Models (LLMs) can help streamline the labor-intensive human sentiment analysis process. This study examined the accuracy of LLMs in replicating human sentiment evaluation of social media messages about heated tobacco products (HTPs). The research used GPT-3.5 and GPT-4 Turbo to classify 500 Facebook and 500 Twitter messages, including anti-HTPs, pro-HTPs, and neutral messages. The models evaluated each message up to 20 times, and their majority label was compared to human evaluators. Results showed that GPT-3.5 accurately replicated human sentiment 61.2% of the time for Facebook messages and 57.0% for Twitter messages. GPT-4 Turbo performed better, with 81.7% accuracy for Facebook and 77.0% for Twitter. Using three response instances, GPT-4 Turbo achieved 99%…
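The evaluation procedure described in the abstract (collect repeated model responses per message, take the majority label, compare it with the human label) can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration: the function names `majority_label` and `accuracy`, the label strings, the tie-breaking rule, and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter


def majority_label(responses):
    """Return the most frequent label among repeated model responses.

    Assumption: ties are broken in favor of the label seen first,
    which is how Counter.most_common orders equal counts.
    """
    return Counter(responses).most_common(1)[0][0]


def accuracy(model_runs, human_labels, k=None):
    """Share of messages whose majority model label matches the human label.

    model_runs: one list of repeated model labels per message.
    k: if given, use only the first k responses for each message.
    """
    correct = 0
    for responses, human in zip(model_runs, human_labels):
        subset = responses if k is None else responses[:k]
        if majority_label(subset) == human:
            correct += 1
    return correct / len(human_labels)


# Hypothetical toy data: three messages, three model responses each.
toy_runs = [
    ["anti", "anti", "pro"],
    ["neutral", "pro", "pro"],
    ["pro", "neutral", "neutral"],
]
toy_human = ["anti", "pro", "pro"]

print(accuracy(toy_runs, toy_human))       # majority over all responses
print(accuracy(toy_runs, toy_human, k=1))  # first response only
```

Comparing `accuracy(..., k=3)` with `accuracy(..., k=20)` on real data would give the kind of ratio the findings report for three versus twenty responses, under the assumptions stated above.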

Taxonomy

Topics: Sentiment Analysis and Opinion Mining

Methods: Attention Is All You Need · Cosine Annealing · Label Smoothing · Linear Layer · Multi-Head Attention · Position-Wise Feed-Forward Layer · Adam · Softmax · Dropout