# Evaluating Large Language Models for Sentiment Analysis and Hesitancy Analysis on Vaccine Posts From Social Media: Qualitative Study

**Authors:** Augustine Annan, Amanda L Eiden, Dong Wang, Jingcheng Du, Majid Rastegar-Mojarad, Varun Kumar Nomula, Xiaoyan Wang

PMC · DOI: 10.2196/64723 · 2025-10-15

## TL;DR

This study compares large language models for analyzing vaccine sentiment and hesitancy on social media, finding GPT-4 to be the most accurate, although zero-shot GPT-4 is still less cost-effective than traditional machine learning.

## Contribution

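The study evaluates GPT-3.5, GPT-4, Claude-3 Sonnet, and Llama 2 on vaccine sentiment and hesitancy analysis of posts from X, Reddit, and YouTube, comparing zero-shot, 1-shot, and few-shot learning paradigms and adding a cost analysis based on token usage.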

## Key findings

- GPT-4 (F1-score=0.85, accuracy=0.83) outperformed GPT-3.5, Llama 2, and Claude-3 Sonnet in vaccine sentiment and hesitancy analysis.
- Few-shot learning provided minimal performance gains but increased computational costs.
- Zero-shot learning was found to be computationally more efficient than few-shot learning.

## Abstract

In the digital age, social media has become a crucial platform for public discourse on diverse health-related topics, including vaccines. Efficient sentiment analysis and hesitancy detection are essential for understanding public opinions and concerns. Large language models (LLMs) offer advanced capabilities for processing complex linguistic patterns, potentially providing valuable insights into vaccine-related discourse.

This study aims to evaluate the performance of various LLMs in sentiment analysis and hesitancy detection related to vaccine discussions on social media and identify the most efficient, accurate, and cost-effective model for detecting vaccine-related public sentiment and hesitancy trends.

We used several LLMs (generative pretrained transformer [GPT]-3.5, GPT-4, Claude-3 Sonnet, and Llama 2) to process and classify complex linguistic data related to human papillomavirus; measles, mumps, and rubella; and vaccines overall from X (formerly known as Twitter), Reddit, and YouTube. The models were tested under three learning paradigms (zero-shot, 1-shot, and few-shot) to determine their adaptability and learning efficiency with varying numbers of labeled examples. We evaluated the models’ performance using accuracy, F1-score, precision, and recall. In addition, we conducted a cost analysis focused on token usage to assess the computational efficiency of each approach.
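To make the three learning paradigms and the token-usage comparison concrete, the following is a minimal sketch and not the authors' actual prompts. The label set, the demonstration posts, the `build_prompt` helper, and the whitespace-based `approx_tokens` proxy are all assumptions made for illustration; real tokenizers count differently.

```python
from typing import List, Tuple

# Hypothetical label set; the study's exact categories are not listed in this summary.
LABELS = ["positive", "negative", "neutral"]


def build_prompt(post: str, examples: List[Tuple[str, str]]) -> str:
    """Build a classification prompt with zero or more labeled examples."""
    lines = [
        "Classify the sentiment of the vaccine-related post as one of: "
        + ", ".join(LABELS) + "."
    ]
    for text, label in examples:
        lines.append(f"Post: {text}\nSentiment: {label}")
    # The post to classify goes last, with the label left open.
    lines.append(f"Post: {post}\nSentiment:")
    return "\n\n".join(lines)


def approx_tokens(prompt: str) -> int:
    """Crude token proxy: whitespace-separated words (an assumption)."""
    return len(prompt.split())


# Invented demonstration posts, not drawn from the study's data.
DEMOS = [
    ("Got my HPV shot today, glad it's done.", "positive"),
    ("I don't trust the MMR vaccine at all.", "negative"),
    ("The clinic offers vaccines on Tuesdays.", "neutral"),
]

post = "Not sure whether my kids need the MMR booster."
zero_shot = build_prompt(post, [])
one_shot = build_prompt(post, DEMOS[:1])
few_shot = build_prompt(post, DEMOS)

for name, prompt in [("zero-shot", zero_shot), ("1-shot", one_shot), ("few-shot", few_shot)]:
    print(f"{name}: ~{approx_tokens(prompt)} tokens")
```

Every added demonstration lengthens the prompt sent with each post, so input token usage grows with the number of examples, which is the cost the analysis weighs against any accuracy gain.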

GPT-4 (F1-score=0.85 and accuracy=0.83) outperformed GPT-3.5, Llama 2, and Claude-3 Sonnet across various metrics, regardless of the sentiment type or learning paradigm. Few-shot learning did not significantly enhance performance compared with the zero-shot paradigm. Moreover, the increased computational costs and token usage associated with few-shot learning did not justify its application, given the marginal improvement in model performance. The analysis highlighted challenges in classifying neutral sentiments and convenience, correctly interpreting sarcasm, and accurately identifying indirect expressions of vaccine hesitancy, emphasizing the need for model refinement.
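For readers unfamiliar with the reported metrics, here is a small sketch of how accuracy, precision, recall, and F1-score can be computed for a multiclass task. Macro-averaging across classes is an assumption; this summary does not state which averaging the study used, and the labels below are invented.

```python
from typing import Dict, List


def evaluate(y_true: List[str], y_pred: List[str]) -> Dict[str, float]:
    """Return accuracy plus macro-averaged precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Invented gold labels and model predictions for four posts.
gold = ["positive", "positive", "negative", "neutral"]
pred = ["positive", "negative", "negative", "neutral"]
scores = evaluate(gold, pred)
print({name: round(value, 3) for name, value in scores.items()})
```

Because F1-score combines precision and recall per class, it can diverge from accuracy when classes are imbalanced, which is one reason studies commonly report both.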

GPT-4 emerged as the most accurate model, excelling in sentiment and hesitancy analysis. Performance differences between learning paradigms were minimal, making zero-shot learning preferable for its balance of accuracy and computational efficiency. However, the zero-shot GPT-4 model is not the most cost-effective compared with traditional machine learning. A hybrid approach, using LLMs for initial annotation and traditional models for training, could optimize cost and performance. Despite reliance on specific LLM versions and a limited focus on certain vaccine types and platforms, our findings underscore the capabilities and limitations of LLMs in vaccine sentiment and hesitancy analysis, highlighting the need for ongoing evaluation and adaptation in public health communication strategies.
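The hybrid approach mentioned above could be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: `llm_annotate` is a hypothetical keyword-based stand-in for an LLM call so the example runs offline, the two labels are invented, and a small multinomial naive Bayes classifier is swapped in as one example of a "traditional model".

```python
import math
from collections import Counter, defaultdict
from typing import List


def llm_annotate(post: str) -> str:
    """Hypothetical stand-in for an LLM labeling call (keyword rule only)."""
    text = post.lower()
    if "don't trust" in text or "unsafe" in text:
        return "hesitant"
    return "not_hesitant"


class NaiveBayes:
    """Minimal multinomial naive Bayes with add-one smoothing."""

    def fit(self, texts: List[str], labels: List[str]) -> "NaiveBayes":
        self.word_counts = defaultdict(Counter)
        self.class_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text: str) -> str:
        total = sum(self.class_counts.values())

        def score(label: str) -> float:
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            log_prob = math.log(self.class_counts[label] / total)
            for word in text.lower().split():
                log_prob += math.log((counts[word] + 1) / denom)
            return log_prob

        return max(self.class_counts, key=score)


# Step 1: the (stand-in) LLM labels a small unlabeled sample once.
posts = [
    "I don't trust this vaccine",
    "This vaccine is unsafe for kids",
    "Got my shot today and feel great",
    "Vaccines protect kids at school",
]
labels = [llm_annotate(p) for p in posts]

# Step 2: a cheap traditional model is trained on those labels
# and then used for all further classification.
model = NaiveBayes().fit(posts, labels)
print(model.predict("unsafe vaccine"))
```

The design intent is that the per-post LLM cost is paid only for the annotated sample, while the trained classifier handles the remaining volume at negligible cost.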

## Linked entities

- **Diseases:** measles (MONDO:0004619), mumps (MONDO:0000989), rubella (MONDO:0004656)

## Full-text entities

- **Diseases:** measles, mumps, and rubella (MESH:D009107)
- **Species:** Human papillomavirus (species) [taxon 10566]

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12526656/full.md

---
Source: https://tomesphere.com/paper/PMC12526656