Large Language Models and Thematic Analysis: Human-AI Synergy in Researching Hate Speech on Social Media
Petre Breazu, Miriam Schirmer, Songbo Hu, Napoleon Katsos

TL;DR
This paper explores the use of GPT-4 for thematic analysis of social media data, highlighting its potential and limitations in qualitative research within social sciences.
Contribution
It provides an experimental case study on applying LLMs to thematic analysis of social media content, an area previously underexplored.
Findings
GPT-4 can assist in thematic analysis of social media data.
LLMs offer scalability and efficiency but have limitations in qualitative nuance.
The study discusses future applications of LLMs in social sciences.
Abstract
In the dynamic field of artificial intelligence (AI), the development and application of Large Language Models (LLMs) for text analysis are of significant academic interest. Despite the promising capabilities of various LLMs in conducting qualitative analysis, their use in the humanities and social sciences has not been thoroughly examined. This article contributes to the emerging literature on LLMs in qualitative analysis by documenting an experimental study involving GPT-4. The study focuses on performing thematic analysis (TA) using a YouTube dataset derived from an EU-funded project, which was previously analyzed by other researchers. This dataset is about the representation of Roma migrants in Sweden during 2016, a period marked by the aftermath of the 2015 refugee crisis and preceding the Swedish national elections in 2017. Our study seeks to understand the potential of combining…
Taxonomy
Topics: Hate Speech and Cyberbullying Detection
Methods: Linear Layer · Layer Normalization · Multi-Head Attention · Attention Is All You Need · Position-Wise Feed-Forward Layer · Adam · Byte Pair Encoding · Softmax · Absolute Position Encodings · Dense Connections
