Assessing Large Language Models for Online Extremism Research: Identification, Explanation, and New Knowledge

Beidi Dong; Jin R. Lee; Ziwei Zhu; Balassubramanian Srinivasan

arXiv:2408.16749·cs.CL·August 30, 2024·3 cites

Open Access

TL;DR

This study evaluates BERT and GPT models for detecting online extremism and finds that GPT models outperform BERT, with prompt design significantly affecting classification accuracy across extremist categories.

Contribution

The paper presents a comparative analysis of BERT and GPT models for extremism detection, highlighting GPT's superior zero-shot performance and the impact of prompt engineering.

Findings

01. GPT models outperform BERT in extremism classification

02. Prompt complexity influences GPT model performance

03. GPT 3.5 performs better on far-left extremism, GPT 4 on far-right extremism

Abstract

The United States has experienced a significant increase in violent extremism, prompting the need for automated tools to detect and limit the spread of extremist ideology online. This study evaluates the performance of Bidirectional Encoder Representations from Transformers (BERT) and Generative Pre-Trained Transformers (GPT) in detecting and classifying online domestic extremist posts. We collected social media posts containing "far-right" and "far-left" ideological keywords and manually labeled them as extremist or non-extremist. Extremist posts were further classified into one or more of five contributing elements of extremism based on a working definitional framework. The BERT model's performance was evaluated based on training data size and knowledge transfer between categories. We also compared the performance of GPT 3.5 and GPT 4 models using different prompts: naïve,…
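
To make the zero-shot setup described in the abstract more concrete, the following is a minimal sketch of how a simple binary prompt might be built and how a model's free-text reply might be mapped back to a label and scored against manual annotations. The prompt wording, the label strings, and the helper names (`build_naive_prompt`, `parse_label`, `accuracy`) are all illustrative assumptions and are not taken from the paper; no model API is called here, and the paper's actual prompts and metrics may differ.

```python
from typing import Optional, Sequence

# Hypothetical label strings; the paper's exact label wording is not given here.
LABELS = ("extremist", "non-extremist")


def build_naive_prompt(post: str) -> str:
    """Build a minimal zero-shot prompt asking for a single label (assumed wording)."""
    return (
        "Classify the following social media post as 'extremist' or "
        "'non-extremist'. Answer with the label only.\n\n"
        f"Post: {post.strip()}"
    )


def parse_label(response: str) -> Optional[str]:
    """Map a free-text model reply to one of LABELS, or None if it is unclear."""
    text = response.strip().lower().rstrip(".")
    # Check the longer label first, because 'extremist' is a substring of it.
    if text.startswith("non-extremist") or text.startswith("non extremist"):
        return "non-extremist"
    if text.startswith("extremist"):
        return "extremist"
    return None


def accuracy(predictions: Sequence[Optional[str]], gold: Sequence[str]) -> float:
    """Share of posts whose parsed label matches the manual label.

    Replies that could not be parsed (None) count as incorrect.
    """
    if len(predictions) != len(gold) or not gold:
        raise ValueError("predictions and gold must be non-empty and equal length")
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)
```

In use, each post would be passed through `build_naive_prompt`, sent to whichever model is being evaluated, and the reply run through `parse_label` before scoring. Treating unparseable replies as errors is one possible design choice; excluding them from the denominator would be another.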

Taxonomy

Topics: Terrorism, Counterterrorism, and Political Violence · Hate Speech and Cyberbullying Detection

Methods: Cosine Annealing · Linear Layer · Adam · Layer Normalization · Weight Decay · Attention Is All You Need · Dense Connections · WordPiece · Residual Connection