Leveraging Large Language Models and Topic Modeling for Toxicity Classification
Haniyeh Ehsani Oskouie, Christina Chance, Claire Huang, Margaret Capetz, Elizabeth Eyeson, Majid Sarrafzadeh

TL;DR
This paper investigates how fine-tuning large language models with topic modeling can improve toxicity classification, revealing limitations of current models and emphasizing the influence of annotator bias on model performance.
Contribution
It introduces a combined approach of fine-tuning BERTweet and HateBERT with topic modeling to enhance toxicity detection accuracy.
Findings
Fine-tuning models on specific topics improves F1 scores.
State-of-the-art models still struggle with toxicity detection accuracy.
Annotator bias impacts model training and outcomes.
Abstract
Content moderation and toxicity classification represent critical tasks with significant social implications. However, studies have shown that major classification models exhibit tendencies to magnify or reduce biases and potentially overlook or disadvantage certain marginalized groups within their classification processes. Researchers suggest that the positionality of annotators influences the gold standard labels, so that models trained on those labels propagate the annotators' biases. To further investigate the impact of annotator positionality, we fine-tune BERTweet and HateBERT on the dataset while using topic-modeling strategies for content moderation. The results indicate that fine-tuning the models on specific topics yields a notable improvement in F1 score compared to the predictions generated by other prominent classification models such as GPT-4,…
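The abstract reports results as F1 scores for models fine-tuned on specific topics. As a rough illustration of how such a per-topic comparison could be computed, the sketch below groups predictions by topic and calculates binary F1 for each group. It is not the authors' evaluation code: the function names `f1_score` and `f1_by_topic`, the topic names, and the toy labels are all hypothetical, and the topic assignments are assumed to come from a separate topic-modeling step.

```python
from collections import defaultdict


def f1_score(y_true, y_pred, positive=1):
    """Binary F1 for the `positive` class; returns 0.0 when there are no true positives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def f1_by_topic(topics, y_true, y_pred):
    """Group examples by topic label and compute F1 within each group."""
    grouped = defaultdict(lambda: ([], []))
    for topic, t, p in zip(topics, y_true, y_pred):
        grouped[topic][0].append(t)
        grouped[topic][1].append(p)
    return {topic: f1_score(ts, ps) for topic, (ts, ps) in grouped.items()}


# Hypothetical toy data: 1 = toxic, 0 = non-toxic.
# Topic labels are assumed to be produced by an upstream topic model.
topics = ["politics", "politics", "politics", "sports", "sports", "sports"]
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]

scores = f1_by_topic(topics, y_true, y_pred)
for topic, score in sorted(scores.items()):
    print(f"{topic}: F1 = {score:.3f}")
```

Running the same function over the predictions of two different models would give a topic-by-topic comparison of the kind the abstract summarizes.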
Taxonomy
Topics: Computational Drug Discovery Methods · Biomedical Text Mining and Ontologies
Methods: Attention Is All You Need · Dense Connections · Label Smoothing · Dropout · Linear Layer · Layer Normalization · Byte Pair Encoding · Adam · Residual Connection · Softmax
