ToxiShield: Promoting Inclusive Developer Communication through Real-Time Toxicity Filtering

MD Awsaf Alam Anindya; Showvik Biswas; Anindya Iqbal; Jaydeb Sarker; Amiangshu Bosu

arXiv:2604.14408·cs.SE·April 17, 2026

TL;DR

ToxiShield is a real-time browser extension for GitHub that detects, explains, and reframes toxic comments in code reviews to promote inclusive and constructive developer communication.

Contribution

It introduces a multi-module system combining deep learning and large language models for toxicity detection, explanation, and rewriting in software engineering collaboration.

Findings

01

The BERT-based toxicity detector achieved 98% accuracy and 97% F1-score.

02

Claude 3.5 Sonnet classifies toxicity with a 42% F1-score.

03

Llama 3.2 generated 95.27% accurate, fluent, and content-preserving reframed comments.

Abstract

Toxic interactions during code reviews can undermine teamwork and hinder productivity in software engineering (SE) teams. While prior studies have explored toxicity detection and conducted empirical investigations, they do not provide real-time detoxification tools to support the SE community. To address this gap, we present ToxiShield, a browser extension for GitHub pull requests built from three modules: i) the Toxicity Filter, which identifies whether a text is toxic; ii) the Communication Coach, which provides just-in-time, fine-grained toxicity categorization with explanations; and iii) the Reframer, which generates a revised, constructive alternative to a toxic text. For each module, we trained and evaluated multiple deep learning models and Large Language Models (LLMs) to identify the best choice. A BERT-based binary detection model, trained on 38,761 code review samples, achieves 98% accuracy and an F1-score of…
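The three-module design described in the abstract can be pictured as a short pipeline. The following is a minimal sketch, not the authors' implementation: the function and class names, the toy keyword rule, the "insult" category label, and the choice to skip the later modules for non-toxic text are all assumptions made for illustration. In the real system each stage would be backed by a trained model rather than the stand-in functions shown here.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class ReviewResult:
    """Outcome of running one comment through the hypothetical pipeline."""
    toxic: bool
    category: Optional[str] = None
    explanation: Optional[str] = None
    suggestion: Optional[str] = None


def review_comment(
    text: str,
    is_toxic: Callable[[str], bool],
    coach: Callable[[str], Tuple[str, str]],
    reframe: Callable[[str], str],
) -> ReviewResult:
    """Chain the three stages: filter, then coach, then reframer."""
    # Stage 1 (Toxicity Filter): binary decision.
    # Assumption: non-toxic comments bypass the remaining stages.
    if not is_toxic(text):
        return ReviewResult(toxic=False)
    # Stage 2 (Communication Coach): category plus explanation.
    category, explanation = coach(text)
    # Stage 3 (Reframer): constructive alternative wording.
    suggestion = reframe(text)
    return ReviewResult(True, category, explanation, suggestion)


# Toy stand-ins for the models (hypothetical, for illustration only).
def toy_filter(text: str) -> bool:
    return "stupid" in text.lower()


def toy_coach(text: str) -> Tuple[str, str]:
    return ("insult", "The comment attacks the author rather than the code.")


def toy_reframe(text: str) -> str:
    return "Could you explain the reasoning behind this change?"


if __name__ == "__main__":
    print(review_comment("This is a stupid change.", toy_filter, toy_coach, toy_reframe))
    print(review_comment("Looks good to me.", toy_filter, toy_coach, toy_reframe))
```

Passing the stages in as callables keeps the sketch independent of any particular model, which mirrors the abstract's point that several candidate models were compared for each module.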
