Real-Time Toxicity Filtering for Open-Source Code Reviews

Md Awsaf Alam Anindya; Showvik Biswas; Anindya Iqbal; Jaydeb Sarker; Amiangshu Bosu

arXiv:2604.08886 · cs.SE · April 13, 2026

TL;DR

ToxiShield is a real-time browser extension that detects and detoxifies toxic code reviews in open-source projects, with the aim of improving community collaboration.

Contribution

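It introduces a three-module framework combining toxicity identification, reasoned multiclass classification, and code review detoxification. The fine-tuned binary classifier and detoxification model report strong results, while prompt-based multiclass classification is weaker (39% MCC, 42% F1).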

Findings

01  97% F1-score for toxicity identification

02  95.27% style transfer accuracy for detoxification

03  84% J-score indicating effective detoxification

Abstract

Toxic interactions in open-source software development harm community collaboration. To combat this, we propose ToxiShield, a real-time browser extension that identifies and detoxifies toxic code reviews. The framework comprises three modules: toxicity identification, reasoned multiclass classification, and code review detoxification. Our fine-tuned BERT-based binary classifier achieved a 97% F1-score on 38,761 code review texts. For multiclass classification, Claude 3.5 Sonnet with prompt engineering achieved a 39% MCC and 42% F1 on 1,200 samples. Finally, our fine-tuned Llama 3.2 detoxification model reached 95.27% style transfer accuracy, 97.03% fluency, 67.07% content preservation, and an 84% J-score. Validation with 10 software developers suggests ToxiShield effectively fosters a more inclusive open-source environment.
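
The three-module flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The names `ReviewResult`, `process_review`, and the three stub functions are hypothetical, and the `"insult"` label is a placeholder, not a category from the paper. The stubs stand in for the BERT-based classifier, the prompted multiclass classifier, and the Llama 3.2 detoxifier. The sketch also assumes that classification and detoxification run only on text flagged as toxic, which the abstract does not state.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ReviewResult:
    """Outcome of passing one code review comment through the pipeline."""
    original: str
    is_toxic: bool
    category: Optional[str] = None    # filled only for toxic text
    detoxified: Optional[str] = None  # filled only for toxic text


def process_review(
    text: str,
    identify: Callable[[str], bool],
    classify: Callable[[str], str],
    detoxify: Callable[[str], str],
) -> ReviewResult:
    """Run identification first; classify and rewrite only if toxic (assumed order)."""
    if not identify(text):
        return ReviewResult(original=text, is_toxic=False)
    return ReviewResult(
        original=text,
        is_toxic=True,
        category=classify(text),
        detoxified=detoxify(text),
    )


# Hypothetical stand-ins for the three models, for demonstration only.
def keyword_identify(text: str) -> bool:
    return "stupid" in text.lower()


def stub_classify(text: str) -> str:
    return "insult"  # placeholder label, not taken from the paper


def stub_detoxify(text: str) -> str:
    return re.sub(r"stupid", "unclear", text, flags=re.IGNORECASE)


if __name__ == "__main__":
    for comment in ["This stupid loop is slow.", "Please add a unit test."]:
        print(process_review(comment, keyword_identify, stub_classify, stub_detoxify))
```

Passing the three stages as callables keeps the control flow independent of any particular model, so each stub could be swapped for a real inference call without changing `process_review`.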
