RECAST: Enabling User Recourse and Interpretability of Toxicity Detection Models with Interactive Visualization
Austin P Wright, Omar Shaikh, Haekyu Park, Will Epperson, Muhammed Ahmed, Stephane Pinel, Duen Horng Chau, Diyi Yang

TL;DR
RECAST is an interactive visualization tool that helps users understand and modify toxic language predictions from NLP models, promoting transparency, user recourse, and improved moderation effectiveness.
Contribution
The paper introduces an open-source web tool that visualizes toxicity predictions and offers alternative phrasings, and it evaluates how users interact with the tool, with the aim of improving model transparency and user agency.
Findings
RECAST effectively helps users reduce detected toxicity.
Users gain better understanding of toxicity criteria.
Toxicity detection models become less effective when users tailor their language to them.
Abstract
With the widespread use of toxic language online, platforms are increasingly using automated systems that leverage advances in natural language processing to automatically flag and remove toxic comments. However, most automated systems -- when detecting and moderating toxic language -- do not provide feedback to their users, let alone provide an avenue of recourse for these users to make actionable changes. We present our work, RECAST, an interactive, open-sourced web tool for visualizing these models' toxic predictions, while providing alternative suggestions for flagged toxic language. Our work also provides users with a new path of recourse when using these automated moderation tools. RECAST highlights text responsible for classifying toxicity, and allows users to interactively substitute potentially toxic phrases with neutral alternatives. We examined the effect of RECAST via two…
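The two interactions the abstract describes, highlighting the text responsible for a toxicity score and swapping a flagged phrase for a neutral alternative, can be illustrated with a small sketch. This is not RECAST's implementation: the scorer is a toy word lexicon standing in for a real model, the attribution is plain leave-one-out occlusion, and every name here (`TOY_WEIGHTS`, `TOY_ALTERNATIVES`, `toy_score`, `attribute`, `suggest`) is hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy lexicon (word -> toxicity weight); a stand-in for a real model.
TOY_WEIGHTS: Dict[str, float] = {"stupid": 0.6, "idiot": 0.8}
# Hypothetical neutral alternatives for flagged words.
TOY_ALTERNATIVES: Dict[str, List[str]] = {"stupid": ["unwise"], "idiot": ["person"]}

Scorer = Callable[[List[str]], float]


def toy_score(words: List[str]) -> float:
    """Toy toxicity score in [0, 1]: sum of lexicon weights, capped at 1."""
    return min(1.0, sum(TOY_WEIGHTS.get(w.lower(), 0.0) for w in words))


def attribute(words: List[str], score: Scorer = toy_score) -> List[Tuple[str, float]]:
    """Leave-one-out attribution: the score drop when each word is removed."""
    base = score(words)
    return [
        (w, base - score(words[:i] + words[i + 1:]))
        for i, w in enumerate(words)
    ]


def suggest(words: List[str], score: Scorer = toy_score) -> List[str]:
    """Swap the most responsible word for the alternative that lowers the score most."""
    if not words:
        return []
    attributions = attribute(words, score)
    idx = max(range(len(words)), key=lambda i: attributions[i][1])
    best = list(words)
    for alt in TOY_ALTERNATIVES.get(words[idx].lower(), []):
        candidate = words[:idx] + [alt] + words[idx + 1:]
        if score(candidate) < score(best):
            best = candidate
    return best


sentence = "that was a stupid idea".split()
highlights = attribute(sentence)   # per-word contribution to the score
rewritten = suggest(sentence)      # sentence with the top word substituted
```

In a real tool the scorer would be a trained classifier and the attribution might come from the model itself rather than from occlusion; the sketch only shows the shape of the highlight-then-substitute loop.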
