'Beach' to 'Bitch': Inadvertent Unsafe Transcription of Kids' Content on YouTube
Krithika Ramesh, Ashiqur R. KhudaBukhsh, Sumeet Kumar

TL;DR
This paper uncovers that popular speech recognition systems can inadvertently generate highly inappropriate text for children when transcribing YouTube Kids videos, raising safety concerns and highlighting the need for improved safeguards.
Contribution
It introduces the concept of 'inappropriate content hallucination' in ASR systems, provides a dataset of such errors, and explores potential fixes using language models.
Findings
ASR systems often produce inappropriate content with high confidence.
The authors release a new dataset of audio recordings for which ASR systems hallucinated inappropriate content.
Language models can mitigate some of the hallucination errors.
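To make the idea of an inappropriate content hallucination concrete, here is a minimal Python sketch of one way such an error could be flagged when a trusted reference transcript is available. It is an illustration under stated assumptions, not the authors' pipeline: the word list `FLAGGED_TERMS`, the helper names, and the example sentences are all hypothetical, with the single flagged term taken from the paper's title.

```python
import re

# Hypothetical placeholder lexicon; NOT the word list used in the paper.
FLAGGED_TERMS = {"bitch"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def hallucinated_flagged_terms(reference, asr_output, flagged=FLAGGED_TERMS):
    """Return flagged terms present in the ASR output but absent from the reference.

    A term that is also in the reference was actually spoken, so it is not
    counted as a hallucination by this simple check.
    """
    reference_tokens = set(tokenize(reference))
    return [
        token
        for token in tokenize(asr_output)
        if token in flagged and token not in reference_tokens
    ]


# Hypothetical example mirroring the title's 'beach' -> 'bitch' error.
reference = "we are going to the beach today"
asr_output = "we are going to the bitch today"
print(hallucinated_flagged_terms(reference, asr_output))
```

A set-membership check like this ignores word position and context, so it would serve only as a first-pass filter; it does not capture the ASR confidence scores or the language-model mitigation that the findings mention.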
Abstract
Over the last few years, YouTube Kids has emerged as one of the highly competitive alternatives to television for children's entertainment. Consequently, YouTube Kids' content should receive an additional level of scrutiny to ensure children's safety. While research on detecting offensive or inappropriate content for kids is gaining momentum, little or no current work exists that investigates to what extent AI applications can (accidentally) introduce content that is inappropriate for kids. In this paper, we present a novel (and troubling) finding that well-known automatic speech recognition (ASR) systems may produce text content highly inappropriate for kids while transcribing YouTube Kids' videos. We dub this phenomenon 'inappropriate content hallucination'. Our analyses suggest that such hallucinations are far from occasional, and the ASR systems often produce them with…
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Adversarial Robustness in Machine Learning · Music and Audio Processing
