Toxicity-Aware Few-Shot Prompting for Low-Resource Singlish Translation
Ziyu Ge, Gabriel Chua, Leanne Tan, Roy Ka-Wei Lee

TL;DR
This paper introduces a two-stage, toxicity-aware translation framework for low-resource, code-mixed Singlish, combining human-verified prompt engineering and model benchmarking to improve translation quality and safety.
Contribution
It presents a novel, reproducible pipeline for toxicity-preserving translation in low-resource languages, built on few-shot prompting and model benchmarking.
Findings
The framework translates Singlish effectively while preserving slang and toxicity nuances.
Human evaluation confirms improved translation quality and safety.
Framework supports culturally sensitive moderation in low-resource settings.
Abstract
As online communication increasingly incorporates under-represented languages and colloquial dialects, standard translation systems often fail to preserve local slang, code-mixing, and culturally embedded markers of harmful speech. Translating toxic content between low-resource language pairs poses additional challenges due to scarce parallel data and safety filters that sanitize offensive expressions. In this work, we propose a reproducible, two-stage framework for toxicity-preserving translation, demonstrated on a code-mixed Singlish safety corpus. First, we perform human-verified few-shot prompt engineering: we iteratively curate and rank annotator-selected Singlish-target examples to capture nuanced slang, tone, and toxicity. Second, we optimize model-prompt pairs by benchmarking several large language models using semantic similarity via direct and back-translation. Quantitative…
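The second stage described in the abstract, ranking model-prompt pairs by semantic similarity after back-translation, can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the bag-of-words cosine is a simple stand-in for whatever embedding-based similarity the paper uses, the names `cosine_similarity`, `back_translation_score`, and `rank_candidates` are hypothetical, and the toy "translators" are placeholder functions standing in for real LLM calls. Only the back-translation route is shown; scoring a direct translation would need a cross-lingual similarity model.

```python
import math
from collections import Counter
from typing import Callable, Dict, List, Tuple

# Hypothetical type: any function mapping a source string to a translated string.
Translator = Callable[[str], str]


def cosine_similarity(a: str, b: str) -> float:
    """Bag-of-words cosine similarity (assumed stand-in for an embedding score)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values()) * sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0


def back_translation_score(
    sources: List[str], forward: Translator, backward: Translator
) -> float:
    """Mean similarity between each source and its round-trip translation."""
    scores = [cosine_similarity(s, backward(forward(s))) for s in sources]
    return sum(scores) / len(scores) if scores else 0.0


def rank_candidates(
    sources: List[str], candidates: Dict[str, Tuple[Translator, Translator]]
) -> List[Tuple[str, float]]:
    """Rank (forward, backward) translator pairs, best round-trip score first."""
    scored = [
        (name, back_translation_score(sources, fwd, bwd))
        for name, (fwd, bwd) in candidates.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy demo with placeholder text and placeholder translators (assumptions).
sources = ["alpha beta gamma", "delta epsilon"]
candidates = {
    # Round trip preserves every token.
    "faithful": (str.upper, str.lower),
    # Forward step drops the last token, mimicking a filter that sanitizes content.
    "sanitizing": (lambda s: " ".join(s.split()[:-1]), lambda s: s),
}
ranked = rank_candidates(sources, candidates)
```

In this toy setup the pair that drops tokens scores lower than the pair that preserves them, which is the behaviour a back-translation check is meant to surface when a model sanitizes offensive expressions.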
Taxonomy
Topics: Natural Language Processing Techniques
