Far Out: Evaluating Language Models on Slang in Australian and Indian English
Deniz Kaya Dilsiz, Dipankar Srirag, Aditya Joshi

TL;DR
This paper evaluates how well state-of-the-art language models understand slang in Indian and Australian English, revealing performance gaps and differences across datasets and tasks, especially in non-standard language varieties.
Contribution
It introduces two complementary slang datasets covering Indian and Australian English, the web-sourced WEB and the synthetically generated GEN, and systematically assesses seven language models on three tasks across both datasets.
Findings
Models perform better on target word selection than on the two target word prediction tasks.
Models are more accurate on web-sourced data than on synthetic data.
Models are more accurate on Indian English slang than on Australian English slang.
Abstract
Language models exhibit systematic performance gaps when processing text in non-standard language varieties, yet their ability to comprehend variety-specific slang remains underexplored for several languages. We present a comprehensive evaluation of slang awareness in Indian English (en-IN) and Australian English (en-AU) across seven state-of-the-art language models. We construct two complementary datasets: WEB, containing 377 web-sourced usage examples from Urban Dictionary, and GEN, featuring 1,492 synthetically generated usages of these slang terms across diverse scenarios. We assess language models on three tasks: target word prediction (TWP), guided target word prediction (TWP*) and target word selection (TWS). Our results reveal four key findings: (1) Higher average model performance on TWS versus TWP and TWP*, with average accuracy score increasing from 0.03 to 0.49…
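The abstract reports accuracy on word-level tasks but this excerpt does not say how predictions are matched against the target term. As a rough illustration only, the sketch below scores a prediction task by case-insensitive exact match. The `Example` class, the `[MASK]` placeholder, the matching rule, and the two usage sentences are all assumptions made for this example and are not taken from the paper or its datasets.

```python
from dataclasses import dataclass


@dataclass
class Example:
    """Hypothetical record: a usage with the slang term masked out."""
    masked_usage: str  # sentence with the slang term replaced by [MASK]
    target: str        # the slang term that was removed


def normalize(word: str) -> str:
    # Assumed matching rule: ignore case and surrounding whitespace.
    return word.strip().lower()


def accuracy(examples: list[Example], predictions: list[str]) -> float:
    """Fraction of predictions that exactly match the target after normalization."""
    if len(examples) != len(predictions):
        raise ValueError("examples and predictions must have the same length")
    if not examples:
        return 0.0
    correct = sum(
        normalize(pred) == normalize(ex.target)
        for ex, pred in zip(examples, predictions)
    )
    return correct / len(examples)


# Invented illustrative items, not drawn from WEB or GEN.
examples = [
    Example("See you this [MASK] after lunch.", "arvo"),
    Example("We watched it just for [MASK].", "timepass"),
]
score = accuracy(examples, ["Arvo ", "fun"])  # one of two predictions matches
```

A selection-style task could reuse the same `accuracy` function by treating the option a model picks as its prediction.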
Taxonomy
Topics: Natural Language Processing Techniques · Swearing, Euphemism, Multilingualism · Digital Communication and Language
