Evaluating Dialect Robustness of Language Models via Conversation Understanding

Dipankar Srirag; Nihar Ranjan Sahoo; Aditya Joshi

arXiv:2405.05688·cs.CL·December 13, 2024

TL;DR

This paper assesses how well large language models handle different English dialects, revealing biases towards US English and demonstrating that fine-tuning with dialectal data can improve dialect understanding.

Contribution

Introduces a novel evaluation framework for dialect robustness using conversation datasets and extends MD3 to create M-MD3 for dialect-specific testing.

Findings

01

LLMs perform better on US English than Indian English.

02

GPT models outperform smaller models, but fine-tuning improves smaller models' dialect understanding.

03

Fine-tuning with dialectal data enhances LLMs' dialect comprehension.

Abstract

With an ever-growing number of LLMs reporting superlative performance for English, their ability to perform equitably for different dialects of English (i.e., dialect robustness) needs to be ascertained. Specifically, we use English language (US English or Indian English) conversations between humans who play the word-guessing game of 'taboo'. We formulate two evaluative tasks: target word prediction (TWP) (i.e., predict the masked target word in a conversation) and target word selection (TWS) (i.e., select the most likely masked target word in a conversation, from among a set of candidate words). Extending MD3, an existing dialectal dataset of taboo-playing conversations, we introduce M-MD3, a target-word-masked version of MD3 with the en-US and en-IN subsets. We create two subsets: en-MV (where en-US is transformed to include dialectal information) and…
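The two tasks described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the `<MASK>` token, the helper names (`mask_target`, `twp_correct`, `tws_correct`), the exact-match scoring rule, and the made-up three-turn conversation. No model is called; the candidate scores in TWS stand in for whatever a language model would produce.

```python
import re

MASK = "<MASK>"  # hypothetical mask token; the dataset's actual token may differ


def mask_target(turns, target):
    """Replace whole-word, case-insensitive occurrences of `target` in each turn."""
    pattern = re.compile(rf"\b{re.escape(target)}\b", flags=re.IGNORECASE)
    return [pattern.sub(MASK, turn) for turn in turns]


def twp_correct(prediction, target):
    """TWP (assumed scoring): a free-form prediction is correct on a
    whitespace- and case-normalized exact match with the target word."""
    return prediction.strip().lower() == target.strip().lower()


def tws_correct(candidate_scores, target):
    """TWS (assumed scoring): `candidate_scores` maps each candidate word to a
    model score; the highest-scoring candidate is taken as the selection."""
    selected = max(candidate_scores, key=candidate_scores.get)
    return selected.lower() == target.lower()


# A made-up taboo-style exchange, not taken from MD3.
turns = [
    "Describer: It is a yellow fruit that monkeys like.",
    "Guesser: Is it a banana?",
    "Describer: Yes, banana!",
]
masked = mask_target(turns, "banana")
for line in masked:
    print(line)

print(twp_correct(" Banana ", "banana"))
print(tws_correct({"apple": 0.1, "banana": 0.7, "mango": 0.2}, "banana"))
```

The difference between the two tasks shows up in the inputs: TWP compares an open-ended prediction against the target, while TWS only has to rank a closed candidate set, which makes it the easier of the two under this sketch's scoring.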

Code & Models

Repositories

dipankarsrirag/eval-dialect-robust
PyTorch · Official

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification