Benchmarking the Performance of Pre-trained LLMs across Urdu NLP Tasks

Munief Hassan Tahir; Sana Shams; Layba Fiaz; Farah Adeeba; Sarmad Hussain

arXiv:2405.15453·cs.CL·January 31, 2025

TL;DR

This paper benchmarks 7 large multilingual language models on 17 Urdu NLP tasks in a zero-shot setting. It finds that SOTA models outperform encoder-decoder models in most of these tasks, and that models with richer Urdu data often outperform larger models with less Urdu coverage.

Contribution

It provides a comprehensive evaluation of prominent LLMs on Urdu NLP tasks, highlighting the importance of language-specific data over model size and comparing performance against SOTA models.

Findings

01

SOTA models outperform encoder-decoder models in most Urdu NLP tasks.

02

Llama 3.1-8B surpasses Llama 2-7B-Chat with better language coverage.

03

Smaller models with richer Urdu data often outperform larger models with less Urdu diversity.

Abstract

Large Language Models (LLMs) pre-trained on multilingual data have revolutionized natural language processing research by transitioning from language- and task-specific model pipelines to a single model adapted to a variety of tasks. However, the majority of existing multilingual NLP benchmarks for LLMs provide evaluation data in only a few languages with little linguistic diversity. In addition, these benchmarks lack quality assessment against the respective state-of-the-art models. This study presents an in-depth examination of 7 prominent LLMs: GPT-3.5-turbo, Llama 2-7B-Chat, Llama 3.1-8B, Bloomz 3B, Bloomz 7B1, Ministral-8B and Whisper (large, medium and small variants), across 17 tasks using 22 datasets and 13.8 hours of speech in a zero-shot setting, and compares and analyzes their performance against state-of-the-art (SOTA) models. Our experiments show that SOTA models currently…

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification

Methods: Attention Is All You Need · LLaMA · Adam · Dropout · Dense Connections · Softmax · BLOOMZ · Balanced Selection