Benchmarking the Performance of Pre-trained LLMs across Urdu NLP Tasks
Munief Hassan Tahir, Sana Shams, Layba Fiaz, Farah Adeeba, Sarmad Hussain

TL;DR
This paper benchmarks 7 large multilingual language models on 17 Urdu NLP tasks in a zero-shot setting. It finds that models with richer language-specific data often outperform larger models with less Urdu coverage, and that SOTA models outperform encoder-decoder models on most tasks.
Contribution
It provides a comprehensive evaluation of prominent LLMs on Urdu NLP tasks, highlighting the importance of language-specific data over model size and comparing performance against SOTA models.
Findings
SOTA models outperform encoder-decoder models in most Urdu NLP tasks.
Llama 3.1-8B, which has better language coverage, surpasses Llama 2-7B-Chat.
Smaller models with richer Urdu data often outperform larger models with less Urdu diversity.
Abstract
Large Language Models (LLMs) pre-trained on multilingual data have revolutionized natural language processing research by shifting the field from language- and task-specific model pipelines to a single model adapted to a variety of tasks. However, the majority of existing multilingual NLP benchmarks for LLMs provide evaluation data in only a few languages with little linguistic diversity. In addition, these benchmarks lack quality assessment against the respective state-of-the-art models. This study presents an in-depth examination of 7 prominent LLMs (GPT-3.5-turbo, Llama 2-7B-Chat, Llama 3.1-8B, Bloomz 3B, Bloomz 7B1, Ministral-8B, and Whisper in its large, medium, and small variants) across 17 tasks using 22 datasets and 13.8 hours of speech in a zero-shot setting, and compares and analyzes their performance against state-of-the-art (SOTA) models. Our experiments show that SOTA models currently…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
Methods: Attention Is All You Need · LLaMA · Adam · Dropout · Dense Connections · Softmax · BLOOMZ · Balanced Selection
