Irony Detection in Urdu Text: A Comparative Study Using Machine Learning Models and Large Language Models

Fiaz Ahmad; Nisar Hussain; Amna Qasim; Momina Hafeez; Muhammad Usman; Grigori Sidorov; Alexander Gelbukh

arXiv:2510.22356·cs.CL·October 28, 2025

TL;DR

This study compares traditional machine learning models with fine-tuned transformer-based language models for irony detection in Urdu. The fine-tuned transformers perform best, with LLaMA 3 (8B) reaching an F1-score of 94.61% in this low-resource setting.

Contribution

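The paper builds an Urdu irony dataset by translating an English irony corpus into Urdu. It then evaluates ten machine learning algorithms with GloVe and Word2Vec embeddings alongside fine-tuned BERT, RoBERTa, LLaMA 2 (7B), LLaMA 3 (8B), and Mistral, highlighting the effectiveness of large-scale transformers in low-resource NLP tasks.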

Findings

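01 Gradient Boosting was the best of the machine learning models, with an F1-score of 89.18%.

02 LLaMA 3 (8B) was the best of the transformer-based models, with an F1-score of 94.61%.

03 Transformers outperform classical models in Urdu irony detection (94.61% vs. 89.18% F1 for the best model in each family).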
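
Since the findings are reported as F1-scores, a minimal sketch of how that metric is computed may help when reading them. It assumes a binary setup (1 = ironic, 0 = not ironic) and F1 on the ironic class; the summary does not say whether the paper reports binary, macro, or weighted F1, and the labels below are toy values, not data from the paper.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1 for the `positive` class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    if tp == 0:
        # No correct positive predictions: precision and recall are 0 or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels (assumption for illustration only): 1 = ironic, 0 = not ironic.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
score = f1_score(y_true, y_pred)
print(f"F1 = {score:.2%}")
```

Because F1 balances precision and recall, it is less flattering than plain accuracy when one class is rarer than the other, which is often the case for ironic text.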

Abstract

Irony identification is a challenging task in Natural Language Processing, particularly when dealing with languages that differ in syntax and cultural context. In this work, we aim to detect irony in Urdu by translating an English Ironic Corpus into the Urdu language. We evaluate ten state-of-the-art machine learning algorithms using GloVe and Word2Vec embeddings, and compare their performance with classical methods. Additionally, we fine-tune advanced transformer-based models, including BERT, RoBERTa, LLaMA 2 (7B), LLaMA 3 (8B), and Mistral, to assess the effectiveness of large-scale models in irony detection. Among machine learning models, Gradient Boosting achieved the best performance with an F1-score of 89.18%. Among transformer-based models, LLaMA 3 (8B) achieved the highest performance with an F1-score of 94.61%. These results demonstrate that combining transliteration…
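
The abstract mentions feeding GloVe and Word2Vec embeddings to machine learning classifiers but does not say how word vectors are turned into one feature vector per sentence. One common option is mean pooling, sketched below under that assumption. The function name `sentence_vector`, the three-dimensional toy vectors, and the example tokens are all hypothetical and are not taken from the paper.

```python
def sentence_vector(tokens, embeddings, dim):
    """Average the vectors of known tokens; return a zero vector if none are known."""
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    if not vecs:
        return [0.0] * dim
    # zip(*vecs) groups the values of each dimension across all token vectors.
    return [sum(column) / len(vecs) for column in zip(*vecs)]


# Toy 3-dimensional "embeddings" (made-up numbers, not real GloVe/Word2Vec values).
toy_embeddings = {
    "واہ": [0.9, 0.1, 0.0],
    "کیا": [0.2, 0.4, 0.4],
    "بات": [0.1, 0.5, 0.3],
}

# The last token is deliberately missing from the vocabulary and is skipped.
tokens = ["واہ", "کیا", "بات", "ہے"]
features = sentence_vector(tokens, toy_embeddings, dim=3)
print(features)
```

The resulting fixed-length vector is the kind of input a classifier such as Gradient Boosting could be trained on; mean pooling discards word order, which is one plausible reason classical models trail fine-tuned transformers on a context-dependent task like irony.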
