# Explainable detection: a transformer-based language modeling approach for Bengali news title classification with comparative explainability analysis using ML and DL

**Authors:** Md. Julkar Naeen, Sourav Kumar Das, Sakib Alam Jisan, Sharun Akter Khushbu, Noyon Chandra Saha, Ohidujjaman

PMC · DOI: 10.3389/frai.2025.1537432 · 2025-11-06

## TL;DR

This paper classifies Bengali news titles with transformer models, compares them against supervised ML baselines and LSTM networks, and uses explainable AI techniques to interpret the predictions in a low-resource language setting.

## Contribution

The study applies transformer-based models to Bengali news title classification, benchmarks them against supervised ML and LSTM baselines, and integrates explainable AI techniques (LIME) to make the predictions more transparent.

## Key findings

- XLM-RoBERTa Base achieved the highest accuracy of 0.91 in classifying Bengali news titles.
- Explainable AI techniques like LIME were used to identify key features influencing classification outcomes.
- Transformer models outperformed traditional ML and LSTM models in Bengali text classification.
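
To make the idea of "key features influencing classification outcomes" concrete, here is a minimal sketch of word-level attribution. It is not LIME itself: LIME samples many perturbed inputs and fits a local surrogate model, whereas this sketch uses plain leave-one-word-out occlusion, a simpler relative. The classifier, its weights (`TOY_WEIGHTS`), the class labels, and the example title are all hypothetical stand-ins, not taken from the paper.

```python
import math

# Hypothetical per-class word weights standing in for a trained classifier.
TOY_WEIGHTS = {
    "sports": {"ক্রিকেট": 2.0, "দল": 1.0},
    "politics": {"নির্বাচন": 2.0, "সরকার": 1.5},
}


def predict_proba(tokens):
    """Return class probabilities: softmax over summed word weights."""
    scores = {
        label: sum(weights.get(tok, 0.0) for tok in tokens)
        for label, weights in TOY_WEIGHTS.items()
    }
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(s - peak) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


def occlusion_importance(title, label):
    """Rank words by how much removing each one lowers P(label)."""
    tokens = title.split()
    base = predict_proba(tokens)[label]
    drops = []
    for i, tok in enumerate(tokens):
        reduced = tokens[:i] + tokens[i + 1:]  # the title without word i
        drops.append((tok, base - predict_proba(reduced)[label]))
    return sorted(drops, key=lambda pair: pair[1], reverse=True)


# Made-up title, roughly "Bangladesh cricket team has won".
title = "বাংলাদেশ ক্রিকেট দল জিতেছে"
ranking = occlusion_importance(title, "sports")
for word, drop in ranking:
    print(f"{word}\t{drop:.3f}")
```

Words whose removal causes the largest probability drop are the ones the toy model relies on most. With a real model, `predict_proba` would wrap the trained classifier instead of the lookup table.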

## Abstract

Classifying scattered Bengali text is the primary focus of this study, with an emphasis on explainability in Natural Language Processing (NLP) for low-resource languages. We employed supervised Machine Learning (ML) models as a baseline and compared their performance with Long Short-Term Memory (LSTM) networks from the deep learning domain. Subsequently, we implemented transformer models designed for sequential learning. To prepare the dataset, we collected recent Bengali news articles online and performed extensive feature engineering. Given the inherent noise in Bengali datasets, significant preprocessing was required. Among the models tested, XLM-RoBERTa Base achieved the highest accuracy, 0.91. Furthermore, we integrated explainable AI techniques to interpret the model’s predictions, enhancing transparency and fostering trust in the classification outcomes. Additionally, we employed LIME (Local Interpretable Model-agnostic Explanations) to identify key features and the most weighted words responsible for classifying news titles, which validated the accuracy of Bengali news classification results. This study underscores the potential of deep learning models in advancing text classification for the Bengali language and emphasizes the critical role of explainability in AI-driven solutions.
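
As an illustration of what a supervised ML baseline and its accuracy score might look like, the sketch below trains a tiny multinomial Naive Bayes classifier on whitespace-tokenized titles. The summary does not say which baseline models the authors used, so the choice of Naive Bayes is an assumption, and the training titles, labels, and function names (`train_nb`, `predict_nb`, `accuracy`) are made up for this example.

```python
import math
from collections import Counter, defaultdict


def train_nb(titles, labels):
    """Count words per class for a multinomial Naive Bayes model."""
    word_counts = defaultdict(Counter)
    class_counts = Counter(labels)
    vocab = set()
    for title, label in zip(titles, labels):
        tokens = title.split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return word_counts, class_counts, vocab


def predict_nb(model, title):
    """Return the class with the highest log-posterior for the title."""
    word_counts, class_counts, vocab = model
    n_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        total = sum(word_counts[label].values())
        score = math.log(n / n_docs)  # class prior
        for tok in title.split():
            if tok in vocab:  # words never seen in training are ignored
                # Laplace (add-one) smoothing avoids zero probabilities.
                count = word_counts[label][tok]
                score += math.log((count + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, titles, labels):
    """Fraction of titles whose predicted class matches the true label."""
    hits = sum(predict_nb(model, t) == y for t, y in zip(titles, labels))
    return hits / len(labels)


# Made-up toy data; a real experiment would use the collected news titles.
train_titles = [
    "ক্রিকেট দল জিতেছে",
    "ফুটবল দল হেরেছে",
    "সরকার নির্বাচন ঘোষণা",
    "সরকার নতুন বাজেট",
]
train_labels = ["sports", "sports", "politics", "politics"]
model = train_nb(train_titles, train_labels)

held_out = ["ক্রিকেট দল", "নির্বাচন বাজেট"]
held_out_labels = ["sports", "politics"]
score = accuracy(model, held_out, held_out_labels)
print(f"accuracy = {score:.2f}")
```

Accuracy here is the same metric as the 0.91 reported for XLM-RoBERTa Base: correct predictions divided by total predictions on held-out titles. The toy score says nothing about the paper's actual results.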

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12631608/full.md

---
Source: https://tomesphere.com/paper/PMC12631608