Towards Transparent AI: A Survey on Explainable Large Language Models

Avash Palikhe; Zhenyu Yu; Zichong Wang; Wenbin Zhang

arXiv:2506.21812·cs.CL·June 30, 2025

Open Access

TL;DR

This survey reviews explainability techniques for large language models, categorizing methods based on transformer architectures, evaluating their effectiveness, and discussing future challenges to enhance transparency and responsible AI deployment.

Contribution

It provides a systematic categorization and evaluation of XAI methods for LLMs, addressing a gap in understanding explainability techniques across different transformer architectures.

Findings

01 Categorizes XAI methods by the underlying transformer architecture: encoder-only, decoder-only, and encoder-decoder.

02 Analyzes evaluation metrics for explainability effectiveness.

03 Discusses practical applications and future research challenges.

Abstract

Large Language Models (LLMs) have played a pivotal role in advancing Artificial Intelligence (AI). However, despite their achievements, LLMs often struggle to explain their decision-making processes, making them a 'black box' and presenting a substantial challenge to explainability. This lack of transparency poses a significant obstacle to the adoption of LLMs in high-stakes domain applications, where interpretability is particularly essential. To overcome these limitations, researchers have developed various explainable artificial intelligence (XAI) methods that provide human-interpretable explanations for LLMs. However, a systematic understanding of these methods remains limited. To address this gap, this survey provides a comprehensive review of explainability techniques by categorizing XAI methods based on the underlying transformer architectures of LLMs: encoder-only, decoder-only,…

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Multimodal Machine Learning Applications · Artificial Intelligence in Healthcare and Education