Evaluating and Improving Robustness in Large Language Models: A Survey and Future Directions

Kun Zhang; Le Wu; Kui Yu; Guangyi Lv; Dacao Zhang

arXiv:2506.11111·cs.CL·July 10, 2025

Open Access

TL;DR

This survey comprehensively reviews the robustness of Large Language Models, covering adversarial and out-of-distribution challenges, evaluation methods, and future research directions to enhance their reliability in diverse applications.

Contribution

The survey provides a formal definition of LLM robustness, organizes existing work by input perturbation type, and highlights future research opportunities in the field.

Findings

01

Organized robustness categories: adversarial, OOD, evaluation

02

Summarized new datasets and metrics for robustness assessment

03

Highlighted future directions for improving LLM reliability

Abstract

Large Language Models (LLMs) have gained enormous attention in recent years due to their capability of understanding and generating natural language. With their rapid development and wide-ranging applications (e.g., Agents, Embodied Intelligence), the robustness of LLMs has received increased attention. As the core brain of many AI applications, LLMs should not only generate consistent content, but also ensure the correctness and stability of generated content when dealing with unexpected application scenarios (e.g., toxic prompts, limited noise domain data, out-of-distribution (OOD) applications, etc.). In this survey paper, we conduct a thorough review of the robustness of LLMs, aiming to provide a comprehensive terminology of concepts and methods around this field and facilitate the community. Specifically, we first give a formal definition of LLM…

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling