KGPA: Robustness Evaluation for Large Language Models via Cross-Domain Knowledge Graphs

Aihua Pei (1), Zehua Yang (1), Shunan Zhu (1), Ruoxi Cheng (2), Ju Jia (2), Lina Wang (3) ((1) Waseda University, (2) Southeast University, (3) Wuhan University)

arXiv:2406.10802·cs.CL·June 18, 2024

TL;DR

This paper introduces a novel framework that uses knowledge graphs to systematically evaluate the adversarial robustness of large language models across various professional domains, addressing limitations of existing benchmark-dependent methods.

Contribution

It presents a new approach leveraging knowledge graphs to generate adversarial prompts, enabling more comprehensive robustness evaluation of LLMs in diverse domains.

Findings

01. Adversarial robustness differs across the ChatGPT family, ranking GPT-4-turbo > GPT-4o > GPT-3.5-turbo.

02. Robustness varies with the professional domain in which the model operates.

03. The framework effectively identifies vulnerabilities of LLMs under adversarial attacks.

Abstract

Existing frameworks for assessing robustness of large language models (LLMs) overly depend on specific benchmarks, increasing costs and failing to evaluate performance of LLMs in professional domains due to dataset limitations. This paper proposes a framework that systematically evaluates the robustness of LLMs under adversarial attack scenarios by leveraging knowledge graphs (KGs). Our framework generates original prompts from the triplets of knowledge graphs and creates adversarial prompts by poisoning, assessing the robustness of LLMs through the results of these adversarial attacks. We systematically evaluate the effectiveness of this framework and its modules. Experiments show that adversarial robustness of the ChatGPT family ranks as GPT-4-turbo > GPT-4o > GPT-3.5-turbo, and the robustness of large language models is influenced by the professional domains in which they operate.
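
The pipeline described in the abstract (triplet to original prompt, original prompt to poisoned prompt, then scoring) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper or its repository: the `Triplet` type, the statement template, the character-swap perturbation standing in for the poisoning step, the metric names, and the toy `brittle_model`, which only recognizes statements it has seen verbatim. The medical triplets are made-up examples.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Triplet:
    """A knowledge-graph edge (hypothetical representation)."""
    subject: str
    predicate: str
    obj: str


def to_statement(t: Triplet) -> str:
    """Render a triplet as a plain statement (assumed template)."""
    return f"{t.subject} {t.predicate} {t.obj}."


def swap_adjacent_chars(text: str, rng: random.Random) -> str:
    """Toy stand-in for poisoning: swap one pair of differing adjacent letters."""
    positions = [
        i for i in range(len(text) - 1)
        if text[i].isalpha() and text[i + 1].isalpha() and text[i] != text[i + 1]
    ]
    if not positions:
        return text  # nothing to perturb
    i = rng.choice(positions)
    return text[:i] + text[i + 1] + text[i] + text[i + 2:]


def evaluate(
    model: Callable[[str], bool],
    examples: List[Tuple[Triplet, bool]],
    rng: random.Random,
) -> Dict[str, float]:
    """Compare the model's verdicts on original and perturbed statements."""
    natural = adversarial = flipped = 0
    for triplet, label in examples:
        original = to_statement(triplet)
        attacked = swap_adjacent_chars(original, rng)
        ok_before = model(original) == label
        ok_after = model(attacked) == label
        natural += ok_before
        adversarial += ok_after
        flipped += ok_before and not ok_after
    n = len(examples)
    return {
        "natural_accuracy": natural / n,
        "adversarial_accuracy": adversarial / n,
        # Share of originally correct answers that the perturbation broke.
        "attack_success_rate": flipped / natural if natural else 0.0,
    }


# Made-up triplets: two true facts and two corrupted (false) variants.
TRUE_FACTS = [
    Triplet("Aspirin", "treats", "headache"),
    Triplet("Insulin", "regulates", "glucose"),
]
FALSE_FACTS = [
    Triplet("Aspirin", "treats", "fracture"),
    Triplet("Insulin", "regulates", "vision"),
]
KNOWN = {to_statement(t) for t in TRUE_FACTS}


def brittle_model(statement: str) -> bool:
    """Toy model: says 'true' only for statements it has seen verbatim."""
    return statement in KNOWN


examples = [(t, True) for t in TRUE_FACTS] + [(t, False) for t in FALSE_FACTS]
report = evaluate(brittle_model, examples, random.Random(0))
print(report)
```

In this toy setup the exact-match model answers every original statement correctly but loses every true statement once a single pair of letters is swapped, so the adversarial accuracy drops while the false statements remain correctly rejected. A real evaluation would replace `brittle_model` with a call to the LLM under test and the character swap with the framework's own poisoning strategy.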

Code & Models

Repositories

aika-wsd/KGPA (Official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling

Methods: Cosine Annealing · Residual Connection · Softmax · Layer Normalization · Byte Pair Encoding · Adam · Attention Dropout · Weight Decay