Genshin: General Shield for Natural Language Processing with Large Language Models

Xiao Peng; Tao Liu; Ying Wang

arXiv:2405.18741·cs.CL·June 4, 2024

Open Access

TL;DR

Genshin is a novel framework that leverages large language models as a one-time plug-in to recover original text and enhance interpretability and robustness in NLP tasks like sentiment analysis and spam detection.

Contribution

The paper introduces Genshin, a cascading framework that uses LLMs for text recovery, improving interpretability and robustness against adversarial attacks in NLP applications.
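The cascade described above (recover the text first, then classify the recovered text) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the paper's implementation: `recover_text` is a hypothetical stand-in for the LLM recovery step (here a toy character-substitution map, so the example runs without any model), and `classify` is a toy keyword classifier standing in for the downstream model.

```python
from typing import Callable, Tuple

# Hypothetical stand-in for the LLM recovery step: undo a few common
# character substitutions. A real system would call an LLM here instead.
_SUBSTITUTIONS = str.maketrans(
    {"0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "@": "a", "$": "s"}
)


def recover_text(text: str) -> str:
    """Toy recovery: map look-alike characters back to letters."""
    return text.translate(_SUBSTITUTIONS)


# Hypothetical downstream classifier: flags a message that contains
# any word from a small keyword list.
_SPAM_WORDS = {"free", "prize", "winner"}


def classify(text: str) -> str:
    """Toy keyword classifier returning 'spam' or 'ham'."""
    tokens = {token.strip(".,!?").lower() for token in text.split()}
    return "spam" if tokens & _SPAM_WORDS else "ham"


def cascade(
    text: str,
    recover: Callable[[str], str],
    classifier: Callable[[str], str],
) -> Tuple[str, str]:
    """Run recovery first, then classify the recovered text."""
    recovered = recover(text)
    return recovered, classifier(recovered)


if __name__ == "__main__":
    perturbed = "Claim your fr33 pr1ze now!"
    print(classify(perturbed))                          # classifier alone
    print(cascade(perturbed, recover_text, classify))   # recovery, then classifier
```

In this toy setting the perturbed message slips past the keyword classifier on its own but is flagged once the recovery step runs first, which is the kind of behavior the cascade is meant to provide. Because the recovered text is returned alongside the label, a reader can inspect what the classifier actually saw.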

Findings

01

Genshin effectively recovers original text with high accuracy.

02

It exposes vulnerabilities of median models to adversarial attacks.

03

It demonstrates improved robustness and interpretability in NLP tasks.

Abstract

Large language models (LLMs) like ChatGPT, Gemini, or LLaMA have been trending recently, demonstrating considerable advancement and generalizability in countless domains. However, LLMs create an even bigger black box, exacerbating opacity, with interpretability limited to a few approaches. The uncertainty and opacity embedded in LLMs' nature restrict their application in high-stakes domains like financial fraud, phishing, etc. Current approaches mainly rely on traditional textual classification with posterior interpretable algorithms, which are vulnerable to attackers who may create versatile adversarial samples to break the system's defense, forcing users to make trade-offs between efficiency and robustness. To address this issue, we propose a novel cascading framework called Genshin (General Shield for Natural Language Processing with Large Language Models), utilizing LLMs as defensive…

Taxonomy

Topics: Natural Language Processing Techniques

Methods: LLaMA