Sensitivity and Robustness of Large Language Models to Prompt Template in Japanese Text Classification Tasks

Chengguang Gan; Tatsunori Mori

arXiv:2305.08714·cs.CL·June 9, 2023·5 cites

Open Access

TL;DR

This paper evaluates the sensitivity and robustness of large language models, especially GPT-4, to prompt template variations in Japanese text classification, revealing significant stability issues and performance drops.

Contribution

It provides a comprehensive analysis of how prompt template modifications affect LLM performance in Japanese, highlighting stability issues in current models like GPT-4.

Findings

01. GPT-4 accuracy drops from 49.21% to 25.44% with prompt variation

02. Large language models show significant sensitivity to prompt structure in Japanese

03. Current models face stability challenges in multilingual prompt tasks

Abstract

Research on prompt engineering has surged in recent years, driven primarily by advances in pre-trained language models and large language models. However, a critical issue has been identified within this domain: the sensitivity and inadequate robustness of these models with respect to prompt templates, particularly in lesser-studied languages such as Japanese. This paper explores this issue through a comprehensive evaluation of several representative Large Language Models (LLMs) and a widely used pre-trained model (PLM). These models are evaluated on a benchmark dataset in Japanese, with the aim of assessing and analyzing the performance of current multilingual models in this context. Our experimental results reveal startling discrepancies. A simple modification in the sentence structure of the prompt template led to a drastic drop in the accuracy of GPT-4…
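As an illustration of the kind of evaluation the abstract describes, the following is a minimal sketch of measuring how accuracy changes when only the prompt template varies. Everything in it is an assumption made for illustration: the template strings, the two toy examples, the helper names (`accuracy_per_template`, `sensitivity_gap`), and the `toy_classifier` stand-in are hypothetical and are not the paper's templates, data, or models.

```python
from typing import Callable, Dict, List, Tuple


def accuracy_per_template(
    templates: Dict[str, str],
    examples: List[Tuple[str, str]],
    classify: Callable[[str], str],
) -> Dict[str, float]:
    """Return accuracy (in %) of `classify` for each template on the same examples."""
    results: Dict[str, float] = {}
    for name, template in templates.items():
        correct = sum(
            classify(template.format(text=text)) == label
            for text, label in examples
        )
        results[name] = 100.0 * correct / len(examples)
    return results


def sensitivity_gap(acc: Dict[str, float]) -> float:
    """Largest accuracy difference, in percentage points, between any two templates."""
    return max(acc.values()) - min(acc.values())


# Hypothetical templates: same wording, different sentence structure.
TEMPLATES = {
    "text_first": "{text} この文の感情は?",
    "question_first": "この文の感情は? {text}",
}

# Toy labeled examples (not from any benchmark).
EXAMPLES = [
    ("映画は最高だった", "positive"),
    ("サービスがひどい", "negative"),
]


def toy_classifier(prompt: str) -> str:
    """Stand-in for a model call; deliberately sensitive to template order."""
    if prompt.startswith("この文"):
        # Mimics a model that ignores the input when the question comes first.
        return "positive"
    return "negative" if "ひどい" in prompt else "positive"


if __name__ == "__main__":
    acc = accuracy_per_template(TEMPLATES, EXAMPLES, toy_classifier)
    print(acc, sensitivity_gap(acc))
```

In practice `toy_classifier` would be replaced by a call to the model under test. By the same gap measure, the two GPT-4 accuracies listed under Findings (49.21% and 25.44%) differ by about 23.77 percentage points.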

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification

Methods: Multi-Head Attention · Attention Is All You Need · Position-Wise Feed-Forward Layer · Adam · Absolute Position Encodings · Adafactor · Softmax · Layer Normalization · Inverse Square Root Schedule · Byte Pair Encoding