Analyzing Syntactic Generalization Capacity of Pre-trained Language Models on Japanese Honorific Conversion

Ryo Sekizawa; Hitomi Yanaka

arXiv:2306.03055 · cs.CL · June 6, 2023 · 1 cites

TL;DR

This paper investigates whether GPT-3, a large pre-trained language model, can accurately perform Japanese honorific conversion that takes social context into account, and it reveals both strengths and limitations in the model's syntactic and contextual understanding.

Contribution

The study introduces a novel honorific conversion dataset and evaluates GPT-3's ability to handle social and syntactic nuances in Japanese honorifics through fine-tuning and prompt learning.

Findings

01

Fine-tuned GPT-3 outperforms prompt-based GPT-3 on honorific conversion.

02

GPT-3 shows strong syntactic generalization for compound sentences.

03

Performance drops with direct speech data.

Abstract

Using Japanese honorifics is challenging because it requires not only knowledge of the grammatical rules but also contextual information, such as social relationships. It remains unclear whether pre-trained large language models (LLMs) can flexibly handle Japanese honorifics like humans. To analyze this, we introduce an honorific conversion task that considers social relationships among people mentioned in a conversation. We construct a Japanese honorifics dataset from problem templates of various sentence structures to investigate the syntactic generalization capacity of GPT-3, one of the leading LLMs, on this task under two settings: fine-tuning and prompt learning. Our results showed that the fine-tuned GPT-3 performed better in a context-aware honorific conversion task than the prompt-based one. The fine-tuned model demonstrated overall syntactic generalizability towards compound…

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Cosine Annealing · Adam · Residual Connection · Dropout · Attention Dropout