Unveiling Narrative Reasoning Limits of Large Language Models with Trope in Movie Synopses

Hung-Ting Su; Ya-Ching Hsu; Xudong Lin; Xiang-Qian Shi; Yulei Niu; Han-Yuan Hsu; Hung-yi Lee; Winston H. Hsu

arXiv:2409.14324·cs.CL·September 24, 2024

TL;DR

This paper evaluates the narrative reasoning abilities of large language models using movie synopses and tropes, revealing limitations and proposing methods to improve reasoning accuracy and robustness.

Contribution

It introduces a trope-wise querying approach and an Adversarial Injection method to assess and enhance LLMs' narrative reasoning capabilities.

Findings

01. LLMs perform poorly on narrative reasoning with movie tropes.

02. Chain-of-thought prompting can cause hallucinations in narrative content.

03. Adversarial Injection reveals increased sensitivity of CoT to trope-related text.

Abstract

Large language models (LLMs) equipped with chain-of-thought (CoT) prompting have shown significant multi-step reasoning capabilities on factual content such as mathematics, commonsense, and logic. However, their performance in narrative reasoning, which demands greater abstraction capabilities, remains unexplored. This study uses tropes in movie synopses to assess the abstract reasoning abilities of state-of-the-art LLMs and uncovers their low performance. We introduce a trope-wise querying approach to address these challenges and boost the F1 score by 11.8 points. Moreover, while prior studies suggest that CoT enhances multi-step reasoning, this study shows that CoT can cause hallucinations in narrative content, reducing GPT-4's performance. We also introduce an Adversarial Injection method to embed trope-related text tokens into movie synopses without explicit tropes, revealing CoT's…
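To make the trope-wise querying idea concrete, here is a minimal sketch of one plausible reading: ask about each candidate trope in a separate query, collect the tropes the model affirms, and score the result with F1 against the gold labels. The prompt wording, the function names (`trope_wise_predict`, `f1_score`, `keyword_stub`), the example synopsis, and the trope names are all illustrative assumptions and are not taken from the paper or its repository; the stub stands in for a real LLM call.

```python
from typing import Callable, Iterable, Set


def trope_wise_predict(
    synopsis: str, tropes: Iterable[str], ask: Callable[[str], str]
) -> Set[str]:
    """Query the model once per trope and return the tropes it affirms.

    `ask` is a hypothetical callable that sends a prompt to a model and
    returns its text answer; the prompt template is an assumption.
    """
    predicted = set()
    for trope in tropes:
        prompt = (
            f"Synopsis:\n{synopsis}\n\n"
            f"Does this synopsis contain the trope '{trope}'? Answer yes or no."
        )
        if ask(prompt).strip().lower().startswith("yes"):
            predicted.add(trope)
    return predicted


def f1_score(predicted: Set[str], gold: Set[str]) -> float:
    """Harmonic mean of precision and recall over sets of trope labels."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def keyword_stub(prompt: str) -> str:
    """Stand-in for an LLM: affirms only the 'Big Bad' query (assumption)."""
    return "yes" if "'Big Bad'" in prompt else "no"


# Illustrative data only; not drawn from the paper's dataset.
synopsis = "A tyrant threatens the city until a hero gives his life to stop him."
candidates = ["Big Bad", "Heroic Sacrifice", "Love Triangle"]
gold = {"Big Bad", "Heroic Sacrifice"}

predicted = trope_wise_predict(synopsis, candidates, keyword_stub)
score = f1_score(predicted, gold)
print(sorted(predicted), round(score, 3))
```

With the stub, only one of the two gold tropes is recovered, so precision is 1.0 and recall is 0.5. Swapping `keyword_stub` for a real model client keeps the rest of the sketch unchanged.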

Code & Models

Repositories

shelley1214/trope (Official)

Videos

Unveiling Narrative Reasoning Limits of Large Language Models with Trope in Movie Synopses (Underline)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Computational and Text Analysis Methods