ChatGPT Evaluation on Sentence Level Relations: A Focus on Temporal, Causal, and Discourse Relations
Chunkit Chan, Jiayang Cheng, Weiqi Wang, Yuxin Jiang, Tianqing Fang, Xin Liu, Yangqiu Song

TL;DR
This study systematically evaluates ChatGPT's ability to understand and classify inter-sentential relations, revealing strengths in causal and discourse relations but challenges with temporal and implicit discourse understanding.
Contribution
It provides a comprehensive, multi-dataset evaluation of ChatGPT's performance on sentence relations using various prompt templates, establishing baseline scores for relation classification tasks.
Findings
ChatGPT excels at causal relation detection.
It struggles with temporal order identification.
It has difficulty with implicit discourse relations and dialogue parsing.
Abstract
This paper aims to quantitatively evaluate the performance of ChatGPT, an interactive large language model, on inter-sentential relations such as temporal relations, causal relations, and discourse relations. Given ChatGPT's promising performance across various tasks, we proceed to carry out thorough evaluations on the whole test sets of 11 datasets, including temporal and causal relations, PDTB2.0-based, and dialogue-based discourse relations. To ensure the reliability of our findings, we employ three tailored prompt templates for each task, including the zero-shot prompt template, zero-shot prompt engineering (PE) template, and in-context learning (ICL) prompt template, to establish the initial baseline scores for all popular sentence-pair relation classification tasks for the first time. Through our study, we discover that ChatGPT exhibits exceptional proficiency in detecting and…
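The three prompt styles named in the abstract (zero-shot, zero-shot PE, and ICL) can be sketched roughly as follows. This is a minimal illustration, not the paper's actual templates: the wording of each prompt, the label set, and the function names zero_shot_prompt, zero_shot_pe_prompt, and icl_prompt are all assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical label set for illustration only; each dataset in the
# paper defines its own relation labels.
LABELS = ["before", "after", "cause", "effect"]


def zero_shot_prompt(sent_a: str, sent_b: str, labels: List[str]) -> str:
    """Plain zero-shot prompt: the sentence pair plus the candidate labels."""
    options = ", ".join(labels)
    return (
        f"Sentence 1: {sent_a}\n"
        f"Sentence 2: {sent_b}\n"
        f"Which relation holds between the two sentences? Options: {options}\n"
        f"Answer:"
    )


def zero_shot_pe_prompt(sent_a: str, sent_b: str, labels: List[str]) -> str:
    """Zero-shot prompt with extra engineering (assumed here to mean an
    explicit task instruction and lettered multiple-choice options)."""
    lettered = "\n".join(
        f"{chr(ord('A') + i)}. {label}" for i, label in enumerate(labels)
    )
    return (
        "Task: classify the relation between two sentences. "
        "Reply with exactly one option.\n"
        f"Sentence 1: {sent_a}\n"
        f"Sentence 2: {sent_b}\n"
        f"Options:\n{lettered}\n"
        f"Answer:"
    )


def icl_prompt(
    sent_a: str,
    sent_b: str,
    labels: List[str],
    demos: List[Tuple[str, str, str]],
) -> str:
    """In-context learning prompt: labeled demonstrations come first,
    followed by the unlabeled query in the same format."""
    blocks = [
        zero_shot_prompt(demo_a, demo_b, labels) + " " + demo_label
        for demo_a, demo_b, demo_label in demos
    ]
    blocks.append(zero_shot_prompt(sent_a, sent_b, labels))
    return "\n\n".join(blocks)


if __name__ == "__main__":
    demos = [("It rained all night.", "The road flooded.", "cause")]
    print(icl_prompt("He missed the bus.", "He was late.", LABELS, demos))
```

Keeping the query format identical across the three builders means that any score difference between them comes from the added instruction or the demonstrations, not from incidental wording changes.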
Taxonomy
Topics: Topic Modeling · Mental Health via Writing · Computational and Text Analysis Methods
Methods: Test · Attentive Walk-Aggregating Graph Neural Network
