Empirical Study of Symmetrical Reasoning in Conversational Chatbots

Daniela N. Rim; Heeyoul Choi

arXiv:2407.05734 · cs.CL · July 9, 2024

Open Access

TL;DR

This study evaluates how well large language models can understand predicate symmetry in conversation, revealing varied performance and highlighting both potential and limitations in their cognitive reasoning abilities.

Contribution

It provides the first empirical assessment of LLMs' ability to perform symmetrical reasoning using the SIS dataset and in-context learning.
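
In-context learning here means the task is specified entirely in the prompt. A minimal sketch of how such a prompt might be assembled is given below; the function name `build_icl_prompt`, the instruction wording, the 1-to-5 rating scale, and the example sentences are all illustrative assumptions, not the authors' actual prompt or items from the SIS dataset.

```python
from typing import List, Tuple


def build_icl_prompt(examples: List[Tuple[str, float]], query: str) -> str:
    """Assemble a few-shot prompt: instruction, rated examples, unrated query.

    The instruction text and the 1-5 scale are assumptions for illustration.
    """
    blocks = [
        "Rate how symmetrical the predicate in each sentence is, "
        "from 1 (not symmetrical) to 5 (fully symmetrical)."
    ]
    # Each demonstration pairs a sentence with a reference rating.
    for sentence, score in examples:
        blocks.append(f"Sentence: {sentence}\nScore: {score}")
    # The query is left unrated so the chatbot completes the score.
    blocks.append(f"Sentence: {query}\nScore:")
    return "\n\n".join(blocks)


# Hypothetical demonstrations, not taken from the SIS dataset.
demo_examples = [
    ("Alice met Bob at the station.", 4.5),
    ("Alice admired Bob.", 1.5),
]
demo_prompt = build_icl_prompt(demo_examples, "Alice argued with Bob.")
```

The resulting string would be sent to a chatbot as a single message; no model weights are updated, which is the point of the in-context setup.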

Findings

01. Gemini achieves a correlation of 0.85 with human judgments.

02. Chatbots show varied performance, with some nearing human-like reasoning.

03. The study highlights both the potential and the limitations of LLMs in cognitive tasks.

Abstract

This work explores the capability of conversational chatbots powered by large language models (LLMs) to understand and characterize predicate symmetry, a cognitive linguistic function traditionally believed to be an inherent human trait. Leveraging in-context learning (ICL), a paradigm shift that enables chatbots to learn new tasks from prompts without re-training, we assess the symmetrical reasoning of five chatbots: ChatGPT 4, Huggingface chat AI, Microsoft's Copilot AI, LLaMA through Perplexity, and Gemini Advanced. Using the Symmetry Inference Sentence (SIS) dataset by Tanchip et al. (2020), we compare chatbot responses against human evaluations to gauge their understanding of predicate symmetry. Experimental results reveal varied performance among chatbots, with some approaching human-like reasoning capabilities. Gemini, for example, reaches a correlation of 0.85 with human scores,…
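
The comparison against human evaluations can be illustrated with a correlation coefficient over paired scores. The sketch below assumes Pearson correlation, which the abstract does not specify, and uses made-up placeholder scores rather than data from the paper; `pearson`, `human_scores`, and `chatbot_scores` are hypothetical names.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of co-deviations and sums of squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)


# Placeholder ratings for four sentences (illustrative only, not real data).
human_scores = [4.5, 1.2, 3.8, 2.0]
chatbot_scores = [4.0, 1.5, 3.5, 2.5]
agreement = pearson(human_scores, chatbot_scores)
```

A value near 1 would indicate that the chatbot ranks and spaces sentences much as human raters do; a rank-based measure such as Spearman correlation would be a reasonable alternative if only the ordering matters.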

Taxonomy

Topics: AI in Service Interactions

Methods: LLaMA