An Investigation of LLMs' Inefficacy in Understanding Converse Relations

Chengwen Qi; Bowen Li; Binyuan Hui; Bailin Wang; Jinyang Li; Jinwang Wu; Yuanjun Laili

arXiv:2310.05163·cs.CL·November 14, 2023

TL;DR

This paper evaluates whether large language models truly understand structured semantics by testing them on a new benchmark for converse relations, revealing limitations and tendencies for shortcut learning.

Contribution

Introduces ConvRe, a novel benchmark for assessing LLMs' understanding of converse relations in formal language, with comprehensive evaluation protocols and analysis.

Findings

01 LLMs show limited understanding of converse relations.
02 Models tend to rely on shortcut learning rather than genuine comprehension.
03 Scaling improves performance but does not fully solve the understanding challenge.

Abstract

Large Language Models (LLMs) have achieved remarkable success in many formal-language-oriented tasks, such as structural data-to-text and semantic parsing. However, current benchmarks mostly follow the data distribution of the LLMs' pre-training data. A natural question therefore arises: do LLMs really understand the structured semantics of formal languages? In this paper, we investigate this problem on a special case, the converse binary relation. We introduce a new benchmark, ConvRe, focusing on converse relations, which contains 17 relations and 1240 triples extracted from popular knowledge graph completion datasets. ConvRe features two tasks, Re2Text and Text2Re, which are formulated as multi-choice question answering to evaluate LLMs' ability to determine the matching between relations and associated text. For the evaluation protocol, apart from different prompting methods, we…
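
As a rough illustration of what a converse relation is and how a Re2Text-style multiple-choice item might be assembled, here is a minimal Python sketch. The `Triple` class, the `converse` and `re2text_item` helpers, the `hasPart`/`partOf` relation pair, and the prompt wording are all hypothetical; none of them is taken from the ConvRe benchmark or its repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """A binary relation instance (head, relation, tail). Hypothetical type."""
    head: str
    relation: str
    tail: str


def converse(t: Triple, converse_name: str) -> Triple:
    """Swap head and tail: (x, R, y) holds exactly when (y, R_conv, x) holds."""
    return Triple(head=t.tail, relation=converse_name, tail=t.head)


def re2text_item(t: Triple, matching_text: str, swapped_text: str) -> dict:
    """Build a two-option question asking which sentence matches the triple.

    The prompt wording and option layout are assumptions for illustration.
    """
    return {
        "prompt": f"Which sentence matches ({t.head}, {t.relation}, {t.tail})?",
        "options": {"A": matching_text, "B": swapped_text},
        "answer": "A",
    }


# Illustrative relation pair (an assumption, not benchmark data).
t = Triple("x", "hasPart", "y")
c = converse(t, "partOf")
item = re2text_item(t, "x has a part called y.", "y has a part called x.")
```

Applying `converse` twice with the paired relation names returns the original triple, which is the property a model has to respect when a relation is swapped for its converse.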

Code & Models

Repositories

3b-group/convre (Official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification