Knowledge Crosswords: Geometric Knowledge Reasoning with Large Language Models

Wenxuan Ding; Shangbin Feng; Yuhan Liu; Zhaoxuan Tan; Vidhisha Balachandran; Tianxing He; Yulia Tsvetkov

arXiv:2310.01290·cs.CL·June 26, 2024·1 cites

TL;DR

Knowledge Crosswords introduces a geometric knowledge reasoning benchmark that challenges large language models to infer missing facts within structured knowledge networks, highlighting their limitations and proposing new methods to improve reasoning robustness.

Contribution

The paper presents a novel geometric knowledge reasoning benchmark, analyzes LLM limitations, and introduces two new approaches, Staged Prompting and Verify-All, to enhance reasoning capabilities.

Findings

01 Baseline LLMs struggle with larger networks and distractors.

02 Verify-All approach outperforms prior methods in robustness.

03 Geometric reasoning poses new challenges for LLMs' knowledge abilities.

Abstract

We propose Knowledge Crosswords, a geometric knowledge reasoning benchmark consisting of incomplete knowledge networks bounded by structured factual constraints, where LLMs are tasked with inferring the missing facts to meet all constraints. The novel setting of geometric knowledge reasoning necessitates new LM abilities beyond existing atomic/linear multi-hop QA, such as backtracking, verifying facts and constraints, reasoning with uncertainty, and more. Knowledge Crosswords contains 2,101 individual problems, covering diverse knowledge domains, and is further divided into three difficulty levels. We conduct extensive experiments to evaluate existing LLMs and approaches on Knowledge Crosswords. Results demonstrate that baseline approaches struggle with larger knowledge networks and semantically-equivalent entity distractors. In light of their limitations, we propose two new approaches,…
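To make the task format concrete, the following is a minimal sketch of a problem in the spirit of the abstract: a small network of factual constraints with blanks, candidate options that include distractors, and a check that a filling satisfies every constraint at once. It is an illustration only, not the benchmark's data format or either of the paper's proposed approaches. The entities, relation names, the `?x`/`?y` blank notation, and the functions `fill` and `solve` are all hypothetical, and the lookup against an explicit fact set stands in for the knowledge an LLM would have to supply itself.

```python
from itertools import product

# Toy fact store of (head, relation, tail) triples. All entities are made up.
FACTS = {
    ("Ada", "born_in", "Northport"),
    ("Ada", "studied_at", "Hill College"),
    ("Ben", "born_in", "Northport"),
    ("Ben", "studied_at", "Lake College"),
    ("Hill College", "located_in", "Northport"),
    ("Lake College", "located_in", "Eastvale"),
}

# Constraints share blanks ("?x", "?y"), so the blanks cannot be filled
# independently: every constraint must hold under the same assignment.
CONSTRAINTS = [
    ("?x", "born_in", "Northport"),
    ("?x", "studied_at", "?y"),
    ("?y", "located_in", "Northport"),
]

# Candidate options per blank, including plausible distractors.
CANDIDATES = {
    "?x": ["Ada", "Ben"],
    "?y": ["Hill College", "Lake College"],
}


def fill(triple, assignment):
    """Substitute assigned values for any blanks in a constraint triple."""
    return tuple(assignment.get(term, term) for term in triple)


def solve(constraints, candidates, facts):
    """Return every assignment of blanks that satisfies all constraints."""
    blanks = sorted(candidates)
    solutions = []
    for choice in product(*(candidates[b] for b in blanks)):
        assignment = dict(zip(blanks, choice))
        if all(fill(c, assignment) in facts for c in constraints):
            solutions.append(assignment)
    return solutions


solutions = solve(CONSTRAINTS, CANDIDATES, FACTS)
print(solutions)
```

In this toy instance "Ben" satisfies the first constraint on its own but fails once the other two are considered, which is the kind of joint checking that separates this setting from answering each fact in isolation.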

Code & Models

Repositories

wenwen-d/knowledgecrosswords (Official)

Videos

Knowledge Crosswords: Geometric Knowledge Reasoning with Large Language Models (Underline)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification