LogicVista: Multimodal LLM Logical Reasoning Benchmark in Visual Contexts

Yijia Xiao; Edward Sun; Tianyu Liu; Wei Wang

arXiv:2407.04973·cs.AI·July 9, 2024·2 cites

TL;DR

LogicVista is a comprehensive benchmark designed to evaluate the logical reasoning abilities of multimodal large language models in visual contexts, addressing a gap in systematic assessment of their reasoning skills.

Contribution

It introduces a new benchmark with diverse logical tasks, annotated questions, and a thorough evaluation of 8 MLLMs' reasoning capabilities in visual scenarios.

Findings

01. MLLMs show varied performance across tasks

02. Benchmark reveals strengths and weaknesses in logical reasoning

03. Provides a standardized dataset for future research

Abstract

We propose LogicVista, an evaluation benchmark that assesses the integrated logical reasoning capabilities of multimodal large language models (MLLMs) in visual contexts. Recent advancements in MLLMs have demonstrated various fascinating abilities, from crafting poetry based on an image to performing mathematical reasoning. However, there is still a lack of systematic evaluation of MLLMs' proficiency in logical reasoning tasks, which are essential for activities like navigation and puzzle-solving. Thus, we evaluate general logical cognition abilities across 5 logical reasoning tasks encompassing 9 different capabilities, using a sample of 448 multiple-choice questions. Each question is annotated with the correct answer and the human-written reasoning behind the selection, enabling both open-ended and multiple-choice evaluation. A total of 8 MLLMs are comprehensively evaluated using…
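As a rough illustration of the multiple-choice side of such an evaluation, the sketch below scores predicted choices against annotated answers and reports accuracy per task. It is not the authors' evaluation code: the `Question` record, its field names, the `accuracy_by_task` helper, and the `task_a`/`task_b` labels are all hypothetical, and the three sample items are made up for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Question:
    # All field names are assumptions for this sketch, not the dataset's schema.
    task: str        # hypothetical task label
    answer: str      # annotated correct choice, e.g. "B"
    prediction: str  # choice extracted from the model's response


def accuracy_by_task(questions):
    """Return {task: fraction correct}, comparing choice letters case-insensitively."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for q in questions:
        total[q.task] += 1
        if q.prediction.strip().upper() == q.answer.strip().upper():
            correct[q.task] += 1
    return {task: correct[task] / total[task] for task in total}


# Made-up items, only to show the call shape.
sample = [
    Question("task_a", "B", "b"),
    Question("task_a", "C", "A"),
    Question("task_b", "D", "D"),
]
scores = accuracy_by_task(sample)
```

Grouping by task before averaging keeps a large task from masking weak results on a small one. Open-ended evaluation against the human-written reasoning would need a different comparison step, which this sketch does not attempt.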

Code & Models

Repositories

yijia-xiao/logicvista (Official)

Taxonomy

Topics: Semantic Web and Ontologies · Natural Language Processing Techniques