Seeing Isn't Orienting: A Cognitively Grounded Benchmark Reveals Systematic Orientation Failures in MLLMs

Nazia Tasnim; Keanu Nichols; Yuting Yan; Nicholas Ikechukwu; Elva Zou; Deepti Ghadiyaram; Bryan A. Plummer

arXiv:2505.21649 · cs.CV · April 28, 2026

TL;DR

The paper introduces DORI, a new benchmark to evaluate object orientation understanding in vision-language models, revealing significant limitations in current systems' ability to perceive and reason about object orientations.

Contribution

DORI is the first comprehensive diagnostic benchmark specifically designed to assess orientation perception in multimodal systems, highlighting their systematic failures.

Findings

01. Current models achieve only 54.2% accuracy on coarse orientation tasks.

02. Models perform poorly on tasks requiring reference frame shifts or compound rotations.

03. Models show a systematic inability to estimate angles and track orientation changes.
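
To make the accuracy figures above concrete, here is a minimal sketch of how per-group accuracy on a multiple-choice benchmark could be tallied. It is not the authors' evaluation code: the record fields (`granularity`, `answer`, `prediction`), the group labels, and the toy records are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def accuracy_by_group(records, group_key):
    """Return {group: fraction of correct predictions} for multiple-choice records."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        group = rec[group_key]
        total[group] += 1
        if rec["prediction"] == rec["answer"]:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


# Hypothetical toy records; field names and labels are assumptions,
# not the schema of the DORI dataset.
records = [
    {"granularity": "coarse", "answer": "A", "prediction": "A"},
    {"granularity": "coarse", "answer": "B", "prediction": "C"},
    {"granularity": "fine", "answer": "D", "prediction": "A"},
    {"granularity": "fine", "answer": "B", "prediction": "B"},
    {"granularity": "fine", "answer": "C", "prediction": "A"},
]

scores = accuracy_by_group(records, "granularity")
for group, acc in sorted(scores.items()):
    print(f"{group}: {acc:.1%}")
```

Passing a different `group_key` (for example, a hypothetical field naming the orientation dimension) would give a breakdown along that axis, which is the kind of view needed to see where orientation reasoning degrades.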

Abstract

Object orientation understanding represents a fundamental challenge in visual perception critical for applications like robotic manipulation and augmented reality. Current vision-language benchmarks fail to isolate this capability, often conflating it with positional relationships and general scene understanding. We introduce DORI (Discriminative Orientation Reasoning Intelligence), a comprehensive benchmark establishing object orientation perception as a primary evaluation target. DORI assesses four dimensions of orientation comprehension: frontal alignment, rotational transformations, relative directional relationships, and canonical orientation understanding. Through carefully curated tasks from 11 datasets spanning 67 object categories across synthetic and real-world scenarios, DORI provides insights on how multi-modal systems understand object orientations. Our evaluation of 15…

Code & Models

Repositories

https://huggingface.co/datasets/appledora/DORI-Benchmark

Datasets

appledora/DORI-Benchmark
dataset · 288 downloads