VIRO: Robust and Efficient Neuro-Symbolic Reasoning with Verification for Referring Expression Comprehension
Hyejin Park, Junhyuk Kwon, Suha Kwak, Jungseul Ok

TL;DR
VIRO introduces verification-integrated reasoning operators in neuro-symbolic models for referring expression comprehension, significantly improving robustness and accuracy, especially in no-target scenarios, while maintaining efficiency and scalability.
Contribution
The paper proposes VIRO, a novel neuro-symbolic framework with embedded verifiers that enhance robustness and reduce cascading errors in referring expression comprehension.
Findings
Achieves 61.1% balanced accuracy in target and no-target detection.
Demonstrates a low program failure rate of 0.3%.
Generalizes effectively to real-world egocentric data.
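For readers unfamiliar with the metric in the first finding, the following is a minimal sketch of balanced accuracy under its standard definition: the mean of the accuracy on target-present cases and the accuracy on no-target cases. The function name and the counts in the usage note are illustrative assumptions, not figures from the paper, and the summary does not state how the paper decides that a target-present prediction is correct.

```python
def balanced_accuracy(tp: int, fn: int, tn: int, fp: int) -> float:
    """Mean of the true-positive rate and the true-negative rate.

    tp / fn: target-present cases handled correctly / incorrectly.
    tn / fp: no-target cases correctly rejected / wrongly given a box.
    """
    tpr = tp / (tp + fn)  # accuracy on target-present cases
    tnr = tn / (tn + fp)  # accuracy on no-target cases
    return 0.5 * (tpr + tnr)
```

With made-up counts, `balanced_accuracy(80, 20, 30, 70)` averages a rate of 0.8 on target-present cases with 0.3 on no-target cases. A model that always outputs a box scores 0 on the no-target half however well it localizes, which is why the metric is sensitive to high-confidence false positives.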
Abstract
Referring Expression Comprehension (REC) aims to localize the image region corresponding to a natural language query. Recent neuro-symbolic REC approaches leverage large language models (LLMs) and vision-language models (VLMs) to perform compositional reasoning, decomposing queries into structured programs and executing them step-by-step. While such approaches achieve interpretable reasoning and strong zero-shot generalization, they assume that intermediate reasoning steps are accurate. However, this assumption causes cascading errors: false detections and invalid relations propagate through the reasoning chain, yielding high-confidence false positives even when no target is present in the image. To address this limitation, we introduce Verification-Integrated Reasoning Operators (VIRO), a neuro-symbolic framework that embeds lightweight operator-level verifiers within reasoning steps.…
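The idea of checking each reasoning step, as described in the abstract, can be illustrated with a toy sketch. Everything here is a hypothetical stand-in: the `Box` type, the `find` and `left_of` operators, the confidence threshold, and the hard-coded program are assumptions made for illustration and are not VIRO's actual operators or verifiers. The point is only that a step whose output fails its check stops the program and returns "no target" instead of passing a doubtful result downstream.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Box:
    label: str
    x: float      # horizontal center of the box, in [0, 1]
    score: float  # detector confidence, in [0, 1]


def find(detections: List[Box], label: str, min_score: float = 0.5) -> List[Box]:
    """Toy operator: select boxes with the given label.

    The confidence filter stands in for an operator-level verifier
    (the threshold value is an arbitrary assumption).
    """
    return [b for b in detections if b.label == label and b.score >= min_score]


def left_of(candidates: List[Box], anchors: List[Box]) -> List[Box]:
    """Toy operator: keep candidates that lie left of at least one anchor."""
    return [c for c in candidates if any(c.x < a.x for a in anchors)]


def run_program(detections: List[Box]) -> Optional[Box]:
    """Hypothetical program for the query 'the cup left of the laptop'.

    Returns None (no target) as soon as any step yields nothing verified.
    """
    cups = find(detections, "cup")
    if not cups:
        return None
    laptops = find(detections, "laptop")
    if not laptops:
        return None
    matches = left_of(cups, laptops)
    if not matches:
        return None
    return max(matches, key=lambda b: b.score)
```

For example, given `[Box("cup", 0.2, 0.9), Box("laptop", 0.6, 0.8)]` the program returns the cup, while lowering the laptop's score to 0.2 makes `find` reject it and the program returns `None` rather than a confident but unsupported box.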
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Ferroelectric and Negative Capacitance Devices
