Explicit Logic Channel for Validation and Enhancement of MLLMs on Zero-Shot Tasks

Mei Chee Leong; Ying Gu; Hui Li Tan; Liyuan Li; Nancy Chen

arXiv:2603.11689·cs.AI·May 19, 2026

TL;DR

This paper introduces an Explicit Logic Channel to validate, select, and enhance Multimodal Large Language Models in zero-shot visual-language tasks, improving trustworthiness and explainability.

Contribution

The paper proposes a novel Explicit Logic Channel that mimics human reasoning, enabling validation and enhancement of MLLMs without ground-truth labels.

Findings

01 Explicit Logic Channel improves model validation and selection.

02 Cross-channel integration enhances zero-shot task performance.

03 Proposed methods increase trustworthiness and explainability of MLLMs.

Abstract

Frontier Multimodal Large Language Models (MLLMs) exhibit remarkable capabilities in Visual-Language Comprehension (VLC) tasks. However, they are often deployed as zero-shot solutions to new tasks in a black-box manner. Validating and understanding the behavior of these models therefore becomes important when applying them to a new task. We propose an Explicit Logic Channel, in parallel with the black-box model channel, to perform explicit logical reasoning for model validation, selection, and enhancement. The frontier MLLM, encapsulating latent vision-language knowledge, can be considered an Implicit Logic Channel. The proposed Explicit Logic Channel, mimicking human logical reasoning, incorporates an LLM, a VFM, and logical reasoning with probabilistic inference for factual, counterfactual, and relational reasoning over the explicit visual evidence. A Consistency Rate (CR) is proposed for…
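The abstract is cut off before it defines the Consistency Rate, so the paper's actual formula is not available here. As a rough illustration of how two parallel channels could be compared without ground-truth labels, the sketch below computes a plain agreement rate between the answers of an implicit channel (the MLLM) and an explicit channel. The function name `agreement_rate`, the list-of-labels input format, and the sample answers are all assumptions made for illustration; this is not the paper's definition of CR.

```python
from typing import Hashable, Sequence


def agreement_rate(
    implicit_preds: Sequence[Hashable],
    explicit_preds: Sequence[Hashable],
) -> float:
    """Fraction of samples on which the two channels give the same answer.

    Hypothetical helper: a simple label-free agreement measure, NOT the
    Consistency Rate as defined in the paper (that definition is truncated
    in the source text).
    """
    if len(implicit_preds) != len(explicit_preds):
        raise ValueError("both channels must answer the same set of samples")
    if len(implicit_preds) == 0:
        raise ValueError("at least one sample is required")
    # Count positions where the two channels agree; no ground truth is used.
    matches = sum(a == b for a, b in zip(implicit_preds, explicit_preds))
    return matches / len(implicit_preds)


# Made-up answers for four multiple-choice questions.
implicit = ["A", "B", "C", "A"]  # answers from the black-box MLLM
explicit = ["A", "B", "D", "A"]  # answers from the explicit reasoning channel
rate = agreement_rate(implicit, explicit)
print(f"agreement rate: {rate:.2f}")
```

A measure of this kind needs only the two channels' outputs, which is what would make it usable on a new task where no labels exist; how the paper turns its own CR into model selection or enhancement is not recoverable from the truncated abstract.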

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications