Explicit Logic Channel for Validation and Enhancement of MLLMs on Zero-Shot Tasks
Mei Chee Leong, Ying Gu, Hui Li Tan, Liyuan Li, Nancy Chen

TL;DR
This paper introduces an Explicit Logic Channel to validate, select, and enhance Multimodal Large Language Models in zero-shot visual-language tasks, improving trustworthiness and explainability.
Contribution
The paper proposes a novel Explicit Logic Channel that mimics human reasoning, enabling validation and enhancement of MLLMs without ground-truth labels.
Findings
Explicit Logic Channel improves model validation and selection.
Cross-channel integration enhances zero-shot task performance.
Proposed methods increase trustworthiness and explainability of MLLMs.
Abstract
Frontier Multimodal Large Language Models (MLLMs) exhibit remarkable capabilities in Visual-Language Comprehension (VLC) tasks. However, they are often deployed as zero-shot solutions to new tasks in a black-box manner. Validating and understanding the behavior of these models therefore becomes important when applying them to a new task. We propose an Explicit Logic Channel, in parallel with the black-box model channel, to perform explicit logical reasoning for model validation, selection, and enhancement. The frontier MLLM, encapsulating latent vision-language knowledge, can be considered an Implicit Logic Channel. The proposed Explicit Logic Channel, mimicking human logical reasoning, incorporates an LLM, a VFM, and logical reasoning with probabilistic inference for factual, counterfactual, and relational reasoning over the explicit visual evidence. A Consistency Rate (CR) is proposed for…
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications
