Knowledge Condensation and Reasoning for Knowledge-based VQA
Dongze Hao, Jian Jia, Longteng Guo, Qunbo Wang, Te Yang, Yan Li, Yanhua Cheng, Bo Wang, Quan Chen, Han Li, and Jing Liu

TL;DR
This paper introduces a novel approach for knowledge-based visual question answering that condenses external knowledge through multimodal perception and large language models, leading to state-of-the-art results without relying on GPT-3 generated knowledge.
Contribution
The paper proposes two synergistic models for knowledge condensation and reasoning, improving relevance and accuracy in KB-VQA tasks over previous methods.
Findings
Achieves 65.1% on OK-VQA and 60.1% on A-OKVQA datasets
Outperforms previous methods without GPT-3 knowledge
Demonstrates effective knowledge condensation and reasoning
Abstract
Knowledge-based visual question answering (KB-VQA) is a challenging task that requires the model to leverage external knowledge to comprehend and answer questions grounded in visual content. Recent studies retrieve knowledge passages from external knowledge bases and then use them to answer questions. However, these retrieved passages often contain irrelevant or noisy information, which limits model performance. To address this challenge, we propose two synergistic models: a Knowledge Condensation model and a Knowledge Reasoning model. We condense the retrieved knowledge passages from two perspectives. First, we leverage the multimodal perception and reasoning ability of visual-language models to distill concise knowledge concepts from lengthy retrieved passages, ensuring relevance to both the visual content and the question. Second, we leverage the text…
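The condense-then-reason flow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the paper uses a visual-language model and a large language model for condensation, while the sketch below swaps in a simple word-overlap heuristic so that it runs without any model. The names `condense`, `answer`, `visual_tags`, and `reasoner` are hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase a string and split it into a set of punctuation-free words."""
    words = (w.strip(".,?!").lower() for w in text.split())
    return {w for w in words if w}


def condense(passages: List[str], question: str,
             visual_tags: List[str], keep: int = 2) -> List[str]:
    """Toy stand-in for a knowledge condensation model (an assumption).

    Splits retrieved passages into sentences and keeps the `keep` sentences
    sharing the most words with the question and the visual tags.
    """
    query = tokenize(question) | {t.lower() for t in visual_tags}
    sentences = [s.strip() for p in passages for s in p.split(".") if s.strip()]
    # sorted() is stable, so ties keep their original retrieval order.
    ranked = sorted(sentences, key=lambda s: len(tokenize(s) & query),
                    reverse=True)
    return ranked[:keep]


def answer(question: str, visual_tags: List[str], passages: List[str],
           reasoner: Callable[[str, List[str], List[str]], str],
           keep: int = 2) -> str:
    """Condense the retrieved passages, then pass the result to a reasoner.

    `reasoner` is a hypothetical callable standing in for a knowledge
    reasoning model; it receives the question, visual tags, and condensed
    knowledge instead of the raw, noisy passages.
    """
    knowledge = condense(passages, question, visual_tags, keep)
    return reasoner(question, visual_tags, knowledge)
```

The design point the sketch tries to capture is that the reasoner only ever sees the condensed knowledge, so irrelevant sentences in the retrieved passages are filtered out before answering.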
Taxonomy
Topics: Quality Function Deployment in Product Design · Manufacturing Process and Optimization · Quality and Management Systems
Methods: Attention Is All You Need · Softmax · Cosine Annealing · Layer Normalization · Dropout · Linear Layer · Multi-Head Attention
