Declarative Knowledge Distillation from Large Language Models for Visual Question Answering Datasets

Thomas Eiter; Jan Hadl; Nelson Higuera; Johannes Oetsch

arXiv:2410.09428·cs.AI·October 15, 2024

TL;DR

This paper introduces a method for extracting and refining reasoning rules from large language models to improve interpretability and performance in visual question answering tasks, demonstrated on CLEVR and GQA datasets.

Contribution

The paper presents a novel approach for declarative knowledge distillation from LLMs to generate reasoning rules for VQA, reducing manual rule crafting.

Findings

1. Effective rule extension and validation using LLMs and ASP solvers.
2. Improved VQA performance on CLEVR and GQA datasets.
3. Knowledge distillation from LLMs is a promising alternative to data-driven rule learning.

Abstract

Visual Question Answering (VQA) is the task of answering a question about an image and requires processing multimodal input and reasoning to obtain the answer. Modular solutions that use declarative representations within the reasoning component have a clear advantage over end-to-end trained systems regarding interpretability. The downside is that crafting the rules for such a component can be an additional burden on the developer. We address this challenge by presenting an approach for declarative knowledge distillation from Large Language Models (LLMs). Our method is to prompt an LLM to extend an initial theory on VQA reasoning, given as an answer-set program, to meet the requirements of the VQA task. Examples from the VQA dataset are used to guide the LLM, validate the results, and mend rules if they are not correct by using feedback from the ASP solver. We demonstrate that our…
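The extend, validate, and mend cycle described in the abstract can be pictured with a small sketch. This is not the authors' implementation: the names `Example`, `distill`, `propose_rules`, and `solve` are hypothetical, and the LLM and the ASP solver are replaced by toy stand-ins so that the loop runs on its own. The only assumption taken from the abstract is the overall flow: a candidate rule is requested from the LLM, checked against a dataset example, and re-requested with solver feedback if it fails.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Example:
    operation: str  # reasoning step the question needs (hypothetical encoding)
    answer: str     # expected answer from the VQA dataset


# (answer, error message); both hypothetical stand-ins for solver output
SolveResult = Tuple[Optional[str], Optional[str]]


def distill(theory: str,
            examples: List[Example],
            propose_rules: Callable[[str, Example, str], str],
            solve: Callable[[str, Example], SolveResult],
            max_repairs: int = 3) -> str:
    """Extend `theory` until each example is answered correctly or repairs run out."""
    for ex in examples:
        answer, error = solve(theory, ex)
        if error is None and answer == ex.answer:
            continue  # the current theory already handles this example
        feedback = error or f"expected {ex.answer}, got {answer}"
        for _ in range(max_repairs):
            candidate = propose_rules(theory, ex, feedback)
            extended = theory + "\n" + candidate
            answer, error = solve(extended, ex)
            if error is None and answer == ex.answer:
                theory = extended  # keep only validated rules
                break
            # otherwise discard the candidate and pass the feedback back
            feedback = error or f"expected {ex.answer}, got {answer}"
    return theory


def toy_solve(theory: str, ex: Example) -> SolveResult:
    """Toy stand-in for an ASP solver: checks line endings, then rule coverage."""
    for line in theory.splitlines():
        if line.strip() and not line.strip().endswith("."):
            return None, f"syntax error: {line.strip()}"
    if f"{ex.operation}(" in theory:
        return ex.answer, None
    return None, None


feedback_log: List[str] = []
_replies = iter([
    "count(N) :- N = #count{X : obj(X)}",   # first try: missing final period
    "count(N) :- N = #count{X : obj(X)}.",  # mended rule
])


def toy_llm(theory: str, ex: Example, feedback: str) -> str:
    """Toy stand-in for an LLM call: records feedback, returns canned replies."""
    feedback_log.append(feedback)
    return next(_replies)


final_theory = distill("obj(1). obj(2).", [Example("count", "2")], toy_llm, toy_solve)
print(final_theory)
```

In this toy run the first candidate is rejected for a syntax problem, the error text is handed back as feedback, and the second candidate is accepted. A real setup would replace `toy_llm` with an LLM prompt and `toy_solve` with a call to an actual ASP solver.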

Code & Models

Repositories

pudumagico/kr2024 (PyTorch, official)

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Graph Neural Networks · Topic Modeling

Methods: Model Editor Networks with Gradient Decomposition · Knowledge Distillation