Enhancing Food-Domain Question Answering with a Multimodal Knowledge Graph: Hybrid QA Generation and Diversity Analysis

Srihari K B; Pushpak Bhattacharyya

arXiv:2507.06571·cs.CL·July 10, 2025

Open Access

TL;DR

This paper introduces a multimodal knowledge graph-based framework for food question answering, combining structured data and generative AI to improve accuracy, diversity, and fidelity in food-related queries and answers.

Contribution

It presents a novel unified food-domain QA system integrating a large-scale multimodal knowledge graph with generative models, enhancing answer quality and diversity.

Findings

01 Improved BERTScore by 16.2%

02 Reduced FID by 37.8%

03 Achieved 94.1% image reuse accuracy

Abstract

We propose a unified food-domain QA framework that combines a large-scale multimodal knowledge graph (MMKG) with generative AI. Our MMKG links 13,000 recipes, 3,000 ingredients, 140,000 relations, and 14,000 images. We generate 40,000 QA pairs using 40 templates and LLaVA/DeepSeek augmentation. Joint fine-tuning of Meta LLaMA 3.1-8B and Stable Diffusion 3.5-Large improves BERTScore by 16.2%, reduces FID by 37.8%, and boosts CLIP alignment by 31.1%. Diagnostic analyses, namely CLIP-based mismatch detection (35.2% to 7.3%) and LLaVA-driven hallucination checks, ensure factual and visual fidelity. A hybrid retrieval-generation strategy achieves 94.1% accurate image reuse and 85% adequacy in synthesis. Our results demonstrate that structured knowledge and multimodal generation together enhance reliability and diversity in food QA.
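
The abstract names a hybrid retrieval-generation strategy and CLIP-based mismatch detection without giving details. The sketch below is one plausible reading, not the authors' implementation: it assumes that text and image embeddings (for example, from a CLIP encoder) are already available as plain vectors, that an existing graph image is reused when its cosine similarity to the query clears a threshold, and that synthesis is the fallback. The function names `choose_image` and `mismatch_rate`, the dictionary layout, and the threshold value 0.3 are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def choose_image(query_vec, kg_images, threshold=0.3):
    """Decide between reusing a stored image and generating a new one.

    query_vec: embedding of the question or answer text (assumed precomputed).
    kg_images: dict mapping image id -> image embedding (assumed precomputed).
    threshold: hypothetical similarity cutoff; the paper's value is not given here.
    Returns ("reuse", image_id) or ("generate", None).
    """
    best_id, best_sim = None, -1.0
    for image_id, vec in kg_images.items():
        sim = cosine(query_vec, vec)
        if sim > best_sim:
            best_id, best_sim = image_id, sim
    if best_id is not None and best_sim >= threshold:
        return ("reuse", best_id)
    return ("generate", None)


def mismatch_rate(pairs, threshold=0.3):
    """Fraction of (text_vec, image_vec) pairs whose similarity falls below the cutoff."""
    if not pairs:
        return 0.0
    flagged = sum(1 for text_vec, image_vec in pairs if cosine(text_vec, image_vec) < threshold)
    return flagged / len(pairs)
```

Under these assumptions, a rate like the one `mismatch_rate` returns is the kind of quantity the reported 35.2% to 7.3% figures could describe, and `choose_image` separates the reuse path from the synthesis path whose accuracy and adequacy the abstract reports separately.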

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Nutritional Studies and Diet