PlantVillageVQA: A Visual Question Answering Dataset for Benchmarking Vision-Language Models in Plant Science
Syed Nazmus Sakib, Nafiul Haque, Mohammad Zabed Hossain, and Shifat E. Arman

TL;DR
PlantVillageVQA is a large-scale visual question answering dataset for plant science, built to support the development and evaluation of vision-language models for agricultural diagnostics and research.
Contribution
The paper introduces a large, expert-verified VQA dataset specifically for plant disease diagnosis, with a structured question-answer format grounded in diverse plant images.
Findings
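The dataset contains 193,609 QA pairs over 55,448 images (about 3.5 per image on average), spanning 14 crop species and 38 disease conditions.
The final dataset was evaluated with three state-of-the-art models as a quality check.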
Aims to enhance diagnostic accuracy in plant disease identification.
Abstract
PlantVillageVQA is a large-scale visual question answering (VQA) dataset derived from the widely used PlantVillage image corpus. It was designed to advance the development and evaluation of vision-language models for agricultural decision-making and analysis. The PlantVillageVQA dataset comprises 193,609 high-quality question-answer (QA) pairs grounded over 55,448 images spanning 14 crop species and 38 disease conditions. Questions are organised into 3 levels of cognitive complexity and 9 distinct categories. Each question category was phrased manually following expert guidance and generated via an automated two-stage pipeline: (1) template-based QA synthesis from image metadata and (2) multi-stage linguistic re-engineering. The dataset was iteratively reviewed by domain experts for scientific accuracy and relevancy. The final dataset was evaluated using three state-of-the-art models…
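The first pipeline stage described above, template-based QA synthesis from image metadata, can be illustrated with a minimal sketch. Everything named here is an assumption for illustration only: the `Crop___Condition` label format, the three template categories, the `QAPair` type, and the `synthesize` helper are hypothetical and are not taken from the paper, whose actual templates and nine categories are not listed on this page. The second stage (linguistic re-engineering) is not covered.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical templates: category -> (question template, answer template).
# The paper's real templates and categories are not reproduced here.
TEMPLATES = {
    "crop_identification": ("Which crop is shown in this image?", "{crop}"),
    "health_status": ("Is the plant in this image healthy?", "{healthy}"),
    "disease_identification": (
        "Which disease is visible on this {crop} leaf?",
        "{disease}",
    ),
}


@dataclass(frozen=True)
class QAPair:
    image_id: str
    category: str
    question: str
    answer: str


def parse_label(label: str) -> tuple[str, str]:
    """Split an assumed 'Crop___Condition' label into readable parts."""
    crop, _, condition = label.partition("___")
    return crop.replace("_", " ").strip(), condition.replace("_", " ").strip()


def synthesize(image_id: str, label: str) -> list[QAPair]:
    """Fill every applicable template from the metadata of one image."""
    crop, condition = parse_label(label)
    is_healthy = condition.lower() == "healthy"
    fields = {
        "crop": crop,
        "disease": condition,
        "healthy": "Yes" if is_healthy else "No",
    }
    pairs = []
    for category, (question_tmpl, answer_tmpl) in TEMPLATES.items():
        # A healthy leaf has no disease to name, so skip that template.
        if category == "disease_identification" and is_healthy:
            continue
        pairs.append(
            QAPair(
                image_id=image_id,
                category=category,
                question=question_tmpl.format(**fields),
                answer=answer_tmpl.format(**fields),
            )
        )
    return pairs


if __name__ == "__main__":
    for pair in synthesize("img_0001", "Tomato___Early_blight"):
        print(pair.category, "|", pair.question, "|", pair.answer)
```

Because the answers come directly from the image label, each generated pair is correct by construction with respect to the metadata; varying the wording would then be the job of a separate rephrasing step.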
Taxonomy
Topics: Multimodal Machine Learning Applications · Smart Agriculture and AI · Domain Adaptation and Few-Shot Learning
