AgroVG: A Large-Scale Multi-Source Benchmark for Agricultural Visual Grounding

Haocheng Li; Juepeng Zheng; Zenghao Yang; Kaiqi Du; Guilong Xiao; Gengmeng Pu; Haohuan Fu; Jianxi Huang

arXiv:2605.22034 · cs.CV · May 22, 2026

TL;DR

AgroVG is a comprehensive benchmark dataset designed to evaluate agricultural visual grounding models across multiple sources, target types, and grounding tasks, highlighting current performance gaps.

Contribution

This paper introduces AgroVG, a large-scale, multi-source benchmark for agricultural visual grounding, supporting diverse tasks and providing a standardized evaluation protocol.

Findings

01. Zero-shot models perform poorly on multi-target set prediction.

02. Best models achieve only 0.35 Set-F1 score for multi-target localization.

03. Mask success rate at [email protected] remains below 0.17.

Abstract

Visual grounding, the task of localizing objects described by natural-language expressions, is a foundational capability for agricultural AI systems, enabling applications such as selective weeding, disease monitoring, and targeted harvesting. Reliable evaluation of agricultural visual grounding remains challenging because agricultural targets are often small, repetitive, occluded, or irregularly shaped, and instructions may refer to one, many, or no objects in an image. Evaluating this capability therefore requires jointly testing localization accuracy, target-set completeness, and existence-aware abstention. To address these challenges, we introduce AgroVG, a multi-source benchmark that formulates agricultural grounding as generalized set prediction: given an image and a referring expression, a model must return all matching target instances or abstain when no target is…
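The set-prediction framing in the abstract (return every matching instance, or abstain when nothing matches) can be illustrated with a small scoring sketch. This is not the paper's evaluation code: the function names `iou` and `set_f1`, the greedy one-to-one matching, the 0.5 IoU threshold, and the convention that a correct abstention scores 1.0 are all assumptions made here for illustration, and the benchmark's actual Set-F1 definition may differ.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def set_f1(pred: List[Box], gt: List[Box], iou_thr: float = 0.5) -> float:
    """Illustrative set-level F1 for one image/expression pair.

    Assumed conventions (not taken from the paper):
    - empty prediction on an empty target set is a correct abstention (1.0);
    - predicting boxes when none exist, or abstaining when targets exist, is 0.0;
    - predictions are matched to targets one-to-one, greedily by descending IoU.
    """
    if not pred and not gt:
        return 1.0
    if not pred or not gt:
        return 0.0

    # All candidate pairs, best overlap first.
    pairs = sorted(
        ((iou(p, g), i, j) for i, p in enumerate(pred) for j, g in enumerate(gt)),
        reverse=True,
    )
    used_pred, used_gt = set(), set()
    true_pos = 0
    for score, i, j in pairs:
        if score < iou_thr:
            break  # remaining pairs overlap even less
        if i in used_pred or j in used_gt:
            continue  # each box may be matched at most once
        used_pred.add(i)
        used_gt.add(j)
        true_pos += 1

    if true_pos == 0:
        return 0.0
    precision = true_pos / len(pred)
    recall = true_pos / len(gt)
    return 2 * precision * recall / (precision + recall)
```

Under these assumptions, a model that finds one of two target plants with no false positives has precision 1.0 and recall 0.5, so the score rewards completeness as well as localization. Greedy matching is chosen only for brevity; an optimal assignment would be a reasonable alternative.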

Code & Models

Repositories

https://anonymous.4open.science/r/AgroVG-5172