Visual-Language-Guided Task Planning for Horticultural Robots
Jose Cuaran, Kendall Koe, Aditya Potnis, Naveen Kumar Uppalapati, Girish Chowdhary

TL;DR
This paper presents a modular visual-language-guided framework for robotic crop monitoring. The framework performs well on short-horizon tasks but degrades on long-horizon tasks and when it must rely on noisy semantic maps.
Contribution
Introduces a comprehensive benchmark and a novel VLM-based framework for high-level reasoning in crop monitoring robotics.
Findings
VLMs perform well on short-horizon tasks, comparable to humans.
Performance drops significantly on long-horizon tasks.
System struggles with noisy semantic maps, revealing limitations in current VLM grounding.
Abstract
Crop monitoring is essential for precision agriculture, but current systems lack high-level reasoning. We introduce a novel, modular framework that uses a Visual Language Model (VLM) to guide robotic task planning, interleaving input queries with action primitives. We contribute a comprehensive benchmark for short- and long-horizon crop monitoring tasks in monoculture and polyculture environments. Our main results show that VLMs perform robustly for short-horizon tasks (comparable to human success), but exhibit significant performance degradation in challenging long-horizon tasks. Critically, the system fails when relying on noisy semantic maps, demonstrating a key limitation in current VLM context grounding for sustained robotic operations. This work offers a deployable framework and critical insights into VLM capabilities and shortcomings for complex agricultural robotics.
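To make the idea of interleaving planner queries with action primitives more concrete, here is a minimal sketch of such a loop. It is an illustration under stated assumptions, not the authors' implementation: the primitive names, the `run_plan` function, and the scripted planner standing in for a VLM are all hypothetical and do not come from the paper.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class PlannerState:
    """Context handed to the planner on every step (hypothetical structure)."""
    query: str
    # Each entry is (action name, observation returned by that action).
    history: List[Tuple[str, str]] = field(default_factory=list)


def run_plan(
    query: str,
    planner: Callable[[PlannerState], str],
    primitives: Dict[str, Callable[[], str]],
    max_steps: int = 10,
) -> List[Tuple[str, str]]:
    """Alternate between asking the planner for an action and executing it."""
    state = PlannerState(query)
    for _ in range(max_steps):
        action = planner(state)            # planner picks the next primitive
        if action == "done":               # planner signals task completion
            break
        observation = primitives[action]() # execute the chosen primitive
        state.history.append((action, observation))
    return state.history


def scripted_planner(state: PlannerState) -> str:
    """Stand-in for a VLM call: follows a fixed script based on step count."""
    script = ["move_to_next_plant", "capture_image", "done"]
    return script[min(len(state.history), len(script) - 1)]


# Hypothetical primitives that just return a text observation.
PRIMITIVES: Dict[str, Callable[[], str]] = {
    "move_to_next_plant": lambda: "arrived at plant",
    "capture_image": lambda: "image captured",
}

history = run_plan("Inspect the next plant", scripted_planner, PRIMITIVES)
```

In a real system the planner would be a VLM call that receives the query, images, and the accumulated history. The `max_steps` cap is one simple guard for long-horizon tasks, where the abstract reports that performance degrades.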
Taxonomy
Topics: Multimodal Machine Learning Applications · Smart Agriculture and AI · Advanced Neural Network Applications
