Vision-Language Semantic Grounding for Multi-Domain Crop-Weed Segmentation
Nazia Hossain, Xintong Jiang, Yu Tian, Philippe Seguin, O. Grant Clark, Shangpeng Sun

TL;DR
This paper introduces VL-WS, a vision-language grounded segmentation framework that generalizes across diverse agricultural environments by leveraging semantic alignment and domain-invariant features, significantly improving weed segmentation accuracy.
Contribution
The novel VL-WS framework combines CLIP embeddings with spatial features using FiLM layers, enabling cross-domain generalization and label-efficient weed segmentation in precision agriculture.
Findings
Achieves 91.64% mean Dice score on benchmark datasets.
Outperforms the CNN baseline by 4.98% in Dice score.
Improves weed class segmentation, reaching 80.45% Dice score.
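The findings above are reported as Dice scores. As a minimal sketch of how a per-class and mean Dice score is typically computed from label maps, the snippet below uses the standard definition Dice = 2|A ∩ B| / (|A| + |B|). The function name `dice_per_class`, the three-class layout (0 = background, 1 = crop, 2 = weed), and the toy arrays are assumptions for illustration only; they are not taken from the paper, and the authors' evaluation code may differ.

```python
import numpy as np

def dice_per_class(pred, target, num_classes, eps=1e-7):
    """Per-class Dice between two integer label maps of the same shape."""
    scores = []
    for c in range(num_classes):
        p = pred == c                      # predicted mask for class c
        t = target == c                    # ground-truth mask for class c
        intersection = np.logical_and(p, t).sum()
        denom = p.sum() + t.sum()
        # eps keeps the ratio defined when a class is absent from both maps
        scores.append((2.0 * intersection + eps) / (denom + eps))
    return np.array(scores)

# Toy 2x2 label maps (hypothetical classes: 0 = background, 1 = crop, 2 = weed)
pred = np.array([[0, 1],
                 [2, 2]])
target = np.array([[0, 1],
                   [1, 2]])

per_class = dice_per_class(pred, target, num_classes=3)
mean_dice = per_class.mean()
print(per_class, mean_dice)
```

In this toy case the background class matches exactly, while the crop and weed classes each overlap on one pixel, so the mean is pulled down by the two harder classes. This is why a mean Dice figure and a weed-class Dice figure can differ substantially.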
Abstract
Fine-grained crop-weed segmentation is essential for enabling targeted herbicide application in precision agriculture. However, existing deep learning models struggle to generalize across heterogeneous agricultural environments due to reliance on dataset-specific visual features. We propose Vision-Language Weed Segmentation (VL-WS), a novel framework that addresses this limitation by grounding pixel-level segmentation in semantically aligned, domain-invariant representations. Our architecture employs a dual-encoder design, where frozen Contrastive Language-Image Pretraining (CLIP) embeddings and task-specific spatial features are fused and modulated via Feature-wise Linear Modulation (FiLM) layers conditioned on natural language captions. This design enables image-level textual descriptions to guide channel-wise feature refinement while preserving fine-grained spatial localization.…
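To illustrate the channel-wise modulation the abstract describes, here is a minimal NumPy sketch of a generic FiLM operation: a conditioning vector (standing in for a caption embedding) is projected to a per-channel scale and shift, which are then broadcast over the spatial dimensions of a feature map. The function name `film`, the projection weights, the 512-dimensional embedding, and all tensor sizes are assumptions chosen for illustration; the abstract does not specify the layer configuration, so this is not the authors' implementation.

```python
import numpy as np

def film(features, cond, w_gamma, b_gamma, w_beta, b_beta):
    """Generic FiLM: scale and shift each channel using a conditioning vector.

    features: (C, H, W) spatial feature map
    cond:     (D,) conditioning vector, e.g. a text embedding
    w_*, b_*: hypothetical linear projections from D to C
    """
    gamma = w_gamma @ cond + b_gamma   # (C,) per-channel scale
    beta = w_beta @ cond + b_beta      # (C,) per-channel shift
    # Broadcasting over H and W leaves the spatial layout untouched
    return gamma[:, None, None] * features + beta[:, None, None]

rng = np.random.default_rng(0)
C, H, W, D = 8, 16, 16, 512            # assumed sizes, for illustration only

features = rng.standard_normal((C, H, W))
text_embedding = rng.standard_normal(D)   # placeholder for a caption embedding

w_gamma = rng.standard_normal((C, D)) * 0.01
b_gamma = np.ones(C)                      # start near identity scaling
w_beta = rng.standard_normal((C, D)) * 0.01
b_beta = np.zeros(C)

out = film(features, text_embedding, w_gamma, b_gamma, w_beta, b_beta)
print(out.shape)
```

Because the scale and shift are constant across each channel's spatial grid, the conditioning changes how strongly each feature channel contributes without moving information between pixels, which is consistent with the abstract's point about preserving spatial localization.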
Taxonomy
Topics: Smart Agriculture and AI · Advanced Neural Network Applications · Remote Sensing in Agriculture
