A SAM-based Solution for Hierarchical Panoptic Segmentation of Crops and Weeds Competition
Khoa Dang Nguyen, Thanh-Hai Phung, Hoang-Giang Cao

TL;DR
This paper presents a hierarchical panoptic segmentation method for agriculture that prompts SAM with the outputs of the object detection models DINO and YOLO-v8, reaching a PQ+ score of 81.33 in the competition on crop and weed segmentation.
Contribution
The paper introduces a SAM-based approach integrated with DINO and YOLO-v8 for hierarchical panoptic segmentation in agriculture, demonstrating improved performance on the PhenoBench dataset.
Findings
Achieved a PQ+ score of 81.33 in the competition
Effectively combined SAM with object detection models
Enhanced segmentation accuracy for crops and weeds
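For context on the reported score, panoptic quality (PQ) in its standard form sums the IoU of matched predicted and ground-truth segments and divides by TP + FP/2 + FN/2. The sketch below computes that standard PQ for a single class on toy masks. The competition's PQ+ variant may be defined differently, and the function names and the set-of-pixels mask representation are illustrative assumptions, not the benchmark's evaluation code.

```python
def iou(a, b):
    """IoU of two masks, each given as a set of (row, col) pixels."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def panoptic_quality(pred, gt, thresh=0.5):
    """Standard PQ for one class.

    pred, gt: lists of non-overlapping masks (sets of pixels).
    A prediction matches a ground-truth segment when IoU > thresh.
    """
    matched_gt = set()
    iou_sum = 0.0
    tp = 0
    for p in pred:
        for j, g in enumerate(gt):
            if j in matched_gt:
                continue
            score = iou(p, g)
            if score > thresh:
                matched_gt.add(j)
                iou_sum += score
                tp += 1
                break
    fp = len(pred) - tp  # predictions with no matching ground truth
    fn = len(gt) - tp    # ground-truth segments that were missed
    denom = tp + 0.5 * fp + 0.5 * fn
    return iou_sum / denom if denom else 0.0


# Toy example (made-up masks, not competition data).
gt_masks = [{(0, 0), (0, 1), (1, 0), (1, 1)}, {(5, 5), (5, 6)}]
pred_masks = [{(0, 0), (0, 1), (1, 0)}, {(9, 9)}]
pq = panoptic_quality(pred_masks, gt_masks)
```

In this toy case one prediction matches with IoU 0.75, one prediction is spurious, and one ground-truth segment is missed, so both segmentation quality and detection errors lower the score.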
Abstract
Panoptic segmentation in agriculture is an advanced computer vision technique that provides a comprehensive understanding of field composition. It facilitates various tasks such as crop and weed segmentation, plant panoptic segmentation, and leaf instance segmentation, all aimed at addressing challenges in agriculture. To explore the application of panoptic segmentation in agriculture, the 8th Workshop on Computer Vision in Plant Phenotyping and Agriculture (CVPPA) hosted a challenge on hierarchical panoptic segmentation of crops and weeds using the PhenoBench dataset. To tackle the tasks presented in this competition, we propose an approach that combines the effectiveness of the Segment Anything Model (SAM) for instance segmentation with prompt input from object detection models. Specifically, we integrated two notable approaches in object detection, namely DINO and YOLO-v8. Our…
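The general idea of feeding detector output to a promptable segmenter can be sketched as follows. This is a minimal illustration of the data flow only, under stated assumptions: `Detection`, `box_prompted_mask`, and `build_instance_map` are hypothetical names, the segmenter is a toy stand-in rather than SAM, and nothing here reflects the authors' actual configuration of DINO, YOLO-v8, or SAM.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    box: tuple    # (x0, y0, x1, y1) in pixels, x1 and y1 exclusive
    label: str    # e.g. "crop" or "weed"
    score: float  # detector confidence


def box_prompted_mask(image, box):
    """Hypothetical stand-in for a box-prompted segmenter such as SAM.

    Here it just keeps non-zero pixels inside the box; a real model
    would predict the mask from image features and the box prompt.
    """
    x0, y0, x1, y1 = box
    return {(r, c) for r in range(y0, y1) for c in range(x0, x1)
            if image[r][c] > 0}


def build_instance_map(image, detections, min_score=0.5):
    """Turn detections into a non-overlapping instance-id map."""
    h, w = len(image), len(image[0])
    instance_map = [[0] * w for _ in range(h)]  # 0 means background
    labels = {}
    next_id = 1
    # Higher-confidence detections claim contested pixels first.
    for det in sorted(detections, key=lambda d: d.score, reverse=True):
        if det.score < min_score:
            continue
        mask = box_prompted_mask(image, det.box)
        free = [(r, c) for (r, c) in mask if instance_map[r][c] == 0]
        if not free:
            continue
        for r, c in free:
            instance_map[r][c] = next_id
        labels[next_id] = det.label
        next_id += 1
    return instance_map, labels


# Toy image and detections (made-up values for illustration).
image = [
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 2, 2],
    [0, 0, 0, 0, 2, 2],
    [0, 0, 0, 0, 0, 0],
]
detections = [
    Detection((1, 0, 3, 2), "crop", 0.9),
    Detection((4, 1, 6, 3), "weed", 0.8),
    Detection((0, 0, 2, 2), "weed", 0.3),  # below min_score, dropped
]
instance_map, labels = build_instance_map(image, detections)
```

Each kept detection contributes one instance id and one semantic label, which together give the per-pixel instance and class information that a panoptic output needs.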
Taxonomy
Topics: Smart Agriculture and AI · Remote Sensing in Agriculture
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Softmax · Residual Connection · Layer Normalization · Dense Connections · Vision Transformer · self-DIstillation with NO labels
