Track Any Peppers: Weakly Supervised Sweet Pepper Tracking Using VLMs

Jia Syuen Lim, Yadan Luo, Zhi Chen, Tianqi Wei, Scott Chapman, and Zi Huang

arXiv:2411.06702 · cs.CV · November 12, 2024

TL;DR

This paper introduces TAP, a weakly supervised method that uses foundation models to detect and track sweet peppers in videos with minimal manual labeling, reaching a HOTA score of 80.4% and a MOTA of 66.1%.

Contribution

We propose TAP, a novel ensemble approach combining vision-language models and traditional tracking algorithms for efficient agricultural object tracking.

Findings

1. Achieved a HOTA score of 80.4% in sweet pepper tracking
2. Reduced manual labeling through pseudo-label generation
3. Enhanced detection accuracy with relighting and depth filtering

Abstract

In the Detection and Multi-Object Tracking of Sweet Peppers Challenge, we present Track Any Peppers (TAP), a weakly supervised ensemble technique for sweet pepper tracking. TAP leverages the zero-shot detection capabilities of vision-language foundation models such as Grounding DINO to automatically generate pseudo-labels for sweet peppers in video sequences with minimal human intervention. These pseudo-labels, refined when necessary, are used to train a YOLOv8 segmentation network. To enhance detection accuracy under challenging conditions, we incorporate pre-processing techniques such as relighting adjustments and apply depth-based filtering after inference. For object tracking, we integrate the Matching by Segment Anything (MASA) adapter with the BoT-SORT algorithm. Our approach achieves a HOTA score of 80.4%, MOTA of 66.1%, Recall of 74.0%, and Precision of 90.7%, demonstrating…
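The abstract does not say how the depth-based filtering is implemented. As a minimal sketch, assuming it discards detections whose median depth lies beyond a distance threshold (for example, peppers in a background row), the step might look like the following. The function names, the box format, and the `max_depth_m` parameter are hypothetical and are not taken from the paper.

```python
from statistics import median
from typing import List, Optional, Sequence, Tuple

# Hypothetical box format: (x1, y1, x2, y2) in pixels, x2 and y2 exclusive.
Box = Tuple[int, int, int, int]


def median_depth(depth_map: Sequence[Sequence[float]], box: Box) -> Optional[float]:
    """Return the median of the valid (positive) depth values inside a box.

    Returns None when the box contains no valid depth reading.
    """
    x1, y1, x2, y2 = box
    values = [
        depth_map[y][x]
        for y in range(y1, y2)
        for x in range(x1, x2)
        if depth_map[y][x] > 0  # assumption: zero or negative means "no reading"
    ]
    return median(values) if values else None


def filter_by_depth(
    boxes: Sequence[Box],
    depth_map: Sequence[Sequence[float]],
    max_depth_m: float,
) -> List[Box]:
    """Keep only detections whose median depth is at most max_depth_m.

    Assumption: detections without any valid depth reading are dropped.
    """
    kept = []
    for box in boxes:
        depth = median_depth(depth_map, box)
        if depth is not None and depth <= max_depth_m:
            kept.append(box)
    return kept


if __name__ == "__main__":
    # Toy 4x4 depth map in metres: left half near, right half far.
    toy_depth = [[0.8, 0.8, 2.5, 2.5] for _ in range(4)]
    detections = [(0, 0, 2, 4), (2, 0, 4, 4)]
    print(filter_by_depth(detections, toy_depth, max_depth_m=1.5))
```

The median is used here rather than the mean so that a few background pixels inside a box do not dominate the estimate; whether TAP makes the same choice is not stated in the abstract.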

Taxonomy

Topics: Plant Pathogens and Fungal Diseases · Advanced Chemical Sensor Technologies · Insect Pheromone Research and Control

Methods: Attention Is All You Need · Softmax · Linear Layer · Dense Connections · Layer Normalization · Multi-Head Attention · Residual Connection · Vision Transformer · You Only Look Once · Adapter