Zero-Shot Automatic Annotation and Instance Segmentation using LLM-Generated Datasets: Eliminating Field Imaging and Manual Annotation for Deep Learning Model Development

Ranjan Sapkota; Achyut Paudel; Manoj Karkee

arXiv:2411.11285 · cs.CV · March 3, 2025 · 5 cites

TL;DR

This paper introduces an approach that uses a large language model to generate synthetic orchard images, which are then annotated automatically, enabling effective deep learning-based apple segmentation without field data collection or manual labeling.

Contribution

The study presents a new method combining LLMs, SAM, and YOLO11 to create synthetic datasets for training apple segmentation models, eliminating the need for physical data collection and manual annotation.

Findings

01. The automatically generated mask annotations achieved a Dice coefficient of 0.9513 and an IoU of 0.9303.

02. YOLO11 models trained solely on synthetic data accurately recognized apples in real orchard images.

03. The best configuration outperformed the other models tested, with a mask precision of 0.902 and an mAP@50 of 0.833.
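
For readers unfamiliar with the two overlap metrics quoted in the first finding, the sketch below shows how a Dice coefficient and an IoU are typically computed for one pair of binary masks. It is an illustration only, not the authors' evaluation code: the function name `mask_overlap`, the nested-list mask representation, and the toy masks are all assumptions made for this example, and the paper's reported values are presumably aggregated over many images.

```python
def mask_overlap(pred, truth):
    """Return (dice, iou) for two same-shaped binary masks.

    Masks are nested lists of 0/1 values (a hypothetical representation
    chosen to keep this example dependency-free).
    """
    if len(pred) != len(truth) or any(len(p) != len(t) for p, t in zip(pred, truth)):
        raise ValueError("masks must have the same shape")

    inter = pred_area = truth_area = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            p, t = bool(p), bool(t)
            inter += p and t      # pixel set in both masks
            pred_area += p        # pixel set in the predicted mask
            truth_area += t       # pixel set in the reference mask

    union = pred_area + truth_area - inter
    if union == 0:
        # Both masks are empty; treating this as perfect agreement is a convention.
        return 1.0, 1.0

    dice = 2 * inter / (pred_area + truth_area)
    iou = inter / union
    return dice, iou


# Toy masks: the prediction misses one pixel of the reference.
predicted = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
reference = [
    [0, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]

dice, iou = mask_overlap(predicted, reference)
print(f"Dice={dice:.4f} IoU={iou:.4f}")
```

Dice counts the intersection twice relative to the summed mask areas, while IoU divides the intersection by the union, so for the same pair of masks Dice is never lower than IoU.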

Abstract

Currently, deep learning-based instance segmentation for various applications (e.g., agriculture) predominantly relies on a labor-intensive process: extensive field data collection with sophisticated sensors, followed by careful manual annotation of images. This presents significant logistical and financial challenges to researchers and organizations, and it slows model development and training. In this study, we present a novel method for deep learning-based instance segmentation of apples in commercial orchards that eliminates the need for labor-intensive field data collection and manual annotation. Using a Large Language Model (LLM), we synthetically generated orchard images and automatically annotated them using the Segment Anything Model (SAM) integrated with a YOLO11 base model. This method significantly reduces reliance on physical…

Taxonomy

Topics: Image Processing and 3D Reconstruction

Methods: Balanced Selection · Segment Anything Model