Sparse Data Tree Canopy Segmentation: Fine-Tuning Leading Pretrained Models on Only 150 Images

David Szczecina; Hudson Sun; Anthony Bertnyk; Niloofar Azad; Kyle Gao; Lincoln Linlin Xu

arXiv:2601.10931 · cs.CV · May 6, 2026

TL;DR

This study evaluates five pretrained models for tree canopy segmentation on a dataset of only 150 annotated images and finds that CNN-based models outperform transformer-based ones in this low-data setting.

Contribution

It provides a comparative analysis of five architectures, emphasizing the effectiveness of pretrained CNNs like YOLOv11 and Mask R-CNN for small-data canopy segmentation tasks.

Findings

01

Pretrained CNN models outperform transformer-based models in low-data regimes.

02

YOLOv11 and Mask R-CNN generalize better than DeepLabv3, Swin-UNet, and DINOv2.

03

Transformer models require more data or pretraining to perform well in segmentation tasks.

Abstract

Tree canopy detection from aerial imagery is an important task for environmental monitoring, urban planning, and ecosystem analysis. Simulating real-life data annotation scarcity, the Solafune Tree Canopy Detection competition provides a small and imbalanced dataset of only 150 annotated images, posing significant challenges for training deep models without severe overfitting. In this work, we evaluate five representative architectures, YOLOv11, Mask R-CNN, DeepLabv3, Swin-UNet, and DINOv2, to assess their suitability for canopy segmentation under extreme data scarcity. Our experiments show that pretrained convolution-based models, particularly YOLOv11 and Mask R-CNN, generalize significantly better than pretrained transformer-based models. DeepLabv3, Swin-UNet, and DINOv2 underperform, likely due to differences between semantic and instance segmentation tasks, the high data requirements…
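With only 150 annotated images, a single held-out split gives a noisy estimate of how well a model generalizes. The sketch below illustrates one common remedy, k-fold cross-validation, in plain Python. It is an illustration only: the abstract does not describe the authors' validation protocol, and the function name `k_fold_splits`, the `img_###` identifiers, and the choice of five folds are all assumptions made for this example.

```python
import random


def k_fold_splits(image_ids, k=5, seed=0):
    """Yield (train_ids, val_ids) pairs; each image is in validation exactly once.

    Hypothetical helper for illustration, not the paper's protocol.
    """
    ids = list(image_ids)
    # Shuffle with a fixed seed so the folds are reproducible.
    random.Random(seed).shuffle(ids)
    # Deal the shuffled IDs into k folds round-robin.
    folds = [ids[i::k] for i in range(k)]
    for i in range(k):
        val_ids = folds[i]
        train_ids = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train_ids, val_ids


# Placeholder IDs standing in for the 150 annotated images (assumed naming).
image_ids = [f"img_{n:03d}" for n in range(150)]
splits = list(k_fold_splits(image_ids, k=5))

for fold_index, (train_ids, val_ids) in enumerate(splits):
    print(f"fold {fold_index}: {len(train_ids)} train / {len(val_ids)} val")
```

Each candidate model would be fine-tuned once per fold and its validation scores averaged, so that every one of the 150 images contributes to the estimate without being seen during the corresponding training run.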
