Leveraging Image Generators to Address Training Data Scarcity: The Gen4Regen Dataset for Forest Regeneration Mapping
Gabriel Jeanson, David-Alexandre Duclos, William Larrivée-Hardy, Noé Cochet, Matěj Boxan, Anthony Deschênes, François Pomerleau, Philippe Giguère

TL;DR
This paper presents a scalable framework using AI-generated images and masks to improve forest species mapping, significantly reducing reliance on manual annotations and addressing data scarcity.
Contribution
The paper introduces the Gen4Regen dataset and demonstrates that combining AI-generated data with real data improves segmentation performance in forest regeneration mapping.
Findings
Unified training with real and synthetic data improves F1 scores by over 15%.
Prompt-generated data significantly boosts performance for underrepresented species.
Vision-language models can effectively generate high-fidelity images and masks for niche domains.
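The findings above are stated in terms of F1 scores and underrepresented species, so a small illustration of how per-class F1 behaves on an imbalanced segmentation mask may help. The sketch below is not the paper's evaluation code: the function names, the toy 2x4 masks, and the three class ids (0 = background, 1 = a common species, 2 = a rare species) are all assumptions made for illustration.

```python
from collections import Counter


def per_class_f1(true_mask, pred_mask, num_classes):
    """Return one F1 score per class id, computed pixel-wise over two label masks.

    Both masks are lists of rows of integer class ids with identical shape.
    A class with no true pixels and no predicted pixels gets a score of 0.0.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for true_row, pred_row in zip(true_mask, pred_mask):
        for t, p in zip(true_row, pred_row):
            if t == p:
                tp[t] += 1
            else:
                fp[p] += 1  # predicted class p where it does not belong
                fn[t] += 1  # missed a pixel of the true class t
    scores = []
    for c in range(num_classes):
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return scores


def macro_f1(scores):
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    return sum(scores) / len(scores)


# Hypothetical toy masks: the single rare-species pixel (class 2) is
# misclassified as the common species (class 1).
true_mask = [[0, 0, 1, 1],
             [0, 1, 1, 2]]
pred_mask = [[0, 0, 1, 1],
             [0, 1, 1, 1]]

scores = per_class_f1(true_mask, pred_mask, num_classes=3)
print("per-class F1:", [round(s, 3) for s in scores])
print("macro F1:", round(macro_f1(scores), 3))
```

In this toy case 7 of 8 pixels are correct, yet the rare class scores an F1 of zero and pulls the macro average down sharply. This is one reason why additional training examples for underrepresented classes can move an averaged F1 noticeably, although the exact averaging used in the paper is not specified in this summary.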
Abstract
Sustainable forest management relies on precise species composition mapping, yet traditional ground surveys are labour-intensive and geographically constrained. While Uncrewed Aerial Vehicles (UAVs) offer scalable data collection, the transition to deep learning-based interpretation is bottlenecked by the severe scarcity of expert-annotated imagery, particularly in complex, visually heterogeneous regeneration zones. This paper addresses the dual challenges of data scarcity and extreme class imbalance in the semantic segmentation of fine-grained forest regeneration species by providing a scalable framework that reduces reliance on manual photo-interpretation for high-resolution, millimetre-level aerial imagery. Importantly, we leverage the large-scale vision-language Nano Banana Pro model to simultaneously generate high-fidelity images and their corresponding pixel-aligned semantic masks…
