Not All Birds Look The Same: Identity-Preserving Generation For Birds

Aaron Sun; Oindrila Saha; Subhransu Maji

arXiv:2512.04485 · cs.CV · April 2, 2026

TL;DR

This paper introduces a bird dataset and benchmark for evaluating identity-preserving image generation. It shows that current models struggle to preserve identity in this domain, and that training on images grouped by species, age, and sex improves results.

Contribution

The paper contributes NABLA, a dataset and benchmark for evaluating identity preservation in bird image generation, addressing a gap in the modeling of fine-grained, non-rigid objects.

Findings

01 State-of-the-art models fail to preserve identity on the NABLA dataset.

02 Training on images grouped by species, age, and sex improves identity preservation.

03 The benchmark enables better evaluation of fine-grained, identity-preserving generation methods.
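
The grouping idea in finding 02 can be illustrated with a small sketch. This summary does not describe the paper's actual training pipeline, so everything below is an assumption made for illustration: the record fields (`species`, `age`, `sex`, `path`), the file names, and the helper names `group_by_attributes` and `sample_pairs` are all hypothetical. The sketch only shows one plausible way to bucket images by (species, age, sex) and draw reference/target pairs from within each bucket.

```python
from collections import defaultdict
from itertools import combinations
import random


def group_by_attributes(records):
    """Bucket image paths by their (species, age, sex) triple."""
    groups = defaultdict(list)
    for rec in records:
        key = (rec["species"], rec["age"], rec["sex"])
        groups[key].append(rec["path"])
    return dict(groups)


def sample_pairs(groups, pairs_per_group=2, seed=0):
    """Draw up to `pairs_per_group` (reference, target) pairs per group.

    Groups with fewer than two images cannot form a pair and are skipped.
    """
    rng = random.Random(seed)
    pairs = []
    for key, paths in groups.items():
        if len(paths) < 2:
            continue
        candidates = list(combinations(sorted(paths), 2))
        rng.shuffle(candidates)
        for ref, tgt in candidates[:pairs_per_group]:
            pairs.append((key, ref, tgt))
    return pairs


# Hypothetical metadata; not taken from the NABLA dataset.
records = [
    {"path": "a.jpg", "species": "northern_cardinal", "age": "adult", "sex": "male"},
    {"path": "b.jpg", "species": "northern_cardinal", "age": "adult", "sex": "male"},
    {"path": "c.jpg", "species": "northern_cardinal", "age": "adult", "sex": "male"},
    {"path": "d.jpg", "species": "northern_cardinal", "age": "adult", "sex": "female"},
    {"path": "e.jpg", "species": "blue_jay", "age": "juvenile", "sex": "unknown"},
    {"path": "f.jpg", "species": "blue_jay", "age": "juvenile", "sex": "unknown"},
]

groups = group_by_attributes(records)
pairs = sample_pairs(groups)
for key, ref, tgt in pairs:
    print(key, ref, "->", tgt)
```

Each sampled pair shares species, age, and sex, so a model trained on such pairs would see a reference and a target with the same expected appearance. Whether the paper pairs images this way, or uses the groups differently, is not stated in this summary.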

Abstract

Since the advent of controllable image generation, increasingly rich modes of control have enabled greater customization and accessibility for everyday users. Zero-shot, identity-preserving models such as Insert Anything and OminiControl now support applications like virtual try-on without requiring additional fine-tuning. While these models may be fitting for humans and rigid everyday objects, they still have limitations for non-rigid or fine-grained categories. These domains often lack accessible, high-quality data -- especially videos or multi-view observations of the same subject -- making them difficult both to evaluate and to improve upon. Yet, such domains are essential for moving beyond content creation toward applications that demand accuracy and fine detail. Birds are an excellent domain for this task: they exhibit high diversity, require fine-grained cues for identification,…
