The Herbarium 2021 Half-Earth Challenge Dataset
Riccardo de Lutio, Damon Little, Barbara Ambrose, Serge Belongie

TL;DR
The Herbarium 2021 Half-Earth Challenge Dataset is presented as the largest and most diverse dataset of digitized herbarium specimens to date for automatic taxon recognition, with the aim of supporting botanical research.
Contribution
This paper introduces the Herbarium Half-Earth dataset for taxon recognition. It addresses two limitations of previous datasets: small size and limited diversity in represented taxa, geographic distribution, and host institutions.
Findings
Largest herbarium dataset to date
Enhanced diversity in taxa and geographic representation
Facilitates automatic plant identification research
Abstract
Herbarium sheets present a unique view of the world's botanical history, evolution, and diversity, which makes them an essential data source for botanical research. The increasing digitisation of herbaria worldwide, together with advances in fine-grained classification that can facilitate automatic identification of herbarium specimens, creates many opportunities to support research in this field. However, existing datasets are either too small or not diverse enough in terms of represented taxa, geographic distribution, or host institutions. Furthermore, aggregating multiple datasets is difficult because taxa exist under a multitude of different names, and the taxonomy requires alignment to a common reference. We present the Herbarium Half-Earth dataset, the largest and most diverse dataset of herbarium specimens to date for automatic taxon recognition.
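The name-alignment problem mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the synonym table, the record format, and the function names `normalize` and `align` are all assumptions made up for illustration. It only shows the general idea of mapping differently spelled or synonymous names from several sources onto one accepted name before merging specimens.

```python
from collections import defaultdict

# Hypothetical synonym table (illustrative only): each known name maps to
# the accepted name in a chosen common reference taxonomy.
SYNONYMS = {
    "Acer saccharum": "Acer saccharum",
    "Acer saccharophorum": "Acer saccharum",
    "Quercus rubra": "Quercus rubra",
    "Quercus borealis": "Quercus rubra",
}


def normalize(name):
    """Collapse whitespace and standardize case as 'Genus epithet'."""
    parts = name.split()
    if not parts:
        return ""
    return " ".join([parts[0].capitalize()] + [p.lower() for p in parts[1:]])


def align(records, synonyms):
    """Group specimen IDs under accepted names.

    `records` is an iterable of (taxon_name, specimen_id) pairs, an assumed
    format. Names missing from the synonym table are returned separately
    rather than guessed.
    """
    merged = defaultdict(list)
    unresolved = []
    for name, specimen_id in records:
        accepted = synonyms.get(normalize(name))
        if accepted is None:
            unresolved.append((name, specimen_id))
        else:
            merged[accepted].append(specimen_id)
    return dict(merged), unresolved


# Records as they might arrive from three different institutions.
records = [
    ("acer  SACCHAROPHORUM", "A1"),
    ("Acer saccharum", "B7"),
    ("Quercus borealis", "A2"),
    ("Unknownus plantus", "C3"),
]
merged, unresolved = align(records, SYNONYMS)
```

In this sketch, unmatched names are set aside for manual review instead of being matched fuzzily, since a wrong merge would silently corrupt class labels.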
Taxonomy
Topics: Species Distribution and Climate Change · Plant Pathogens and Fungal Diseases · Genomics and Phylogenetic Studies
