Automated dataset generation for image recognition using the example of taxonomy
Jaro Milan Zink

TL;DR
This thesis presents a method for automatically generating image recognition datasets using AI, demonstrated through taxonomic classification, showing it to be a viable and efficient alternative to manual data collection.
Contribution
It introduces a prototype that automates dataset creation with AI filtering, validated by comparison with manual datasets, and discusses extending this approach for detailed taxonomic classification.
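The AI filtering step mentioned above can be illustrated with a minimal sketch. It is not the thesis's implementation: `Candidate`, `filter_candidates`, `score_fn`, the threshold value, and the toy scores are all hypothetical names and numbers chosen for illustration, and `score_fn` merely stands in for whatever pre-trained model rates how likely an image is to show the desired content.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Candidate:
    """A downloaded image candidate for one taxon (hypothetical structure)."""
    path: str
    taxon: str


def filter_candidates(
    candidates: Iterable[Candidate],
    score_fn: Callable[[Candidate], float],
    threshold: float = 0.5,
) -> List[Candidate]:
    """Keep only candidates whose relevance score reaches the threshold.

    score_fn is an assumed placeholder for a pre-trained model that returns
    a score in [0, 1]; the thesis's actual model and cutoff are not shown here.
    """
    return [c for c in candidates if score_fn(c) >= threshold]


# Toy stand-in for the model: scores are read from a fixed, made-up table.
toy_scores = {"a.jpg": 0.92, "b.jpg": 0.10, "c.jpg": 0.75}
candidates = [Candidate(path, "Apis mellifera") for path in toy_scores]

kept = filter_candidates(candidates, lambda c: toy_scores[c.path], threshold=0.7)
kept_paths = [c.path for c in kept]
```

Passing the scoring function as a parameter keeps the filter independent of any particular model, so a real classifier could be swapped in without changing the filtering logic.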
Findings
Automated dataset generation is feasible and accurate.
The dataset produced by the prototype outperforms manually collected datasets in specifications and accuracy.
Potential for scalable, AI-driven taxonomic classification systems.
Abstract
This master's thesis addresses the automatic generation of datasets for image recognition, a task that is very time-consuming when done manually. Because the thesis was motivated by the work of the biodiversity workgroup at the City University of Applied Sciences Bremen, the classification of taxonomic entries was chosen as an exemplary use case. To automate dataset creation, a prototype was designed and implemented after establishing the necessary background knowledge and analyzing its requirements. It uses a pre-trained abstract artificial intelligence that can sort out images that do not contain the desired content. Following the implementation and the resulting automated dataset creation, an evaluation was performed. Other, manually collected datasets were compared to the one the prototype produced in terms of specifications and…
Taxonomy
Topics: Species Distribution and Climate Change · Biomedical Text Mining and Ontologies · Data Analysis with R
