Count, Crop and Recognise: Fine-Grained Recognition in the Wild

Max Bain; Arsha Nagrani; Daniel Schofield; Andrew Zisserman

arXiv:1909.08950·cs.CV·October 10, 2019

TL;DR

This paper presents Count, Crop and Recognise (CCR), a multistage CNN-based approach for labelling every individual animal in each frame of a video, including individuals whose faces are not visible. It also introduces a new dataset of chimpanzees in the wild and applies a visualisation technique to the learned features.

Contribution

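The paper introduces the CCR multistage recognition process for frame-level labelling, compares the recall of frame-based labelling against face-track and body-track based labelling, provides a new dataset for chimpanzee recognition in the wild, and applies a high-granularity visualisation technique to the learned features.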

Findings

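1. Staging recognition as Count, Crop and Recognise gives a substantial boost in performance.

2. Frame-based labelling with CCR achieves better recall than face-track or body-track based labelling when the goal is to label every individual in every frame.

3. A new dataset supports research on chimpanzee recognition in the wild.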

Abstract

The goal of this paper is to label all the animal individuals present in every frame of a video. Unlike previous methods that have principally concentrated on labelling face tracks, we aim to label individuals even when their faces are not visible. We make the following contributions: (i) we introduce a 'Count, Crop and Recognise' (CCR) multistage recognition process for frame level labelling. The Count and Recognise stages involve specialised CNNs for the task, and we show that this simple staging gives a substantial boost in performance; (ii) we compare the recall using frame based labelling to both face and body track based labelling, and demonstrate the advantage of frame based with CCR for the specified goal; (iii) we introduce a new dataset for chimpanzee recognition in the wild; and (iv) we apply a high-granularity visualisation technique to further understand the learned CNN…
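The staged structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract only says that the Count and Recognise stages use specialised CNNs, so each stage here is a plain callable, and the `toy_*` stand-ins, the `Box` format, the decision to pass the count on to later stages, and the `frame_recall` definition are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Set, Tuple

Frame = List[List[int]]          # toy grayscale frame: rows of pixel values
Box = Tuple[int, int, int, int]  # hypothetical crop format: (top, left, bottom, right)


def ccr_label_frame(
    frame: Frame,
    count_fn: Callable[[Frame], int],
    crop_fn: Callable[[Frame, int], Box],
    recognise_fn: Callable[[Frame, int], Set[str]],
) -> Set[str]:
    """Label one frame by running Count, Crop and Recognise in order."""
    n = count_fn(frame)                           # Count: individuals in the frame
    if n == 0:
        return set()                              # nothing to crop or recognise
    top, left, bottom, right = crop_fn(frame, n)  # Crop: region containing them
    crop = [row[left:right] for row in frame[top:bottom]]
    return recognise_fn(crop, n)                  # Recognise: identities in the crop


def frame_recall(predicted: Sequence[Set[str]], truth: Sequence[Set[str]]) -> float:
    """Fraction of ground-truth (frame, individual) pairs that were predicted."""
    total = sum(len(t) for t in truth)
    hits = sum(len(p & t) for p, t in zip(predicted, truth))
    return hits / total if total else 0.0


# Hypothetical stand-ins for the CNN stages, for demonstration only.
def toy_count(frame: Frame) -> int:
    # Treat any non-zero pixel as evidence of exactly one individual.
    return 1 if any(v for row in frame for v in row) else 0


def toy_crop(frame: Frame, n: int) -> Box:
    # Tight bounding box around all non-zero pixels.
    rows = [i for i, row in enumerate(frame) if any(row)]
    cols = [j for row in frame for j, v in enumerate(row) if v]
    return min(rows), min(cols), max(rows) + 1, max(cols) + 1


def toy_recognise(crop: Frame, n: int) -> Set[str]:
    # Always answers with a single placeholder identity.
    return {"individual_A"}


video = [
    [[0, 0, 0], [0, 5, 0], [0, 0, 0]],  # one bright pixel: one individual
    [[0, 0, 0], [0, 0, 0], [0, 0, 0]],  # empty frame
]
labels = [ccr_label_frame(f, toy_count, toy_crop, toy_recognise) for f in video]
```

Keeping the stages as separate callables means any one of them can be swapped, for example replacing `toy_count` with a trained counting model, without touching the others. `frame_recall` scores the result per frame rather than per track, which matches the stated goal of labelling every individual in every frame.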
