Towards Building a Real Time Mobile Device Bird Counting System Through Synthetic Data Training and Model Compression
Runde Yang

TL;DR
This paper presents a real-time bird counting system for mobile devices, built with synthetic data training and model compression. It reaches an MSE of approximately 12.4 on a real bird dataset and shrinks the model from 55 MB to under 5 MB.
Contribution
It introduces a synthetic data generation method for bird counting and applies model compression to enable real-time mobile deployment.
Findings
Achieved an MSE of approximately 12.4 on a real bird dataset
Reduced model size from 55 MB to under 5 MB with minimal accuracy loss
Demonstrated effective density map estimation for bird counting
Abstract
Counting the number of birds in an open sky setting has been a challenging problem because flocks can contain large numbers of birds and the birds can overlap. Another difficulty is the lack of accurate training samples: the cost of labeling images of bird flocks can be extremely high, since a single high-resolution image can contain thousands of birds. Inspired by recent work on training with synthetic data to perform crowd counting, we design a mechanism to generate a synthetic bird dataset with precise bird counts and the corresponding density maps. We then train a U-Net model on the synthetic dataset to perform density map estimation, which produces the count for each input. Our method achieves an MSE of approximately 12.4 on a real dataset. In order to build a scalable system for fast bird counting under storage and computational constraints, we use model compression…
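To make the link between density maps and counts concrete, here is a minimal sketch of how a ground-truth density map could be built from known bird positions, as is common in crowd counting. It is not the paper's generator: the function names (`gaussian_kernel`, `density_map`, `count_from_density`), the fixed `sigma`, and the border renormalisation are all assumptions chosen for illustration. Each bird contributes a small Gaussian blob whose mass is exactly 1, so summing the map recovers the count.

```python
import math


def gaussian_kernel(radius, sigma):
    """Unnormalised 2D Gaussian weights on a (2*radius+1) x (2*radius+1) grid."""
    return [
        [math.exp(-(dy * dy + dx * dx) / (2.0 * sigma * sigma))
         for dx in range(-radius, radius + 1)]
        for dy in range(-radius, radius + 1)
    ]


def density_map(points, height, width, sigma=2.0):
    """Build a density map from (row, col) bird positions.

    Hypothetical helper: every point adds a Gaussian blob that is
    renormalised over its in-bounds pixels, so each bird adds mass 1
    even when it sits at the image border.
    """
    radius = int(3 * sigma)
    kernel = gaussian_kernel(radius, sigma)
    density = [[0.0] * width for _ in range(height)]
    for row, col in points:
        # Keep only the kernel cells that fall inside the image.
        cells = []
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                r, c = row + dy, col + dx
                if 0 <= r < height and 0 <= c < width:
                    cells.append((r, c, kernel[dy + radius][dx + radius]))
        total = sum(w for _, _, w in cells)
        for r, c, w in cells:
            density[r][c] += w / total
    return density


def count_from_density(density):
    """The estimated count is the sum over all pixels of the density map."""
    return sum(sum(row) for row in density)


if __name__ == "__main__":
    birds = [(0, 0), (10, 20), (31, 63)]  # synthetic positions (assumed)
    dmap = density_map(birds, height=32, width=64)
    print(round(count_from_density(dmap), 6))
```

A density-estimation network is trained to regress maps like this from images. At inference time the same summation is applied to the predicted map. In practice an array library would replace the nested lists; plain Python is used here only to keep the sketch dependency-free.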
Taxonomy
Topics: Animal Vocal Communication and Behavior · Video Surveillance and Tracking Methods · Music and Audio Processing
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
