Face Density as a Proxy for Data Complexity: Quantifying the Hardness of Instance Count
Abolfazl Mohammadi-Seif, Ricardo Baeza-Yates

TL;DR
This paper quantifies how higher face density makes images intrinsically harder for models: performance degrades across classification, regression, and detection tasks, and models trained on low-density images generalize poorly to high-density ones.
Contribution
It introduces face density as a measurable factor of data complexity, demonstrating its impact on model performance and generalization across various computer vision tasks.
Findings
Model performance decreases monotonically with higher face counts.
Models trained on low-density data struggle to generalize to high-density images.
Density acts as a domain shift, raising error rates by up to a factor of 4.6.
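The findings above can be checked on any set of per-image predictions. The following is a minimal sketch, not the authors' evaluation code: the function names, the `(face_count, is_correct)` record layout, and the use of plain error rate as the metric are all assumptions made for illustration.

```python
def error_by_density(results):
    """Group (face_count, is_correct) pairs and return error rate per face count.

    The record layout is a hypothetical choice for this sketch.
    """
    totals, errors = {}, {}
    for face_count, is_correct in results:
        totals[face_count] = totals.get(face_count, 0) + 1
        errors[face_count] = errors.get(face_count, 0) + (0 if is_correct else 1)
    return {c: errors[c] / totals[c] for c in sorted(totals)}


def error_is_non_decreasing(rates):
    """True if the error rate never drops as face count increases."""
    values = [rates[c] for c in sorted(rates)]
    return all(a <= b for a, b in zip(values, values[1:]))


def error_ratio(shifted_error, in_domain_error):
    """Factor by which error grows under a density shift (shifted / in-domain)."""
    if in_domain_error <= 0:
        raise ValueError("in-domain error must be positive")
    return shifted_error / in_domain_error
```

A ratio computed this way is what a statement such as "error rates rise by a factor of k" refers to; any numbers fed to these helpers here would be toy values, not results from the paper.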
Abstract
Machine learning progress has historically prioritized model-centric innovations, yet achievable performance is frequently capped by the intrinsic complexity of the data itself. In this work, we isolate and quantify the impact of instance density (measured by face count) as a primary driver of data complexity. Rather than simply observing that "crowded scenes are harder," we rigorously control for class imbalance to measure the precise degradation caused by density alone. Controlled experiments on the WIDER FACE and Open Images datasets, restricted to exactly 1 to 18 faces per image with perfectly balanced sampling, reveal that model performance degrades monotonically with increasing face count. This trend holds across classification, regression, and detection paradigms, even when models are fully exposed to the entire density range. Furthermore, we demonstrate that models trained…
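The balanced-sampling setup the abstract describes (1 to 18 faces per image, equal representation of each face count) might look roughly like the sketch below. It assumes annotations have already been reduced to `(image_id, face_count)` pairs; the function name, parameters, and the downsample-to-smallest-bin strategy are illustrative assumptions, not the paper's published procedure.

```python
import random
from collections import defaultdict


def balanced_sample(records, min_faces=1, max_faces=18, per_bin=None, seed=0):
    """Return {face_count: [image_id, ...]} with the same number of images per count.

    records: iterable of (image_id, face_count) pairs (hypothetical layout).
    per_bin: optional cap on images per face count.
    """
    rng = random.Random(seed)
    bins = defaultdict(list)
    for image_id, face_count in records:
        # Keep only images inside the controlled density range.
        if min_faces <= face_count <= max_faces:
            bins[face_count].append(image_id)
    if not bins:
        return {}
    # The smallest bin limits how many images every bin can contribute.
    n = min(len(ids) for ids in bins.values())
    if per_bin is not None:
        n = min(n, per_bin)
    return {count: rng.sample(ids, n) for count, ids in sorted(bins.items())}
```

Downsampling every bin to the size of the smallest one removes class imbalance at the cost of discarding data. Face counts with no images at all are simply absent from the result, so a caller that needs the full 1 to 18 range should verify that all 18 keys are present.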
