Face Density as a Proxy for Data Complexity: Quantifying the Hardness of Instance Count

Abolfazl Mohammadi-Seif; Ricardo Baeza-Yates

arXiv:2604.09689·cs.CV·April 15, 2026

TL;DR

This paper quantifies how higher face density makes images intrinsically harder for models, degrading performance across tasks and exposing generalization failures when density shifts between training and test data.

Contribution

The paper introduces face density as a measurable driver of data complexity and demonstrates its effect on model performance and generalization across classification, regression, and detection tasks.

Findings

1. Model performance decreases monotonically as face count increases.

2. Models trained on low-density data struggle to generalize to high-density images.

3. Density acts as a domain shift, increasing error rates by up to 4.6 times.

Abstract

Machine learning progress has historically prioritized model-centric innovations, yet achievable performance is frequently capped by the intrinsic complexity of the data itself. In this work, we isolate and quantify the impact of instance density (measured by face count) as a primary driver of data complexity. Rather than simply observing that "crowded scenes are harder," we rigorously control for class imbalance to measure the precise degradation caused by density alone. Controlled experiments on the WIDER FACE and Open Images datasets, restricted to exactly 1 to 18 faces per image with perfectly balanced sampling, reveal that model performance degrades monotonically with increasing face count. This trend holds across classification, regression, and detection paradigms, even when models are fully exposed to the entire density range. Furthermore, we demonstrate that models trained…
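
The balanced sampling described in the abstract (images restricted to 1 to 18 faces, with every face count equally represented) could be sketched roughly as follows. This is an illustrative sketch, not the authors' pipeline: the function name `balanced_by_face_count`, the `(image_id, face_count)` input format, and the choice to downsample every bucket to the size of the smallest one are all assumptions made for the example.

```python
import random
from collections import defaultdict


def balanced_by_face_count(samples, min_faces=1, max_faces=18, seed=0):
    """Return (image_id, face_count) pairs with equal images per face count.

    `samples` is an iterable of (image_id, face_count) pairs. This input
    format is hypothetical; real dataset annotations would need to be
    converted into it first.
    """
    # Group image ids by face count, dropping images outside the range.
    buckets = defaultdict(list)
    for image_id, face_count in samples:
        if min_faces <= face_count <= max_faces:
            buckets[face_count].append(image_id)

    # Balancing is only possible if every density level has some images.
    missing = [k for k in range(min_faces, max_faces + 1) if k not in buckets]
    if missing:
        raise ValueError(f"no images for face counts: {missing}")

    # Assumption: downsample every bucket to the size of the smallest one.
    per_bucket = min(len(ids) for ids in buckets.values())
    rng = random.Random(seed)

    balanced = []
    for k in range(min_faces, max_faces + 1):
        for image_id in rng.sample(buckets[k], per_bucket):
            balanced.append((image_id, k))
    return balanced
```

Downsampling to the smallest bucket discards data but keeps every density level equally weighted, so any remaining performance gap between levels cannot be attributed to class imbalance.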
