Walk and Learn: Facial Attribute Representation Learning from Egocentric Video and Contextual Data
Jing Wang, Yu Cheng, Rogerio Schmidt Feris

TL;DR
This paper introduces an approach to facial attribute classification that learns feature representations from egocentric video recorded by people walking through a city, using weather and location context in place of the costly manual annotation that previous methods require.
Contribution
It presents a method that uses egocentric video, together with its weather and location context, to learn a facial attribute representation without the costly manual annotation required by previous methods, and it reports results that outperform existing techniques.
Findings
Outperforms networks trained from scratch on benchmark datasets.
Achieves state-of-the-art results without manual identity labels.
Effectively captures facial attribute features using contextual data.
Abstract
The way people look in terms of facial attributes (ethnicity, hair color, facial hair, etc.) and the clothes or accessories they wear (sunglasses, hat, hoodies, etc.) is highly dependent on geo-location and weather condition, respectively. This work explores, for the first time, the use of this contextual information, as people with wearable cameras walk across different neighborhoods of a city, in order to learn a rich feature representation for facial attribute classification, without the costly manual annotation required by previous methods. By tracking the faces of casual walkers on more than 40 hours of egocentric video, we are able to cover tens of thousands of different identities and automatically extract nearly 5 million pairs of images connected by or from different face tracks, along with their weather and location context, under pose and lighting variations. These image…
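The pair extraction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `FaceCrop` record, the function names, and the 1/0 similarity labels are hypothetical, and treating any two crops from different tracks as dissimilar is a simplifying assumption (the same person could appear in more than one track).

```python
from dataclasses import dataclass
from itertools import combinations
import random


@dataclass(frozen=True)
class FaceCrop:
    """Hypothetical record for one face crop taken from a face track."""
    track_id: int
    frame: int
    weather: str   # context label, e.g. "sunny" (illustrative value)
    location: str  # context label, e.g. a neighborhood name (illustrative value)


def same_track_pairs(crops):
    """Pair every two crops that share a track; label 1 means 'similar'."""
    by_track = {}
    for crop in crops:
        by_track.setdefault(crop.track_id, []).append(crop)
    pairs = []
    for track in by_track.values():
        pairs.extend((a, b, 1) for a, b in combinations(track, 2))
    return pairs


def different_track_pairs(crops, n, seed=0):
    """Sample n pairs whose crops come from different tracks; label 0.

    Assumption: crops from different tracks are treated as dissimilar.
    """
    if len({crop.track_id for crop in crops}) < 2:
        return []  # no cross-track pair exists
    rng = random.Random(seed)
    pairs = []
    while len(pairs) < n:
        a, b = rng.sample(crops, 2)
        if a.track_id != b.track_id:
            pairs.append((a, b, 0))
    return pairs
```

Each pair keeps the weather and location labels of both crops, so the same records could serve a similarity objective and a context-prediction objective at once.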
Taxonomy
Topics: Face recognition and analysis · Video Surveillance and Tracking Methods · Face and Expression Recognition
