Lightweight Real-time Makeup Try-on in Mobile Browsers with Tiny CNN Models for Facial Tracking
TianXing Li, Zhi Yu, Edmund Phung, Brendan Duke, Irina Kezele, Parham Aarabi

TL;DR
This paper introduces a lightweight, fast, and accurate facial alignment CNN model optimized for resource-constrained devices, enabling real-time makeup try-on directly in mobile browsers.
Contribution
The authors design a compact CNN architecture for facial alignment that maintains high accuracy while enabling real-time performance on mobile devices.
Findings
Achieved high accuracy in facial landmark detection with small models
Enabled real-time makeup try-on in smartphone browsers
Demonstrated suitability for resource-limited environments
Abstract
Recent works on convolutional neural networks (CNNs) for facial alignment have demonstrated unprecedented accuracy on a variety of large, publicly available datasets. However, the developed models are often both cumbersome and computationally expensive, and are not adapted to applications on resource-restricted devices. In this work, we look into developing and training compact facial alignment models that feature fast inference speed and small deployment size, making them suitable for applications on the aforementioned category of devices. Our main contribution lies in designing such small models while maintaining high accuracy of facial alignment. The models we propose make use of light CNN architectures adapted to the facial alignment problem for accurate two-stage prediction of facial landmark coordinates from low-resolution output heatmaps. We further combine the developed facial…
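The abstract mentions predicting landmark coordinates from low-resolution output heatmaps but does not say how a heatmap is turned into a coordinate. The sketch below illustrates one common, generic way to do that step: a soft-argmax, which takes the softmax-weighted average of cell positions and scales it back to input-image pixels. This is an assumption for illustration only and is not confirmed as the authors' method; the function name `decode_landmark` and the `stride` and `temperature` parameters are hypothetical.

```python
import math


def decode_landmark(heatmap, stride=4.0, temperature=1.0):
    """Soft-argmax decode of one landmark heatmap (illustrative sketch).

    heatmap: list of rows (lists of floats), one score per cell.
    stride: assumed ratio between input-image size and heatmap size.
    temperature: lower values make the result closer to a hard argmax.
    Returns (x, y) in input-image pixel coordinates.
    """
    # Subtract the maximum score before exponentiating for numerical stability.
    peak = max(max(row) for row in heatmap)
    total = 0.0
    expected_x = 0.0
    expected_y = 0.0
    for y, row in enumerate(heatmap):
        for x, score in enumerate(row):
            weight = math.exp((score - peak) / temperature)
            total += weight
            expected_x += weight * x
            expected_y += weight * y
    # Shift by 0.5 to the cell center, then scale to input-image pixels.
    return ((expected_x / total + 0.5) * stride,
            (expected_y / total + 0.5) * stride)


if __name__ == "__main__":
    # A 3x4 heatmap with a single strong response at column 2, row 1.
    demo = [[0.0] * 4 for _ in range(3)]
    demo[1][2] = 50.0
    print(decode_landmark(demo, stride=4.0))
```

Because the result is a weighted average rather than an integer cell index, it can fall between cells, which is one reason this kind of decoding is often paired with low-resolution heatmaps. A second refinement stage, as the abstract's "two-stage" wording suggests, would operate on top of such a coarse estimate; its details are not given here.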
Taxonomy
Topics: Face recognition and analysis · Video Surveillance and Tracking Methods · Face Recognition and Perception
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
