Video Face Re-Aging: Toward Temporally Consistent Face Re-Aging
Abdul Muqeet, Kyuchul Lee, Bumsoo Kim, Yohan Hong, Hyungrae Lee, Woonggon Kim, KwangHee Lee

TL;DR
This paper introduces a new synthetic video dataset, a baseline architecture, and novel metrics for evaluating temporally consistent face re-aging in videos, demonstrating improved performance over existing methods.
Contribution
The work presents a synthetic video dataset, a baseline model, and new evaluation metrics specifically designed for temporally consistent face re-aging.
Findings
Outperforms existing methods in age transformation accuracy.
Achieves higher temporal consistency in video re-aging.
User studies favor the proposed method for temporal coherence.
Abstract
Video face re-aging deals with altering the apparent age of a person to the target age in videos. This problem is challenging due to the lack of paired video datasets maintaining temporal consistency in identity and age. Most re-aging methods process each image individually without considering the temporal consistency of videos. While some existing works address the issue of temporal coherence through video facial attribute manipulation in latent space, they often fail to deliver satisfactory performance in age transformation. To tackle the issues, we propose (1) a novel synthetic video dataset that features subjects across a diverse range of age groups; (2) a baseline architecture designed to validate the effectiveness of our proposed dataset, and (3) the development of novel metrics tailored explicitly for evaluating the temporal consistency of video re-aging techniques. Our…
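To make the notion of temporal consistency in apparent age concrete, here is a minimal sketch in Python. It is not the paper's metric: the definitions of the proposed metrics are not given in this summary, so the two functions below (`temporal_age_jitter` and `target_age_error`) are hypothetical illustrations. They assume that some age estimator has already produced one predicted age per frame, and they separate two failure modes the abstract distinguishes: flicker between frames and missing the target age.

```python
from statistics import mean


def temporal_age_jitter(per_frame_ages):
    """Mean absolute change in estimated age between consecutive frames.

    Hypothetical illustration only; lower values mean less frame-to-frame
    flicker in the apparent age. Not a metric defined by the paper.
    """
    if len(per_frame_ages) < 2:
        raise ValueError("need at least two frames")
    diffs = [abs(b - a) for a, b in zip(per_frame_ages, per_frame_ages[1:])]
    return mean(diffs)


def target_age_error(per_frame_ages, target_age):
    """Mean absolute gap between per-frame estimated age and the target age."""
    return mean(abs(a - target_age) for a in per_frame_ages)


# Made-up per-frame age estimates for two re-aged clips with target age 60.
steady = [60.0, 60.5, 60.0, 59.5]    # small drift between frames
flicker = [60.0, 52.0, 66.0, 55.0]   # large jumps between frames

print(temporal_age_jitter(steady), target_age_error(steady, 60.0))
print(temporal_age_jitter(flicker), target_age_error(flicker, 60.0))
```

A per-image method can score well on the second quantity for individual frames and still score badly on the first, which is why a video-specific measure is needed alongside age accuracy.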
Peer Reviews
Decision·Submitted to ICLR 2025
1. Innovative Data Generation Pipeline: The authors designed a comprehensive pipeline for generating a synthetic dataset specifically for model training in video face re-aging. This pipeline addresses the challenge of obtaining paired video data with consistent identities and varying ages, thereby enhancing the quality and applicability of the training data.
2. Introduction of New Evaluation Metrics: The development of two novel metrics, Time Region Wrinkle Consistency (TRWC) and Time-Age Prese
1. One significant shortcoming of the paper lies in its experimental section, which lacks thoroughness and depth. Specifically, the evaluation of the proposed new metrics includes only three baselines, and the quantitative comparisons in the User Study Results are limited to just two baselines. While the paper presents qualitative comparisons with various methods, these are not sufficiently persuasive without robust quantitative backing. Furthermore, the authors do not demonstrate the performanc
- The paper addresses the temporal consistency of video face aging, a challenging aspect of this topic.
- The paper introduces data generation, an architecture, and metrics for video face aging.
The novelty of the paper is limited, as most sections are "inspired" or "motivated" by previous approaches. Particularly:
- For the data generation process, it relies on StyleGAN and SAM to generate aging results for single frames. The OSFV technique is then adopted to generate faces at different poses and expressions for key frames and motion generation for temporal smoothing.
- For the video aging architecture, it is not novel, as it is just a recurrent U-Net with commonly used losses. The structure of t
1. Establishes a Strong Baseline: It introduces a new baseline for video re-aging, with novel contributions to architecture, dataset creation, and evaluation metrics. This provides a valuable foundation for future research in this area.
2. Demonstrates the Effectiveness of Synthetic Data: The proposed approach, while architecturally simple, effectively leverages synthetic video datasets to achieve compelling results. This highlights the potential of synthetic data for training re-aging models.
3
1. Lack of Detail Regarding the Synthetic Dataset: The authors provide insufficient information about their synthetic dataset. To enable a comprehensive evaluation, the authors should provide detailed information about the dataset's size, diversity (including the range of ages, facial features, and other relevant attributes), and visual samples. This would allow reviewers to assess the dataset's quality and its potential impact on the reported results.
2. Missing Information on Motion Generation
Taxonomy
Topics: Face recognition and analysis · Generative Adversarial Networks and Image Synthesis
