UDC-VIT: A Real-World Video Dataset for Under-Display Cameras
Kyusu Ahn, JiSoo Kim, Sangik Lee, HyunGyu Lee, Byeonghyun Ko, Chanwoo Park, Jaejin Lee

TL;DR
This paper introduces UDC-VIT, a real-world video dataset capturing under-display camera degradation effects, and demonstrates its importance for training effective face recognition models.
Contribution
The paper contributes UDC-VIT, the first real-world UDC video dataset focused on human facial recognition, together with an analysis showing the limitations of synthetic datasets.
Findings
Models trained on synthetic datasets perform poorly on real UDC videos.
Effective UDC restoration improves face recognition accuracy.
Frame-by-frame alignment using DFT enhances dataset quality.
Abstract
Even though an Under-Display Camera (UDC) is an advanced imaging system, the display panel significantly degrades captured images or videos, introducing low transmittance, blur, noise, and flare issues. Tackling such issues is challenging because of the complex degradation of UDCs, including diverse flare patterns. However, no dataset contains videos of real-world UDC degradation. In this paper, we propose a real-world UDC video dataset called UDC-VIT. Unlike existing datasets, UDC-VIT exclusively includes human motions for facial recognition. We propose a video-capturing system to acquire clean and UDC-degraded videos of the same scene simultaneously. Then, we align a pair of captured videos frame by frame, using discrete Fourier transform (DFT). We compare UDC-VIT with six representative UDC still image datasets and two existing UDC video datasets. Using six deep-learning models, we…
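The abstract states only that paired frames are aligned frame by frame using the DFT; it does not give the procedure. As a rough illustration of how DFT-based alignment can work, the sketch below uses phase correlation, a common technique that is an assumption here and not necessarily the authors' method. It recovers an integer translation between a clean frame and a degraded frame. The function names estimate_shift and align_frame are hypothetical, and real UDC pairs would likely also need sub-pixel, rotation, or crop handling that this sketch omits.

```python
import numpy as np


def estimate_shift(reference, degraded):
    """Estimate the integer (dy, dx) translation of `degraded` relative to `reference`.

    Uses phase correlation: the normalized cross-power spectrum of two
    translated images has an inverse DFT that peaks at the translation.
    Assumes single-channel frames of equal shape and a circular shift model.
    """
    f_ref = np.fft.fft2(reference)
    f_deg = np.fft.fft2(degraded)

    # Keep only the phase difference; the small constant avoids division by zero.
    cross_power = f_deg * np.conj(f_ref)
    cross_power /= np.abs(cross_power) + 1e-12

    correlation = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)

    # Peaks past the midpoint correspond to negative shifts (DFT wrap-around).
    dy, dx = (p if p <= s // 2 else p - s for p, s in zip(peak, correlation.shape))
    return int(dy), int(dx)


def align_frame(reference, degraded):
    """Shift `degraded` back so that it lines up with `reference`."""
    dy, dx = estimate_shift(reference, degraded)
    return np.roll(degraded, shift=(-dy, -dx), axis=(0, 1))
```

For a video pair, one would apply align_frame to each pair of corresponding grayscale frames. Phase correlation is chosen for the sketch because it depends on phase rather than intensity, which makes it less sensitive to the brightness loss a display panel introduces; whether that matches the paper's design rationale is not stated in this summary.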
Peer Reviews
Decision · Submitted to ICLR 2025
This is the first real-world UDC video dataset that accurately represents real-world UDC video degradations. The paper proposes a video-capturing system using a beam splitter to minimize discrepancies between paired frames, which is novel in the UDC field. The paper provides cross-dataset validation experiments and the analysis of limitations of existing datasets such as unrealistic flare occurrences and white artifacts in the supplement. The presentation of the paper is clear and organized.
The theoretical analysis of the limitations of existing datasets and the strengths of the proposed dataset would be better placed in the main paper than in the appendix.
The objective of the paper is clear, the overall narrative is fairly complete, and the amount of work is substantial. According to the description in the paper, the collected dataset is more diverse in scenarios compared to previous datasets in the UDC field, and the types of degradation are also closer to real-world conditions.
1. Table 3 reports results for different methods that are trained and tested on two separate datasets, with no cross-testing. This makes it difficult to assess how the choice of dataset affects the restoration methods. 2. As mentioned in LIMITATION, UDC degradations vary with the display pixel design, so which types of degradation are affected? This requires more detailed explanation and analysis.
1. The paper describes the dataset collection and processing pipeline in detail, and the content is clear and concise. 2. The paper analyzes and compares the differences between existing datasets and the collected dataset, highlighting the need for a new dataset. 3. The paper reports results on video reconstruction and face recognition using the collected dataset, demonstrating the dataset's effectiveness.
1. Some parts of the paper are not clearly introduced; for example, the abbreviations in Figure 1 are not defined. 2. The paper focuses on implementation details and does not make its novelty clear. 3. The experimental data are not sufficient to fully demonstrate the advantages of the dataset.
Taxonomy
Topics: Advanced Vision and Imaging
Methods: Focus · ALIGN
