ESVQA: Perceptual Quality Assessment of Egocentric Spatial Videos
Xilei Zhu, Huiyu Duan, Liu Yang, Yucheng Zhu, Xiongkuo Min, Guangtao Zhai, Patrick Le Callet

TL;DR
This paper introduces a new database and a novel model for assessing the perceptual quality of egocentric spatial videos, emphasizing embodied experience in XR environments.
Contribution
It presents the first egocentric spatial video quality assessment database and a multi-dimensional binocular feature fusion model, ESVQAnet, for improved quality prediction.
Findings
ESVQAnet outperforms 16 state-of-the-art VQA models.
The database enables better assessment of embodied perceptual quality.
The model generalizes well to traditional VQA tasks.
Abstract
With the rapid development of eXtended Reality (XR), egocentric spatial shooting and display technologies have further enhanced immersion and engagement for users, delivering more captivating and interactive experiences. Assessing the quality of experience (QoE) of egocentric spatial videos is crucial to ensure a high-quality viewing experience. However, the corresponding research is still lacking. In this paper, we use the concept of embodied experience to highlight this more immersive experience and study the new problem, i.e., embodied perceptual quality assessment for egocentric spatial videos. Specifically, we introduce the first Egocentric Spatial Video Quality Assessment Database (ESVQAD), which comprises 600 egocentric spatial videos captured using the Apple Vision Pro and their corresponding mean opinion scores (MOSs). Furthermore, we propose a novel multi-dimensional binocular…
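As a brief illustration of the mean opinion scores mentioned in the abstract: an MOS is the average of the subjective ratings that viewers give a video. The sketch below assumes raw ratings on a 1 to 5 scale and uses made-up video IDs and values. The function name `mean_opinion_scores` is hypothetical, and the snippet does not reproduce the rating scale, subject screening, or score normalization actually used for ESVQAD.

```python
from statistics import mean


def mean_opinion_scores(ratings):
    """Map each video id to the plain average of its subjective ratings.

    Videos with no ratings are skipped rather than given a score.
    """
    return {video: mean(scores) for video, scores in ratings.items() if scores}


# Hypothetical ratings on an assumed 1-5 scale; these are NOT values from ESVQAD.
ratings = {
    "video_001": [4, 5, 4, 3],
    "video_002": [2, 3, 2, 2],
}

mos = mean_opinion_scores(ratings)
for video, score in mos.items():
    print(f"{video}: MOS = {score:.2f}")
```

In practice, subjective studies often reject outlier raters and normalize scores per subject before averaging; this sketch leaves those steps out for brevity.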
Taxonomy
Topics: Image and Video Quality Assessment · Visual Attention and Saliency Detection · Advanced Vision and Imaging
