ESVQA: Perceptual Quality Assessment of Egocentric Spatial Videos

Xilei Zhu; Huiyu Duan; Liu Yang; Yucheng Zhu; Xiongkuo Min; Guangtao Zhai; Patrick Le Callet

arXiv:2412.20423 · cs.CV · August 8, 2025 · 2 cites

Open Access

TL;DR

This paper introduces a new database and a novel model for assessing the perceptual quality of egocentric spatial videos, emphasizing embodied experience in XR environments.

Contribution

It presents the first egocentric spatial video quality assessment database and a multi-dimensional binocular feature fusion model, ESVQAnet, for improved quality prediction.

Findings

01 ESVQAnet outperforms 16 state-of-the-art VQA models.

02 The database enables better assessment of embodied perceptual quality.

03 The model generalizes well to traditional VQA tasks.

Abstract

With the rapid development of eXtended Reality (XR), egocentric spatial shooting and display technologies have further enhanced immersion and engagement for users, delivering more captivating and interactive experiences. Assessing the quality of experience (QoE) of egocentric spatial videos is crucial to ensure a high-quality viewing experience. However, the corresponding research is still lacking. In this paper, we use the concept of embodied experience to highlight this more immersive experience and study the new problem, i.e., embodied perceptual quality assessment for egocentric spatial videos. Specifically, we introduce the first Egocentric Spatial Video Quality Assessment Database (ESVQAD), which comprises 600 egocentric spatial videos captured using the Apple Vision Pro and their corresponding mean opinion scores (MOSs). Furthermore, we propose a novel multi-dimensional binocular…
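The abstract refers to mean opinion scores (MOSs) without defining them in the text available here. As a minimal sketch, assuming the most basic definition (a plain arithmetic mean of the subjective ratings collected for each video), a MOS could be computed as below. The function name, video ids, rating scale, and rating values are all hypothetical, and the paper's actual subjective-study protocol, including any outlier screening or score normalization, is not described in this excerpt.

```python
from statistics import mean


def mean_opinion_scores(ratings):
    """Map each video id to the arithmetic mean of its subjective ratings.

    Assumption: no outlier rejection or per-subject normalization is applied;
    real subjective studies often add such steps before averaging.
    """
    return {
        video_id: mean(scores)
        for video_id, scores in ratings.items()
        if scores  # skip videos that received no ratings
    }


# Hypothetical ratings on a 1-5 scale; ids and values are made up for illustration.
ratings = {
    "video_001": [4, 5, 4, 3],
    "video_002": [2, 3, 2, 1],
    "video_003": [],
}

mos = mean_opinion_scores(ratings)
```

In this sketch a video with no ratings is left out of the result rather than assigned a default score, so downstream code can tell "unrated" apart from "rated poorly".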

Taxonomy

Topics: Image and Video Quality Assessment · Visual Attention and Saliency Detection · Advanced Vision and Imaging