Understanding User Behavior in Volumetric Video Watching: Dataset, Analysis and Prediction
Kaiyuan Hu, Haowen Yang, Yili Jin, Junhua Liu, Yongting Chen, Miao Zhang, Fangxin Wang

TL;DR
This paper introduces a large-scale volumetric video viewing behavior dataset, analyzes user interaction patterns, and proposes a transformer-based model for viewport prediction to enhance streaming efficiency.
Contribution
Releases what the authors describe as the first comprehensive volumetric video user behavior dataset, together with behavioral insights and a predictive model for optimizing streaming based on user interactions.
Findings
User viewport and gaze behaviors vary significantly across videos and users.
The transformer-based prediction model achieves high accuracy in viewport prediction.
Analysis reveals key user preferences in volumetric video viewing.
Abstract
Volumetric video has emerged as an attractive new video paradigm in recent years, since it provides an immersive and interactive 3D viewing experience with six degrees of freedom (6DoF). Unlike traditional 2D or panoramic videos, volumetric videos require dense point clouds, voxels, meshes, or huge neural models to depict volumetric scenes, which results in a prohibitively high bandwidth burden for video delivery. Analysis of user behavior, especially viewport and gaze analysis, therefore plays a significant role in prioritizing the streaming of content within the user's viewport and degrading the remaining content, so as to maximize user QoE under limited bandwidth. Although understanding user behavior is crucial, to the best of our knowledge, there are no available 3D volumetric video viewing datasets containing fine-grained user interactivity features, not to mention further analysis and behavior…
Taxonomy
Topics: Image and Video Quality Assessment · Visual Attention and Saliency Detection · Video Coding and Compression Technologies
