Meaning over Motion: A Semantic-First Approach to 360° Viewport Prediction
Arman Nik Khah, Arvin Bahreini, Ravi Prakash

TL;DR
This paper introduces a semantic-aware viewport prediction framework for 360-degree video streaming. By integrating cognitive intent, it anticipates rapid attention shifts, reducing stall duration by at least 20% and bandwidth consumption by at least 18%.
Contribution
It presents a novel semantic-first approach with server-side reasoning and client-side control, overcoming the limitations of kinematic-based methods in viewport prediction.
Findings
Reduces stall duration by at least 20%
Lowers bandwidth consumption by at least 18%
Mitigates the Saccade Trap in viewport prediction
Abstract
Ultra-high-resolution 360-degree video streaming is severely constrained by the massive bandwidth required to deliver immersive experiences. Current viewport prediction techniques predominantly rely on kinematics or low-level visual saliency, treating users as passive physical objects governed by inertia. This theoretical limitation leads to the "Saccade Trap" -- a critical failure mode in which predictors fail to anticipate rapid, meaning-driven shifts in attention, causing rebuffering stalls exactly when user engagement is highest. To resolve this, we propose Semantically-Adaptive Conformal Tiling with Associative Lookahead, a novel framework that integrates cognitive intent into network control. Unlike "one-size-fits-all" approaches, our method utilizes an architectural inversion strategy: heavy semantic reasoning is offloaded to the server to generate lightweight association graphs,…
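The kinematic failure mode described in the abstract can be illustrated with a minimal sketch. The code below is not the paper's method. It is a generic constant-velocity extrapolator of head yaw, the simplest kind of inertia-based predictor, run on a made-up trace. The function names, the sample values, and the 1-second horizon are all assumptions chosen for illustration.

```python
def wrap_deg(angle):
    """Wrap an angle in degrees to the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def predict_yaw(history, horizon_s):
    """Constant-velocity extrapolation of yaw.

    history: list of (time_s, yaw_deg) samples, oldest first.
    horizon_s: how far ahead to predict, in seconds.
    """
    (t0, y0), (t1, y1) = history[-2], history[-1]
    # Use the shortest angular difference so the 180/-180 seam is handled.
    velocity = wrap_deg(y1 - y0) / (t1 - t0)
    return wrap_deg(y1 + velocity * horizon_s)


def angular_error(a, b):
    """Absolute shortest-path difference between two yaw angles."""
    return abs(wrap_deg(a - b))


# Hypothetical trace: the viewer pans slowly at 10 deg/s.
trace = [(0.0, 0.0), (0.5, 5.0), (1.0, 10.0)]
pred = predict_yaw(trace, horizon_s=1.0)

# Case 1: the pan continues, so the inertial guess is right.
smooth_error = angular_error(pred, 20.0)

# Case 2: something meaningful appears elsewhere and the viewer
# jumps to 140 deg; the inertial guess is far off.
saccade_error = angular_error(pred, 140.0)

print(pred, smooth_error, saccade_error)
```

In this toy trace the extrapolation is exact while the pan continues and misses by 120 degrees when the viewer jumps, which is the kind of error a purely motion-based predictor cannot avoid because nothing in the past trajectory signals the shift.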
Taxonomy
Topics: Image and Video Quality Assessment · Visual Attention and Saliency Detection · Video Coding and Compression Technologies
