Attention Guidance through Video Script: A Case Study of Object Focusing on 360° VR Video Tours
Paulo Vitor Santana Silva, Arthur Ricardo Sousa Vitória, Diogo Fernandes Costa Silva, Arlindo Rodrigues Galvão Filho

TL;DR
This paper explores how combining the Grounding DINO and SAM models with video scripts can guide viewer attention in 360° VR video tours and improve the user experience.
Contribution
It introduces a novel method integrating object grounding models with video scripts to improve attention guidance in immersive VR environments.
Findings
Video scripts improve user attention focus in VR tours
Combining Grounding DINO and SAM effectively guides viewers
Enhanced user experience demonstrated in case study
Abstract
Within the expansive domain of virtual reality (VR), 360° VR videos immerse viewers in a spherical environment, allowing them to explore and interact with the virtual world from all angles. While this video representation offers unparalleled levels of immersion, it often lacks effective methods to guide viewers' attention toward specific elements within the virtual environment. This paper combines the Grounding DINO and Segment Anything (SAM) models to guide attention through object focusing based on video scripts. As a case study, the experiments are conducted on a 360° video tour of the University of Reading. The experimental results show that video scripts can improve the user experience in 360° VR video tours by helping to direct the user's attention.
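To make the idea of object focusing in a spherical video more concrete, the sketch below shows one possible way to turn a detected object's bounding box into a viewing direction. It is an illustration only, not the authors' method: it assumes the frame is stored in equirectangular projection, that a detector has already returned a pixel-space box `(x_min, y_min, x_max, y_max)`, and that the function name `box_center_to_yaw_pitch` and the angle conventions are hypothetical choices made for this example.

```python
def box_center_to_yaw_pitch(box, frame_width, frame_height):
    """Map the center of a pixel-space box to (yaw, pitch) in degrees.

    Assumptions (not taken from the paper):
    - the frame is equirectangular, covering 360 degrees horizontally
      and 180 degrees vertically;
    - yaw is 0 at the horizontal center of the frame, positive to the right;
    - pitch is 0 at the vertical center of the frame, positive upward;
    - the box does not wrap around the left/right seam of the frame.
    """
    x_min, y_min, x_max, y_max = box

    # Center of the box in pixel coordinates.
    cx = (x_min + x_max) / 2.0
    cy = (y_min + y_max) / 2.0

    # Normalize to [-0.5, 0.5] and scale to the angular range of each axis.
    yaw = (cx / frame_width - 0.5) * 360.0
    pitch = (0.5 - cy / frame_height) * 180.0
    return yaw, pitch


if __name__ == "__main__":
    # Hypothetical box in a 1920x960 equirectangular frame.
    print(box_center_to_yaw_pitch((1400, 200, 1480, 280), 1920, 960))
```

A player could use such a direction to rotate the view toward, or highlight, the object that the script mentions at a given moment. Boxes that straddle the seam of the frame would need separate handling.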
Taxonomy
Topics: Virtual Reality Applications and Impacts · Human Motion and Animation · Visual Attention and Saliency Detection
