DiVR: incorporating context from diverse VR scenes for human trajectory prediction
Franz Franco Gallo (BIOVISION), Hui-Yin Wu (BIOVISION), Lucile Sassatelli (UniCA, IUF)

TL;DR
This paper introduces DiVR, a novel transformer-based model that leverages diverse virtual reality scene contexts to improve human trajectory prediction, demonstrating superior accuracy and adaptability over existing methods.
Contribution
The work presents DiVR, a cross-modal transformer architecture that integrates static and dynamic scene context using heterogeneous graph convolution, specifically designed for VR environments.
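As a rough illustration of the Perceiver-style idea mentioned above, the sketch below shows a small latent array cross-attending to a set of context tokens. It is a minimal toy in pure Python, not the authors' implementation: it assumes a single head, omits the learned query/key/value projections, and leaves out the heterogeneous graph convolution entirely. The names `cross_attend`, `latents`, and `context` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(latents, context):
    """Single-head cross-attention without learned projections (illustrative only).

    latents: list of N query vectors (the small latent array).
    context: list of M key/value vectors (e.g. scene-context tokens).
    Returns N updated latent vectors, each a weighted mean of context vectors.
    """
    dim = len(latents[0])
    scale = 1.0 / math.sqrt(dim)
    updated = []
    for q in latents:
        # Scaled dot-product score between this latent and every context token.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in context]
        weights = softmax(scores)
        # Weighted average of the context vectors, one coordinate at a time.
        updated.append([
            sum(w * v[j] for w, v in zip(weights, context))
            for j in range(dim)
        ])
    return updated


# Hypothetical toy inputs: 2 latent vectors, 3 context tokens, dimension 2.
latents = [[1.0, 0.0], [0.0, 1.0]]
context = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = cross_attend(latents, context)
```

Because the number of latents is fixed and small, the cost of this step grows linearly with the number of context tokens, which is the usual motivation for routing large or mixed-modality inputs through a latent bottleneck.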
Findings
DiVR achieves higher trajectory-prediction accuracy than MLP, LSTM, and transformer baselines.
DiVR demonstrates strong generalizability across users, tasks, and scenes.
Using VR datasets enhances context-aware human trajectory modeling.
Abstract
Virtual environments provide a rich and controlled setting for collecting detailed data on human behavior, offering unique opportunities for predicting human trajectories in dynamic scenes. However, most existing approaches have overlooked the potential of these environments, focusing instead on static contexts without considering user-specific factors. Employing the CREATTIVE3D dataset, our work models trajectories recorded in virtual reality (VR) scenes for diverse situations including road-crossing tasks with user interactions and simulated visual impairments. We propose Diverse Context VR Human Motion Prediction (DiVR), a cross-modal transformer based on the Perceiver architecture that integrates both static and dynamic scene context using a heterogeneous graph convolution network. We conduct extensive experiments comparing DiVR against existing architectures including MLP, LSTM, and…
Taxonomy
Topics: Human Pose and Action Recognition · Video Surveillance and Tracking Methods · Anomaly Detection Techniques and Applications
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Convolution
