DiVR: incorporating context from diverse VR scenes for human trajectory prediction

Franz Franco Gallo (BIOVISION), Hui-Yin Wu (BIOVISION), Lucile Sassatelli (UniCA, IUF)

arXiv:2411.08409 · cs.AI · November 14, 2024

Open Access

TL;DR

This paper introduces DiVR, a novel transformer-based model that leverages diverse virtual reality scene contexts to improve human trajectory prediction, demonstrating superior accuracy and adaptability over existing methods.

Contribution

The work presents DiVR, a cross-modal transformer architecture that integrates static and dynamic scene context using heterogeneous graph convolution, specifically designed for VR environments.
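As a rough illustration of the Perceiver-style idea behind a cross-modal transformer, the sketch below lets a small latent array attend over a concatenated set of static and dynamic context tokens. It is not the authors' implementation: the token counts, the feature size `d`, the random projection matrices, and the names `cross_attend`, `static_tokens`, and `dynamic_tokens` are all assumptions made for illustration, and the heterogeneous graph convolution that would produce the context features is omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(latents, context, w_q, w_k, w_v):
    """Single-head cross-attention: latents query the context tokens."""
    q = latents @ w_q                         # (n_latents, d)
    k = context @ w_k                         # (n_tokens, d)
    v = context @ w_v                         # (n_tokens, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])   # (n_latents, n_tokens)
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ v, weights

rng = np.random.default_rng(0)
d = 16                                        # hypothetical feature size

# Hypothetical context features; in a full model these would come from
# a scene encoder rather than random noise.
static_tokens = rng.normal(size=(5, d))       # e.g. fixed scene objects
dynamic_tokens = rng.normal(size=(8, d))      # e.g. moving scene elements
context = np.concatenate([static_tokens, dynamic_tokens], axis=0)

latents = rng.normal(size=(4, d))             # small learned latent array
w_q, w_k, w_v = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))

fused, weights = cross_attend(latents, context, w_q, w_k, w_v)
print(fused.shape, weights.shape)
```

Because the latent array is much smaller than the context, the attention cost grows linearly with the number of context tokens, which is the usual motivation for this kind of bottleneck.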

Findings

01. DiVR outperforms existing MLP, LSTM, and transformer architectures in prediction accuracy.

02. DiVR generalizes well across users, tasks, and scenes.

03. Using VR datasets improves context-aware modeling of human trajectories.

Abstract

Virtual environments provide a rich and controlled setting for collecting detailed data on human behavior, offering unique opportunities for predicting human trajectories in dynamic scenes. However, most existing approaches have overlooked the potential of these environments, focusing instead on static contexts without considering user-specific factors. Employing the CREATTIVE3D dataset, our work models trajectories recorded in virtual reality (VR) scenes for diverse situations including road-crossing tasks with user interactions and simulated visual impairments. We propose Diverse Context VR Human Motion Prediction (DiVR), a cross-modal transformer based on the Perceiver architecture that integrates both static and dynamic scene context using a heterogeneous graph convolution network. We conduct extensive experiments comparing DiVR against existing architectures including MLP, LSTM, and…

Taxonomy

Topics: Human Pose and Action Recognition · Video Surveillance and Tracking Methods · Anomaly Detection Techniques and Applications

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Convolution