Egocentric Human Trajectory Forecasting with a Wearable Camera and Multi-Modal Fusion

Jianing Qiu; Lipeng Chen; Xiao Gu; Frank P.-W. Lo; Ya-Yen Tsai; Jiankai Sun; Jiaqi Liu; Benny Lo

arXiv:2111.00993·cs.CV·July 8, 2022

TL;DR

This paper introduces a novel multi-modal Transformer-based approach for egocentric human trajectory forecasting using a new dataset, improving accuracy in predicting future paths in crowded environments.

Contribution

The paper presents a new dataset and a multi-modal Transformer model with a cascaded cross-attention mechanism for egocentric trajectory prediction, advancing prior methods.

Findings

01

The proposed model outperforms state-of-the-art methods in trajectory forecasting accuracy.

02

Multi-modal fusion improves prediction by incorporating scene semantics and depth.

03

The dataset enables better understanding of egocentric navigation in crowded spaces.
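
The findings above refer to trajectory forecasting accuracy without naming a metric. As a hedged illustration only, the sketch below computes two metrics that are common in the trajectory forecasting literature, average displacement error (ADE) and final displacement error (FDE); whether this page's figures rest on exactly these metrics is an assumption, and the function name `displacement_errors` and the toy coordinates are hypothetical.

```python
import math

def displacement_errors(predicted, ground_truth):
    """Return (ADE, FDE) for two equal-length lists of (x, y) points.

    ADE is the mean Euclidean distance over all future time steps;
    FDE is the Euclidean distance at the last time step.
    Assumption: both trajectories share one coordinate frame and unit.
    """
    if not predicted or len(predicted) != len(ground_truth):
        raise ValueError("trajectories must be non-empty and of equal length")
    dists = [
        math.hypot(px - gx, py - gy)
        for (px, py), (gx, gy) in zip(predicted, ground_truth)
    ]
    ade = sum(dists) / len(dists)
    fde = dists[-1]
    return ade, fde

# Toy example (made-up points, not data from the paper).
pred = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = displacement_errors(pred, truth)
print(ade, fde)  # 1.0 2.0
```

Lower values are better for both metrics; FDE isolates the error at the end of the forecast horizon, where drift tends to be largest.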

Abstract

In this paper, we address the problem of forecasting the trajectory of an egocentric camera wearer (ego-person) in crowded spaces. The trajectory forecasting ability learned from the data of different camera wearers walking around in the real world can be transferred to assist visually impaired people in navigation, as well as to instill human navigation behaviours in mobile robots, enabling better human-robot interactions. To this end, a novel egocentric human trajectory forecasting dataset was constructed, containing real trajectories of people navigating in crowded spaces wearing a camera, as well as extracted rich contextual data. We extract and utilize three different modalities to forecast the trajectory of the camera wearer, i.e., his/her past trajectory, the past trajectories of nearby people, and the environment such as the scene semantics or the depth of the scene. A…
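
The Contribution section mentions a cascaded cross-attention mechanism over the three modalities listed in the abstract. The sketch below is one possible reading of that idea, not the authors' implementation: the ego-trajectory tokens first attend to the nearby people's tokens, and the result then attends to the scene tokens. The order of the cascade, the absence of learned projections, the residual sum, and all names (`cross_attention`, `cascaded_fusion`, the toy token values) are assumptions made for illustration.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, context):
    """Scaled dot-product cross-attention with identity projections.

    queries: list of T vectors; context: list of S vectors; all of size d.
    Returns T vectors: each query plus its attention-weighted context
    (a residual connection). A real model would use learned Q/K/V weights.
    """
    d = len(queries[0])
    fused = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
            for k in context
        ]
        weights = softmax(scores)
        attended = [
            sum(w * k[j] for w, k in zip(weights, context))
            for j in range(d)
        ]
        fused.append([qi + ai for qi, ai in zip(q, attended)])
    return fused

def cascaded_fusion(ego_tokens, neighbour_tokens, scene_tokens):
    """Two-stage cascade (assumed order): ego -> neighbours -> scene."""
    stage1 = cross_attention(ego_tokens, neighbour_tokens)
    stage2 = cross_attention(stage1, scene_tokens)
    return stage2

# Toy tokens (made-up values): 3 ego steps, 2 neighbour tokens, 4 scene tokens.
ego = [[0.0, 0.0], [0.1, 0.0], [0.2, 0.1]]
neighbours = [[1.0, 0.5], [0.9, 0.4]]
scene = [[0.3, 0.3], [0.7, 0.1], [0.2, 0.9], [0.5, 0.5]]
fused = cascaded_fusion(ego, neighbours, scene)
print(len(fused), len(fused[0]))  # 3 2
```

The point of the cascade in this sketch is that the output keeps one token per ego time step while each stage mixes in one additional modality, so a downstream decoder sees a single fused sequence.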

Code & Models

Repositories

jianing-qiu/tiss
Official
