Entry-Flipped Transformer for Inference and Prediction of Participant Behavior

Bo Hu; Tat-Jen Cham

arXiv:2207.06235·cs.CV·July 15, 2022

TL;DR

This paper introduces the EF-Transformer, a model for inferring and predicting participant behavior in group activities. It reduces error accumulation by flipping the order of query, key, and value entries in its attention mechanism, and is evaluated on multiple datasets.

Contribution

The paper presents the Entry-Flipped Transformer, a new attention-based model that enhances behavior inference accuracy and robustness in interactive group scenarios.

Findings

1. Achieves state-of-the-art performance on tennis doubles, dance, and pedestrian datasets.

2. Better at limiting error accumulation and recovering from wrong estimations.

3. Outperforms existing models in behavior prediction tasks.

Abstract

Some group activities, such as team sports and choreographed dances, involve closely coupled interaction between participants. Here we investigate the tasks of inferring and predicting participant behavior, in terms of motion paths and actions, under such conditions. We narrow the problem to that of estimating how a set of target participants react to the behavior of other observed participants. Our key idea is to model the spatio-temporal relations among participants in a manner that is robust to error accumulation during frame-wise inference and prediction. We propose a novel Entry-Flipped Transformer (EF-Transformer), which models the relations of participants by attention mechanisms on both spatial and temporal domains. Unlike typical transformers, we tackle the problem of error accumulation by flipping the order of query, key, and value entries, to increase the importance and fidelity…
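To make the idea of swapping attention entries concrete, here is a minimal sketch of scaled dot-product attention called with two different role assignments. The abstract is truncated and does not say exactly which entries are swapped, so the "flipped" call below is an assumption for illustration only, not the authors' implementation. The names `attention`, `observed`, and `target` and the toy feature values are hypothetical.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of feature vectors."""
    d = len(keys[0])
    out = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors, one output row per query.
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

# Hypothetical toy features (not from the paper).
observed = [[1.0, 0.0], [0.0, 1.0]]  # two observed participants
target = [[1.0, 0.0]]                # one earlier target estimate

# Typical decoder-style cross-attention: the target estimate is the query.
typical = attention(queries=target, keys=observed, values=observed)

# Assumed "flipped" arrangement: observed features take the query role.
flipped = attention(queries=observed, keys=target, values=target)
```

In this sketch the output has one row per query, so the flipped call yields one row per observed participant instead of one row per target estimate. Whether this matches the paper's design cannot be confirmed from the truncated abstract.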

Taxonomy

Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Time Series Analysis and Forecasting

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Softmax · Dense Connections · Absolute Position Encodings · Dropout · Byte Pair Encoding · Position-Wise Feed-Forward Layer · Adam