Entry-Flipped Transformer for Inference and Prediction of Participant Behavior
Bo Hu, Tat-Jen Cham

TL;DR
This paper introduces the Entry-Flipped Transformer (EF-Transformer), a model for inferring and predicting participant behavior in group activities. It reduces error accumulation by flipping the order of query, key, and value entries in its attention mechanism, and is evaluated on tennis doubles, dance, and pedestrian datasets.
Contribution
The paper presents the Entry-Flipped Transformer, a new attention-based model that enhances behavior inference accuracy and robustness in interactive group scenarios.
Findings
Achieves state-of-the-art performance on tennis doubles, dance, and pedestrian datasets.
Better at limiting error accumulation and recovering from wrong estimations.
Outperforms existing models in behavior prediction tasks.
Abstract
Some group activities, such as team sports and choreographed dances, involve closely coupled interaction between participants. Here we investigate the tasks of inferring and predicting participant behavior, in terms of motion paths and actions, under such conditions. We narrow the problem to that of estimating how a set of target participants react to the behavior of other observed participants. Our key idea is to model the spatio-temporal relations among participants in a manner that is robust to error accumulation during frame-wise inference and prediction. We propose a novel Entry-Flipped Transformer (EF-Transformer), which models the relations of participants by attention mechanisms on both spatial and temporal domains. Unlike typical transformers, we tackle the problem of error accumulation by flipping the order of query, key, and value entries, to increase the importance and fidelity…
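The abstract as excerpted here does not say which entries are flipped or how the full model is built. The sketch below is therefore only an illustration of what swapping the query and key/value roles means in plain cross-attention. It is not the EF-Transformer itself. The function name `cross_attention`, the feature size, the participant counts, and the omission of learned projections are all assumptions made for the example.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(query_src, kv_src):
    """Single-head scaled dot-product attention with no learned projections.

    Hypothetical helper for illustration: rows of `query_src` act as queries,
    rows of `kv_src` act as both keys and values.
    """
    d = query_src.shape[-1]
    scores = query_src @ kv_src.T / np.sqrt(d)   # (n_query, n_kv)
    weights = softmax(scores, axis=-1)           # each row sums to 1
    return weights @ kv_src, weights


rng = np.random.default_rng(0)
observed = rng.normal(size=(3, 8))  # assumed: 3 observed participants, 8-dim features
target = rng.normal(size=(1, 8))    # assumed: 1 target participant

# Conventional ordering: the target queries the observed participants.
out_a, w_a = cross_attention(target, observed)

# Swapped ordering: the observed participants query the target.
out_b, w_b = cross_attention(observed, target)

print(out_a.shape, w_a.shape)  # one output row per target
print(out_b.shape, w_b.shape)  # one output row per observed participant
```

Swapping the roles changes which stream determines the number of output rows and which stream the outputs are mixtures of. In the first call the output is a weighted mix of observed features, while in the second it is built only from target features.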
Taxonomy
Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Time Series Analysis and Forecasting
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Softmax · Dense Connections · Absolute Position Encodings · Dropout · Byte Pair Encoding · Position-Wise Feed-Forward Layer · Adam
