MoReFun: Past-Movement Guided Motion Representation Learning for Future Motion Prediction and Understanding
Junyu Shi, Haoting Wu, Zhiyuan Zhang, Lijiang Liu, Yong Sun, Qiang Nie

TL;DR
MoReFun introduces a two-stage self-supervised learning framework for 3D human motion prediction that improves accuracy and motion understanding by decoupling representation learning from prediction and focusing on dynamic joints.
Contribution
The paper proposes a novel two-stage self-supervised framework with velocity-based masking to enhance motion representation learning and prediction accuracy.
Findings
Reduces average prediction errors by 8.8% over state-of-the-art methods.
Effectively captures complex motion dynamics and dependencies.
Achieves competitive motion understanding performance.
Abstract
3D human motion prediction aims to generate coherent future motions from observed sequences, yet existing end-to-end regression frameworks often fail to capture complex dynamics and tend to produce temporally inconsistent or static predictions, a limitation rooted in representation shortcutting, where models rely on superficial cues rather than learning meaningful motion structure. We propose a two-stage self-supervised framework that decouples representation learning from prediction. In the pretraining stage, the model performs unified past-future self-reconstruction, reconstructing the past sequence while recovering masked joints in the future sequence under full historical guidance. A velocity-based masking strategy selects highly dynamic joints, forcing the model to focus on informative motion components and internalize the statistical dependencies between past and future states…
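The pretraining objective described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the details, so everything concrete here is an assumption: the function names (`joint_speeds`, `velocity_mask`, `pretrain_loss`), the per-frame top-k selection rule, the `mask_ratio` parameter, the squared-error loss, and the equal weighting of the two loss terms are all hypothetical and may differ from the paper's actual method.

```python
import math


def joint_speeds(last_past, future):
    """Per-frame, per-joint speed of the future sequence.

    last_past: list of J (x, y, z) positions, the final observed frame.
    future: list of T frames, each a list of J (x, y, z) positions.
    Returns a T x J list of displacements per frame.
    """
    # Prepend the last observed frame so every future frame has a velocity.
    frames = [last_past] + future
    return [
        [math.dist(prev[j], cur[j]) for j in range(len(cur))]
        for prev, cur in zip(frames, frames[1:])
    ]


def velocity_mask(last_past, future, mask_ratio=0.5):
    """Mask the fastest joints in each future frame (True = masked).

    Assumption: the k fastest joints are chosen independently per frame.
    """
    num_joints = len(last_past)
    k = max(1, min(num_joints, round(mask_ratio * num_joints)))
    mask = []
    for speeds in joint_speeds(last_past, future):
        order = sorted(range(num_joints), key=lambda j: speeds[j], reverse=True)
        fastest = set(order[:k])
        mask.append([j in fastest for j in range(num_joints)])
    return mask


def pretrain_loss(pred_past, past, pred_future, future, mask):
    """Past reconstruction error plus error on masked future joints only.

    Assumption: mean squared error per joint, both terms weighted equally.
    """
    def sq_err(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    past_errs = [
        sq_err(p, q)
        for frame_p, frame_q in zip(pred_past, past)
        for p, q in zip(frame_p, frame_q)
    ]
    future_errs = [
        sq_err(p, q)
        for frame_p, frame_q, frame_m in zip(pred_future, future, mask)
        for p, q, m in zip(frame_p, frame_q, frame_m)
        if m
    ]
    return sum(past_errs) / len(past_errs) + sum(future_errs) / len(future_errs)
```

In this sketch, static joints contribute nothing to the future term, so a model cannot lower the loss by copying the last observed pose; it has to recover the joints that actually move.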
Taxonomy
Topics: Human Pose and Action Recognition · Gait Recognition and Analysis · Video Surveillance and Tracking Methods
