MoRAG -- Multi-Fusion Retrieval Augmented Generation for Human Motion
Sai Shashank Kalakonda, Shubh Maheshwari, Ravi Kiran Sarvadevabhatla

TL;DR
MoRAG introduces a multi-fusion retrieval-augmented generation approach that enhances human motion generation by improving retrieval accuracy, diversity, and generalization through multi-part strategies and large language model prompting.
Contribution
The paper presents a novel multi-part retrieval strategy combined with LLM prompting to improve motion retrieval and generation, especially for unseen text descriptions.
Findings
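Used as a plug-and-play module, the framework improves the performance of motion diffusion models in the authors' experiments.
LLM prompting addresses spelling errors and rephrasing in motion retrieval.
The multi-part retrieval strategy improves the generalizability of motion retrieval across the language space.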
Abstract
We introduce MoRAG, a novel multi-part fusion based retrieval-augmented generation strategy for text-based human motion generation. The method enhances motion diffusion models by leveraging additional knowledge obtained through an improved motion retrieval process. By effectively prompting large language models (LLMs), we address spelling errors and rephrasing issues in motion retrieval. Our approach utilizes a multi-part retrieval strategy to improve the generalizability of motion retrieval across the language space. We create diverse samples through the spatial composition of the retrieved motions. Furthermore, by utilizing low-level, part-specific motion information, we can construct motion samples for unseen text descriptions. Our experiments demonstrate that our framework can serve as a plug-and-play module, improving the performance of motion diffusion models. Code, pretrained…
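The abstract names two steps, part-specific retrieval and spatial composition of the retrieved motions, without giving their details. The following is only a minimal sketch of that idea under stated assumptions, not the authors' implementation: the three-part split, the six-joint skeleton, the cosine-similarity lookup, and the names `PART_JOINTS`, `retrieve_per_part`, and `compose` are all hypothetical, and the part-specific text embeddings (which the paper obtains with the help of LLM prompting) are assumed to be given as plain vectors.

```python
import numpy as np

# Hypothetical body-part layout for a toy 6-joint skeleton (an assumption, not from the paper).
PART_JOINTS = {
    "torso": [0, 1],
    "hands": [2, 3],
    "legs": [4, 5],
}


def cosine_top1(query, keys):
    """Return the index of the row in `keys` most similar to `query` (cosine similarity)."""
    q = query / np.linalg.norm(query)
    k = keys / np.linalg.norm(keys, axis=1, keepdims=True)
    return int(np.argmax(k @ q))


def retrieve_per_part(part_queries, part_keys):
    """Pick one database index per body part, each from that part's own embedding index."""
    return {
        part: cosine_top1(part_queries[part], part_keys[part])
        for part in PART_JOINTS
    }


def compose(motions, picks):
    """Spatially compose a motion by copying each part's joints from its retrieved clip.

    `motions` is a list of arrays shaped (frames, joints, 3); clips are cropped
    to the shortest retrieved clip so the parts line up in time.
    """
    frames = min(motions[i].shape[0] for i in picks.values())
    first = motions[next(iter(picks.values()))]
    out = np.zeros((frames,) + first.shape[1:])
    for part, idx in picks.items():
        joints = PART_JOINTS[part]
        out[:, joints] = motions[idx][:frames, joints]
    return out
```

Because each part is retrieved independently, a description that matches no single database motion as a whole can still be assembled from clips that match it part by part, which is the intuition behind constructing samples for unseen text descriptions. A real system would also need to handle temporal alignment and the physical plausibility of the composite, which this sketch ignores.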
Taxonomy
Topics: Gait Recognition and Analysis · Hand Gesture Recognition Systems · Human Pose and Action Recognition
Methods: Diffusion
