Context-based Motion Retrieval using Open Vocabulary Methods for Autonomous Driving
Stefan Englmeier, Max A. Büttner, Katharina Winter, and Fabian B. Flohr

TL;DR
This paper introduces a novel context-aware motion retrieval framework for autonomous driving, combining multimodal embeddings and a new dataset, WayMoCo, to improve retrieval accuracy of complex human behaviors in driving scenarios.
Contribution
The paper presents a new multimodal motion retrieval method using SMPL-based sequences and introduces the WayMoCo dataset for evaluating such retrieval in autonomous driving.
Findings
Outperforms state-of-the-art models by up to 27.5% in retrieval accuracy.
Enables scalable retrieval of human behavior and context via natural language queries.
Provides a new dataset, WayMoCo, for evaluating motion-context retrieval in autonomous driving.
Abstract
Autonomous driving systems must operate reliably in safety-critical scenarios, particularly those involving unusual or complex behavior by Vulnerable Road Users (VRUs). Identifying these edge cases in driving datasets is essential for robust evaluation and generalization, but retrieving such rare human behavior scenarios within the long tail of large-scale datasets is challenging. To support targeted evaluation of autonomous driving systems in diverse, human-centered scenarios, we propose a novel context-aware motion retrieval framework. Our method combines Skinned Multi-Person Linear (SMPL)-based motion sequences and corresponding video frames before encoding them into a shared multimodal embedding space aligned with natural language. Our approach enables the scalable retrieval of human behavior and its context through text queries. This work also introduces our dataset WayMoCo, an…
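As a rough illustration of text-based retrieval in a shared embedding space, the sketch below ranks stored clip embeddings against a query embedding by cosine similarity. It is not the authors' implementation: the encoders that would produce these vectors are not shown, and the names `cosine_similarity`, `retrieve`, and `clip_embs`, the three-dimensional toy vectors, and the choice of cosine similarity as the ranking score are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def retrieve(query_emb, clip_embs, k=3):
    """Return the k (clip_id, score) pairs most similar to the query.

    query_emb: embedding of a text query (assumed to come from a text encoder).
    clip_embs: dict mapping a hypothetical clip id to its motion-context
               embedding (assumed to come from a multimodal encoder).
    """
    scored = [
        (clip_id, cosine_similarity(query_emb, emb))
        for clip_id, emb in clip_embs.items()
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings; real ones would be high-dimensional encoder outputs.
clip_embs = {
    "clip_a": [0.9, 0.1, 0.0],
    "clip_b": [0.0, 1.0, 0.0],
    "clip_c": [0.6, 0.6, 0.2],
}
query_emb = [1.0, 0.0, 0.0]

top = retrieve(query_emb, clip_embs, k=2)
for clip_id, score in top:
    print(f"{clip_id}: {score:.3f}")
```

Because the query and the clips are assumed to live in the same space, retrieval reduces to a nearest-neighbor search; at dataset scale the exhaustive loop here would typically be replaced by an approximate nearest-neighbor index.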
