Narrowing the Coordinate-frame Gap in Behavior Prediction Models: Distillation for Efficient and Accurate Scene-centric Motion Forecasting

DiJia Su; Bertrand Douillard; Rami Al-Rfou; Cheolho Park; Benjamin Sapp

arXiv:2206.03970·cs.CV·June 13, 2022

Open Access

TL;DR

This paper introduces knowledge distillation techniques to enhance scene-centric motion prediction models, achieving performance close to agent-centric models while maintaining higher efficiency and scalability in autonomous driving scenarios.

Contribution

It develops distillation methods that significantly improve scene-centric models, narrowing the performance gap with agent-centric models in motion forecasting.

Findings

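01. Distillation improves scene-centric model performance by 13.2% on the Argoverse benchmark.

02. Distillation improves scene-centric model performance by 7.8% on the Waymo Open Dataset.

03. Scene-centric models are up to 15 times more efficient than agent-centric models in busy scenes.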

Abstract

Behavior prediction models have proliferated in recent years, especially in the popular real-world robotics application of autonomous driving, where representing the distribution over possible futures of moving agents is essential for safe and comfortable motion planning. In these models, the choice of coordinate frames to represent inputs and outputs has crucial trade-offs, which broadly fall into one of two categories. Agent-centric models transform inputs and perform inference in agent-centric coordinates. These models are intrinsically invariant to translation and rotation between scene elements and are the best-performing on public leaderboards, but they scale quadratically with the number of agents and scene elements. Scene-centric models use a fixed coordinate system to process all agents. This gives them the advantage of sharing representations among all agents, offering efficient amortized…
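
To make the coordinate-frame distinction concrete, here is a minimal sketch of what "transforming inputs into agent-centric coordinates" can look like. It is an illustration only, not the paper's implementation: the function names `to_agent_frame` and `agent_centric_inputs`, the dictionary layout for agents, and the convention that an agent faces along +x in its own frame are all assumptions made for this example.

```python
import math

def to_agent_frame(points, agent_xy, agent_heading):
    """Express scene-frame (x, y) points in the frame of one agent.

    Assumed convention: the agent sits at the origin of its own frame
    and faces along +x, with +y to its left.
    """
    cos_h, sin_h = math.cos(agent_heading), math.sin(agent_heading)
    ax, ay = agent_xy
    out = []
    for x, y in points:
        dx, dy = x - ax, y - ay  # translate so the agent is the origin
        # Rotate by -heading so the agent's heading maps onto +x.
        out.append((cos_h * dx + sin_h * dy, -sin_h * dx + cos_h * dy))
    return out

def agent_centric_inputs(agents, scene_points):
    """Build one re-expressed copy of the scene per agent (hypothetical helper).

    The work grows with len(agents) * len(scene_points). A scene-centric
    model would instead encode scene_points once in the shared frame.
    """
    return [to_agent_frame(scene_points, a["xy"], a["heading"]) for a in agents]

# Two agents: one at the origin facing +x, one at (10, 0) facing +y.
agents = [
    {"xy": (0.0, 0.0), "heading": 0.0},
    {"xy": (10.0, 0.0), "heading": math.pi / 2},
]
scene_points = [(10.0, 5.0), (0.0, 0.0)]
per_agent = agent_centric_inputs(agents, scene_points)
```

In this sketch the second agent sees the point (10, 5) directly ahead of it, at roughly (5, 0) in its own frame, regardless of where the agent sits in the scene. That is the translation and rotation invariance the abstract describes, and the per-agent copy of the scene is where the extra cost comes from.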

Taxonomy

Topics: Video Surveillance and Tracking Methods · Human Pose and Action Recognition · Autonomous Vehicle Technology and Safety

Methods: Knowledge Distillation