LEO: Graph Attention Network based Hybrid Multi Sensor Extended Object Fusion and Tracking for Autonomous Driving Applications

Mayank Mayank; Bharanidhar Duraisamy; Florian Geiss

arXiv:2604.02206·cs.LG·April 3, 2026

TL;DR

LEO introduces a graph attention network that fuses multi-sensor data for accurate, real-time shape and trajectory estimation of dynamic objects in autonomous driving.

Contribution

LEO combines classical Bayesian extended-object models with deep learning to adaptively fuse sensor data. It models complex geometries and stays robust across sensor types, configurations, object classes, and regions.

Findings

01 Achieves real-time efficiency suitable for production systems.

02 Models complex geometries like articulated trucks and trailers.

03 Generalizes well across different sensor types and datasets.

Abstract

Accurate shape and trajectory estimation of dynamic objects is essential for reliable automated driving. Classical Bayesian extended-object models offer theoretical robustness and efficiency but depend on the completeness of the a priori and update-likelihood functions, while deep learning methods bring adaptability at the cost of dense annotations and high compute. We bridge these strengths with LEO (Learned Extension of Objects), a spatio-temporal Graph Attention Network that fuses multi-modal production-grade sensor tracks to learn adaptive fusion weights, ensure temporal consistency, and represent multi-scale shapes. Using a task-specific parallelogram ground-truth formulation, LEO models complex geometries (e.g., articulated trucks and trailers) and generalizes across sensor types, configurations, object classes, and regions, remaining robust for challenging and long-range targets.…
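
To make "adaptive fusion weights" concrete, the sketch below shows how a single graph-attention step could weight several sensor tracks when fusing their extent estimates. It is not LEO's architecture, which the abstract does not specify. It assumes the standard GAT scoring rule (a LeakyReLU over a learned vector applied to concatenated node features, followed by a softmax) with the feature projection omitted. The names `gat_fusion_weights` and `fuse`, the attention vector `a`, and all numeric values are hypothetical.

```python
import math


def leaky_relu(x, slope=0.2):
    """LeakyReLU with the negative slope commonly used in GAT layers."""
    return x if x > 0 else slope * x


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gat_fusion_weights(target, neighbours, a):
    """Attention weights of one target node over its neighbour nodes.

    Each score is LeakyReLU(a . [target || neighbour]); the learned
    projection matrix of a full GAT layer is left out for brevity.
    """
    scores = []
    for h in neighbours:
        concat = target + h  # list concatenation: [h_i || h_j]
        scores.append(leaky_relu(sum(ai * ci for ai, ci in zip(a, concat))))
    return softmax(scores)


def fuse(weights, neighbours):
    """Weighted average of neighbour features using the attention weights."""
    dim = len(neighbours[0])
    return [sum(w * h[k] for w, h in zip(weights, neighbours)) for k in range(dim)]


# Hypothetical (length, width) extents in metres: one fused-object node and
# three sensor-track nodes (for example radar, camera, lidar).
fused_node = [4.5, 1.8]
sensor_tracks = [[4.6, 1.9], [4.4, 1.8], [6.0, 2.5]]

# Hypothetical attention vector; in a trained network it would be learned.
a = [0.0, 0.0, -1.0, 0.0]

weights = gat_fusion_weights(fused_node, sensor_tracks, a)
fused = fuse(weights, sensor_tracks)
```

With this hand-picked `a`, tracks reporting a larger length receive smaller weights, so the outlying third track contributes least to the fused extent. In a trained model the weights would instead depend on learned features.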
