# AdaptiveShape: Solving Shape Variability for 3D Object Detection with Geometry Aware Anchor Distributions

**Authors:** Benjamin Sick, Michael Walter, Jochen Abhau

arXiv: 2302.14522 · 2023-03-01

## TL;DR

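AdaptiveShape introduces shape-aware anchor distributions and heatmaps for 3D object detection, improving AP for large vehicles such as semi-trailer truck combinations by +10.9% over shape-agnostic methods. It also introduces a fast LiDAR-camera fusion method based on 2D bounding box camera detections that does not require perfectly calibrated or temporally synchronized sensors.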

## Contribution

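The paper presents AdaptiveShape, a shape-aware anchor distribution and heatmap approach for detecting complex vehicle shapes, together with a fast LiDAR-camera fusion method built on 2D bounding box camera detections. It also extends a standard point pillar network to use temporal data and extends ground truth augmentation to use grouped object pairs.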
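To make the idea of a shape-aware heatmap concrete, the following is a minimal sketch and not the authors' formulation, which this summary does not give. It assumes a bird's-eye-view grid and renders one ground truth box as a Gaussian whose spread follows the box length, width, and yaw, instead of a single isotropic radius. The function name `shape_aware_heatmap` and the parameter `sigma_scale` are hypothetical.

```python
import numpy as np


def shape_aware_heatmap(grid_size, cell_size, center_xy, length, width, yaw,
                        sigma_scale=0.25):
    """Render an oriented, anisotropic Gaussian target for one box (illustrative).

    grid_size: (rows, cols) of the bird's-eye-view grid.
    cell_size: metres per cell.
    center_xy: box centre in metres, with the origin at the grid corner.
    sigma_scale: assumed ratio between box extent and Gaussian sigma.
    """
    rows, cols = grid_size
    # Coordinates of every cell centre in metres.
    ys, xs = np.meshgrid((np.arange(rows) + 0.5) * cell_size,
                         (np.arange(cols) + 0.5) * cell_size,
                         indexing="ij")
    dx = xs - center_xy[0]
    dy = ys - center_xy[1]
    # Rotate offsets into the box frame: u along the length, v along the width.
    c, s = np.cos(yaw), np.sin(yaw)
    u = c * dx + s * dy
    v = -s * dx + c * dy
    # Per-axis spread follows the box dimensions, floored at one cell.
    sigma_u = max(sigma_scale * length, cell_size)
    sigma_v = max(sigma_scale * width, cell_size)
    return np.exp(-0.5 * ((u / sigma_u) ** 2 + (v / sigma_v) ** 2))


# Example: a 16 m x 2.5 m trailer aligned with the x axis.
heatmap = shape_aware_heatmap((80, 80), 0.5, (20.0, 20.0), 16.0, 2.5, 0.0)
```

With these assumptions, a long trailer produces a target that stays high along its length and falls off quickly across its width, whereas an isotropic target would treat both directions the same.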

## Key findings

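- +10.9% AP for large vehicles compared with shape-agnostic methods
- Fast LiDAR-camera fusion from 2D bounding box camera detections, without requiring perfect calibration or temporal synchronization
- Improved learning of complex object movements by extending a point pillar network with temporal data
- +2.2% truck AP from ground truth augmentation with grouped object pairs, compared with conventional augmentation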

## Abstract

3D object detection with point clouds and images plays an important role in perception tasks such as autonomous driving. Current methods perform well on detection and pose estimation of standard-shaped vehicles but lag behind on more complex shapes such as semi-trailer truck combinations. Determining the shape and motion of those special vehicles accurately is crucial in yard operation, maneuvering, and industrial automation applications. This work introduces several new methods to improve and measure the performance for such classes. State-of-the-art methods are based on predefined anchor grids or heatmaps for ground truth targets. However, the underlying representations do not take the shape of differently sized objects into account. Our main contribution, AdaptiveShape, uses shape-aware anchor distributions and heatmaps to improve the detection capabilities. For large vehicles we achieve +10.9% AP in comparison to current shape-agnostic methods. Furthermore, we introduce a new fast LiDAR-camera fusion. It is based on 2D bounding box camera detections, which are available in many processing pipelines. This fusion method does not rely on perfectly calibrated or temporally synchronized systems and is therefore applicable to a broad range of robotic applications. We extend a standard point pillar network to account for temporal data and improve learning of complex object movements. In addition, we extend a ground truth augmentation to use grouped object pairs, which further improves truck AP by +2.2% compared to conventional augmentation.
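The grouped ground truth augmentation mentioned at the end of the abstract can be illustrated with a small sketch. This is an assumption about how such grouping might work, not the paper's implementation: the boxes of a tractor and its trailer receive one shared random rigid transform in the ground plane, so their relative pose is preserved. The function name `transform_group`, the box layout `[x, y, z, length, width, height, yaw]`, and the parameters `max_shift` and `max_rot` are hypothetical.

```python
import numpy as np


def transform_group(boxes, rng, max_shift=2.0, max_rot=np.pi / 8):
    """Apply ONE random rigid ground-plane transform to all boxes in a group.

    boxes: array of shape (N, 7) with rows [x, y, z, length, width, height, yaw].
    rng: a numpy random Generator.
    Returns a transformed copy; the input is left unchanged.
    """
    boxes = np.asarray(boxes, dtype=float).copy()
    # Rotate about the group's mean centre so the pair turns as one unit.
    pivot = boxes[:, :2].mean(axis=0)
    angle = rng.uniform(-max_rot, max_rot)
    shift = rng.uniform(-max_shift, max_shift, size=2)
    c, s = np.cos(angle), np.sin(angle)
    rot = np.array([[c, -s], [s, c]])
    boxes[:, :2] = (boxes[:, :2] - pivot) @ rot.T + pivot + shift
    # Every yaw changes by the same angle, keeping the articulation angle fixed.
    boxes[:, 6] += angle
    return boxes


# Example: a tractor and a trailer treated as one group.
pair = np.array([[10.0, 5.0, 0.0, 6.0, 2.5, 3.5, 0.0],
                 [2.0, 5.0, 0.0, 13.6, 2.5, 4.0, 0.1]])
augmented = transform_group(pair, np.random.default_rng(0))
```

In a full pipeline the LiDAR points inside each box would need the same transform; the sketch covers only the box parameters.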

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2302.14522/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/2302.14522/full.md

## References

41 references — full list in the complete paper: https://tomesphere.com/paper/2302.14522/full.md

---
Source: https://tomesphere.com/paper/2302.14522