# BoxCars: Improving Fine-Grained Recognition of Vehicles using 3-D   Bounding Boxes in Traffic Surveillance

**Authors:** Jakub Sochor, Jakub \v{S}pa\v{n}hel, Adam Herout

arXiv: 1703.00686 · 2019-03-13

## TL;DR

This paper introduces a novel approach for fine-grained vehicle recognition in traffic surveillance using 3-D bounding boxes to normalize viewpoints, combined with data augmentation techniques, resulting in significant accuracy improvements.

## Contribution

The paper presents a new 3-D bounding box method for viewpoint normalization and a data augmentation strategy, enhancing CNN performance in vehicle recognition from diverse viewpoints.

## Key findings

- Accuracy increased by up to 12 percentage points.
- Error rate reduced by up to 50%.
- Outperforms existing state-of-the-art methods.

## Abstract

In this paper, we focus on fine-grained recognition of vehicles mainly in traffic surveillance applications. We propose an approach that is orthogonal to recent advancements in fine-grained recognition (automatic part discovery and bilinear pooling). In addition, in contrast to other methods focused on fine-grained recognition of vehicles, we do not limit ourselves to a frontal/rear viewpoint, but allow the vehicles to be seen from any viewpoint. Our approach is based on 3-D bounding boxes built around the vehicles. The bounding box can be automatically constructed from traffic surveillance data. For scenarios where it is not possible to use precise construction, we propose a method for an estimation of the 3-D bounding box. The 3-D bounding box is used to normalize the image viewpoint by "unpacking" the image into a plane. We also propose to randomly alter the color of the image and add a rectangle with random noise to a random position in the image during the training of convolutional neural networks (CNNs). We have collected a large fine-grained vehicle data set BoxCars116k, with 116k images of vehicles from various viewpoints taken by numerous surveillance cameras. We performed a number of experiments, which show that our proposed method significantly improves CNN classification accuracy (the accuracy is increased by up to 12% points and the error is reduced by up to 50% compared with CNNs without the proposed modifications). We also show that our method outperforms the state-of-the-art methods for fine-grained recognition.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1703.00686/full.md

## Figures

51 figures with captions in the complete paper: https://tomesphere.com/paper/1703.00686/full.md

## References

74 references — full list in the complete paper: https://tomesphere.com/paper/1703.00686/full.md

---
Source: https://tomesphere.com/paper/1703.00686