# Object Detection on Road: Vehicle’s Detection Based on Re-Training Models on NVIDIA-Jetson Platform

**Authors:** Sleiter Ramos-Sanchez, Jinmi Lezama, Ricardo Yauri, Joyce Zevallos

PMC · DOI: 10.3390/jimaging12010020 · Journal of Imaging · 2026-01-01

## TL;DR

This paper compares three SSD-based vehicle detection models trained on an NVIDIA Jetson Orin NX using traffic video from Lima's congested streets, and finds that MobileNetV1-SSD (512×512) offers the best balance between accuracy and computational load.

## Contribution

The study trains and evaluates three SSD-based models (MobileNetV1-SSD, MobileNetV2-SSD-Lite, and VGG16-SSD) on the NVIDIA Jetson Orin NX 16 GB platform, identifying which of them is best suited to vehicle detection on resource-constrained embedded systems.

## Key findings

- VGG16-SSD achieved the highest mean average precision (mAP ≈90.7%), but with a longer training time.
- MobileNetV1-SSD (512×512) achieved comparable accuracy (mAP ≈90.4%) with a shorter training time.
- Data augmentation through contrast adjustment improved detection of minority classes such as Tuk-tuk and Motorcycle.
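
This summary does not say how the contrast adjustment was implemented. As a rough illustration only, the sketch below assumes a simple linear scaling of pixel values about the image mean. The function name `adjust_contrast` and the `factor` values are hypothetical and not taken from the paper.

```python
import numpy as np

def adjust_contrast(image: np.ndarray, factor: float) -> np.ndarray:
    """Scale each pixel's deviation from the mean intensity by `factor`.

    Assumption: linear contrast scaling on an 8-bit image; the paper's
    actual augmentation procedure may differ.
    """
    img = image.astype(np.float32)
    mean = img.mean()
    # factor > 1 increases contrast, factor < 1 reduces it.
    out = (img - mean) * factor + mean
    # Round and clamp back into the valid 8-bit range.
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)

# Illustrative use on a tiny synthetic grayscale patch.
patch = np.array([[100, 150]], dtype=np.uint8)
boosted = adjust_contrast(patch, 2.0)
```

Because this kind of augmentation changes only pixel intensities, bounding-box labels for the augmented frame can be reused unchanged.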

## Abstract

The increasing use of artificial intelligence (AI) and deep learning (DL) techniques has driven advances in vehicle classification and detection applications for embedded devices with deployment constraints due to computational cost and response time. In the case of urban environments with high traffic congestion, such as the city of Lima, it is important to determine the trade-off between model accuracy, type of embedded system, and the dataset used. This study was developed using a methodology adapted from the CRISP-DM approach, which included the acquisition of traffic videos in the city of Lima, their segmentation, and manual labeling. Subsequently, three SSD-based detection models (MobileNetV1-SSD, MobileNetV2-SSD-Lite, and VGG16-SSD) were trained on the NVIDIA Jetson Orin NX 16 GB platform. The results show that the VGG16-SSD model achieved the highest average precision (mAP ≈90.7%), with a longer training time, while the MobileNetV1-SSD (512×512) model achieved comparable performance (mAP ≈90.4%) with a shorter time. Additionally, data augmentation through contrast adjustment improved the detection of minority classes such as Tuk-tuk and Motorcycle. The results indicate that, among the evaluated models, MobileNetV1-SSD (512×512) achieved the best balance between accuracy and computational load for its implementation in ADAS embedded systems in congested urban environments.
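
The mAP figures quoted above are means of per-class average precision (AP). The abstract does not state which AP protocol the authors used, so the sketch below assumes the common all-point interpolated area under the precision-recall curve. The function names and the sample precision-recall values are illustrative, not results from the paper.

```python
import numpy as np

def average_precision(recall, precision):
    """Area under the precision-recall curve (all-point interpolation).

    Assumption: `recall` is sorted in non-decreasing order and
    `precision` holds the matching precision at each recall level.
    """
    r = np.concatenate(([0.0], recall, [1.0]))
    p = np.concatenate(([0.0], precision, [0.0]))
    # Replace each precision with the maximum precision to its right,
    # giving a monotonically non-increasing envelope.
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # Sum rectangle areas wherever recall changes.
    idx = np.where(r[1:] != r[:-1])[0]
    return float(np.sum((r[idx + 1] - r[idx]) * p[idx + 1]))

def mean_average_precision(per_class_ap):
    """mAP is the unweighted mean of the per-class AP values."""
    return float(np.mean(list(per_class_ap.values())))

# Made-up precision-recall points for a single class.
ap_example = average_precision([0.5, 1.0], [1.0, 0.5])
# Made-up per-class AP values, using class names from the abstract.
map_example = mean_average_precision({"Tuk-tuk": 0.5, "Motorcycle": 1.0})
```

Since mAP weights every class equally, gains on minority classes move the overall score as much as gains on frequent ones.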

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12842717/full.md

## Figures

22 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12842717/full.md

## References

46 references — full list in the complete paper: https://tomesphere.com/paper/PMC12842717/full.md

---
Source: https://tomesphere.com/paper/PMC12842717