# The TDGL Module: A Fast Multi-Scale Vision Sensor Based on a Transformation Dilated Grouped Layer

**Authors:** Leilei Xie, Fenghua Zhu, Zhixue Wang

PMC · DOI: 10.3390/s25113339 · Sensors (Basel, Switzerland) · 2025-05-26

## TL;DR

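This paper introduces the TDGL module, a multi-scale feature extraction block for road object detection. Integrated into the backbone of several YOLO models, it reaches 40.3% mAP on BDD100K with around 3.1M parameters and runs at 58 FPS after transformation optimization.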

## Contribution

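The Transformation Dilated Grouped Layer (TDGL) module is built on a Global Layer Normalization Convolution (GLConv) unit, which combines learnable scaling and offset parameters, a modified dilation strategy, and grouped convolution to improve multi-scale feature extraction while reducing computational cost.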
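The effect of dilation and grouping on a convolution layer can be illustrated with standard arithmetic. The sketch below is not taken from the paper: the helper names `conv_params` and `effective_kernel` and the channel sizes are hypothetical, chosen only to show why grouping cuts the weight count and dilation widens the area a kernel covers.

```python
def conv_params(c_in, c_out, k, groups=1, bias=False):
    """Weight count of a 2D convolution with square k x k kernels."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each output channel only sees c_in / groups input channels.
    weights = (c_in // groups) * c_out * k * k
    return weights + (c_out if bias else 0)


def effective_kernel(k, dilation):
    """Span in pixels covered by a k-tap kernel with the given dilation."""
    return dilation * (k - 1) + 1


# Hypothetical layer sizes, used only for illustration.
dense = conv_params(64, 64, 3)               # standard 3x3 convolution
grouped = conv_params(64, 64, 3, groups=4)   # same layer split into 4 groups
span = effective_kernel(3, 2)                # 3x3 kernel with dilation 2

print(dense, grouped, span)
```

With these assumed sizes, splitting the layer into four groups divides the weight count by four, and a dilation of 2 lets a 3x3 kernel span a 5x5 window without adding any weights.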

## Key findings

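- The TDGL net, formed by integrating the TDGL module into the backbone of several YOLO models, reaches 40.3% mAP on the BDD100K dataset with around 3.1M parameters.
- After transformation optimization, the TDGL net reaches an inference speed of 58 FPS, which the authors report meets the real-time requirement for road obstacle detection in autonomous vehicles.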
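To put the reported frame rate in perspective, the short sketch below converts 58 FPS into a per-frame time budget. The helper names and the 100 km/h vehicle speed are assumptions added for illustration; the paper summary states only the FPS figure.

```python
def frame_budget_ms(fps):
    """Time available per frame, in milliseconds."""
    return 1000.0 / fps


def metres_per_frame(speed_kmh, fps):
    """Distance a vehicle covers between two consecutive frames."""
    speed_ms = speed_kmh * 1000.0 / 3600.0
    return speed_ms / fps


budget = frame_budget_ms(58)          # reported inference speed
travel = metres_per_frame(100, 58)    # 100 km/h is an assumed example speed

print(f"{budget:.1f} ms per frame, {travel:.2f} m travelled per frame")
```

At 58 FPS each frame has a budget of roughly 17 ms, and a vehicle at the assumed 100 km/h moves a little under half a metre between frames.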

## Abstract

Effectively capturing multi-scale object features is crucial for vision sensors used in road object detection tasks. Traditional spatial pyramid pooling methods fuse multi-scale feature information but lack adaptability in dynamically adjusting convolution operations based on their actual needs. This limitation prevents them from fully utilizing spatial hierarchies and contextual information. To address this challenge, we propose a Transformation Dilated Grouped Layer (TDGL) module, a fast multi-scale vision sensor based on deep learning, designed to enhance both efficiency and accuracy in road target feature extraction networks. The TDGL is built upon the Global Layer Normalization Convolution (GLConv) unit, which mitigates internal covariate shift by introducing scaling and offset parameters, modifying dilation strategies, and employing grouped convolution. These improvements enable the network to distinguish features at different scales effectively while optimizing spatial information processing and reducing computational costs. To validate its effectiveness, we integrate the TDGL module into the backbone of several YOLO models, forming the TDGL Net feature extractor. The experimental results obtained on the BDD100K dataset show that the mAP of the TDGL net reaches 40.3% with around 3.1M parameters. The inference speed of the TDGL net after transformation optimization reaches 58 FPS, which meets the requirement for the real-time detection of road obstacle targets by autonomous vehicles.
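The abstract describes GLConv as normalizing features with learnable scaling and offset parameters. The exact formulation is not given in this summary, so the sketch below shows only the generic form of a global layer normalization, under the assumption that statistics are computed over all channels and spatial positions of one feature map and that the scale (`gamma`) and offset (`beta`) are applied per channel. The function name and the toy input are hypothetical.

```python
import math


def global_layer_norm(x, gamma, beta, eps=1e-5):
    """Normalize a feature map over all channels and positions.

    x is a list of C channels, each a list of H rows of W floats.
    gamma and beta hold one scale and one offset per channel.
    """
    values = [v for channel in x for row in channel for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    inv_std = 1.0 / math.sqrt(var + eps)
    return [
        [[g * (v - mean) * inv_std + b for v in row] for row in channel]
        for channel, g, b in zip(x, gamma, beta)
    ]


# Toy 2-channel, 2x2 feature map (illustrative values only).
x = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]
y = global_layer_norm(x, gamma=[1.0, 1.0], beta=[0.0, 0.0])

flat = [v for channel in y for row in channel for v in row]
out_mean = sum(flat) / len(flat)
out_var = sum((v - out_mean) ** 2 for v in flat) / len(flat)
print(out_mean, out_var)
```

With unit scale and zero offset the output has approximately zero mean and unit variance across the whole feature map; during training, `gamma` and `beta` would let the network rescale and shift each channel after normalization.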

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12158040/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12158040/full.md

## References

28 references — full list in the complete paper: https://tomesphere.com/paper/PMC12158040/full.md

---
Source: https://tomesphere.com/paper/PMC12158040