# Wavelet-CNet: Wavelet Cross Fusion and Detail Enhancement Network for RGB-Thermal Semantic Segmentation

**Authors:** Wentao Zhang, Qi Zhang, Yue Yan

PMC · DOI: 10.3390/s26031067 · Sensors (Basel, Switzerland) · 2026-02-06

## TL;DR

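Wavelet-CNet is an RGB-thermal (RGB-T) semantic segmentation network that fuses the two modalities using wavelet-decomposed features and enhances detail with cross-scale thermal context, reaching 58.3% mIoU on MFNet and 85.77% mIoU on PST900.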

## Contribution

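Two new modules strengthen RGB-thermal feature interaction: a Wavelet Cross Fusion Module (WCFM), which uses wavelet transforms to split RGB and thermal features into low- and high-frequency components and feeds them into attention mechanisms for dual-modal feature reconstruction, and a Cross-Scale Detail Enhancement Module (CSDEM), which injects cross-scale contextual information from the thermal infrared (TIR) branch into each fusion stage.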

## Key findings

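- Wavelet-CNet achieves 58.3% mIoU on MFNet and 85.77% mIoU on PST900.
- Ablation studies on MFNet validate both proposed modules, WCFM and CSDEM.
- WCFM targets feature complementarity between the RGB and thermal branches, while CSDEM aligns global localization using contour information from thermal features.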
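For readers unfamiliar with the metric, mIoU (mean intersection over union) averages the per-class ratio TP / (TP + FP + FN). The following is a minimal pure-Python sketch of that definition. The function name `mean_iou` and the toy labels are illustrative assumptions, and this summary does not state the paper's exact evaluation protocol (for example, how classes absent from the test set are handled).

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over flattened label sequences (hypothetical helper).

    pred, target: equal-length sequences of integer class ids.
    Classes absent from both sequences are skipped (an assumption;
    evaluation protocols differ on this point).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    ious = []
    for c in range(num_classes):
        tp = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        fp = sum(1 for p, t in zip(pred, target) if p == c and t != c)
        fn = sum(1 for p, t in zip(pred, target) if p != c and t == c)
        union = tp + fp + fn
        if union:
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: per-class IoU is 1/3, 2/3 and 1/2, so the mean is 0.5.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, num_classes=3))
```

Benchmark scores are usually computed from counts accumulated over the whole test set rather than averaged per image, so this sketch should be read as the definition of the metric, not as a reproduction of the reported numbers.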

## Abstract

Leveraging thermal infrared imagery to complement RGB spatial information is a key technology in industrial sensing. This technology enables mobile devices to perform scene understanding through RGB-T semantic segmentation. However, existing networks conduct only limited information interaction between modalities and lack specific designs to exploit the thermal aggregation entropy of the thermal modality, resulting in inefficient feature complementarity within bilateral structures. To address these challenges, we propose Wavelet-CNet for RGB-T semantic segmentation. Specifically, we design a Wavelet Cross Fusion Module (WCFM) that applies wavelet transforms to separately extract four types of low- and high-frequency information from RGB and thermal features, which are then fed back into attention mechanisms for dual-modal feature reconstruction. Furthermore, a Cross-Scale Detail Enhancement Module (CSDEM) introduces cross-scale contextual information from the TIR branch into each fusion stage, aligning global localization through contour information from thermal features. Wavelet-CNet achieves competitive mIoU scores of 58.3% and 85.77% on MFNet and PST900, respectively, while ablation studies on MFNet further validate the effectiveness of the proposed WCFM and CSDEM modules.
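The "four types of low- and high-frequency information" mentioned in the abstract presumably correspond to the four sub-bands of a single-level 2D discrete wavelet transform. The sketch below illustrates that decomposition with a Haar wavelet on a small 2D map. The Haar basis, the function name `haar_dwt2`, and the toy input are assumptions made for illustration: this summary does not state which wavelet family WCFM uses, and the attention-based reconstruction step is not shown.

```python
def haar_dwt2(x):
    """Single-level 2D Haar transform of a 2D list (hypothetical helper).

    x must have even height and width. Returns four half-resolution
    sub-bands (ll, lh, hl, hh). Sub-band naming conventions vary
    between libraries; the comments below describe what each one measures.
    """
    h, w = len(x), len(x[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    ll, lh, hl, hh = [], [], [], []
    for i in range(0, h, 2):
        row_ll, row_lh, row_hl, row_hh = [], [], [], []
        for j in range(0, w, 2):
            a, b = x[i][j], x[i][j + 1]
            c, d = x[i + 1][j], x[i + 1][j + 1]
            row_ll.append((a + b + c + d) / 2)  # local average (low frequency)
            row_lh.append((a + b - c - d) / 2)  # top minus bottom row
            row_hl.append((a - b + c - d) / 2)  # left minus right column
            row_hh.append((a - b - c + d) / 2)  # diagonal detail
        ll.append(row_ll)
        lh.append(row_lh)
        hl.append(row_hl)
        hh.append(row_hh)
    return ll, lh, hl, hh


# Toy single-channel map with a vertical edge between columns 0 and 1.
feature = [
    [1, 5, 5, 5],
    [1, 5, 5, 5],
    [1, 5, 5, 5],
    [1, 5, 5, 5],
]
ll, lh, hl, hh = haar_dwt2(feature)
print(ll)  # [[6.0, 10.0], [6.0, 10.0]]
print(hl)  # [[-4.0, 0.0], [-4.0, 0.0]]
```

In this example only the left-minus-right sub-band responds to the edge, while the low-frequency sub-band keeps a half-resolution summary of the map. In a fusion module, the same decomposition would be applied to each channel of the RGB and thermal feature maps before the sub-bands are combined.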

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12900084/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12900084/full.md

## References

43 references — full list in the complete paper: https://tomesphere.com/paper/PMC12900084/full.md

---
Source: https://tomesphere.com/paper/PMC12900084