# VTC-Net: A Semantic Segmentation Network for Ore Particles Integrating Transformer and Convolutional Block Attention Module (CBAM)

**Authors:** Yijing Wu, Weinong Liang, Jiandong Fang, Chunxia Zhou, Xiaolu Sun

PMC · DOI: 10.3390/s26030787 · Sensors (Basel, Switzerland) · 2026-01-24

## TL;DR

This paper introduces VTC-Net, a new image segmentation model that improves accuracy in analyzing ore particle sizes for mineral processing.

## Contribution

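VTC-Net pairs a VGG16 backbone encoder with Transformer modules in its deeper layers and a Convolutional Block Attention Module (CBAM) at the fourth stage, targeting ore particles that vary in scale, adhere to one another, or are occluded.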

## Key findings

- VTC-Net achieved 89.90% MIoU and 96.80% pixel accuracy on ore image datasets.
- The model outperformed UNet and DeepLabV3 in handling multi-scale and occluded ore particles.
- Ablation studies confirmed the effectiveness of the Transformer and CBAM modules.
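
For readers unfamiliar with the two headline metrics, the following is a minimal sketch of how mean intersection over union (MIoU) and pixel accuracy are commonly computed from a confusion matrix. The function names and the toy labels are illustrative assumptions, not taken from the paper, and the paper's exact evaluation protocol (for example, whether the background class is averaged in) is not stated in this summary.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels per (true class, predicted class) pair.

    y_true and y_pred are flat sequences of integer class labels,
    one entry per pixel (hypothetical input format).
    """
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def pixel_accuracy(cm):
    """Fraction of all pixels whose predicted class equals the true class."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def mean_iou(cm):
    """Average of per-class IoU = TP / (TP + FP + FN).

    Classes that appear in neither the labels nor the predictions
    are skipped so they do not distort the mean.
    """
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, truly another class
        fn = sum(cm[c]) - tp                       # truly c, predicted another class
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious)


# Toy example: 0 = background, 1 = ore (labels are made up for illustration).
y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=2)
print(cm)                  # [[2, 1], [1, 2]]
print(pixel_accuracy(cm))  # 4 correct pixels out of 6
print(mean_iou(cm))        # 0.5 (each class has IoU 2 / 4)
```

Pixel accuracy rewards getting the dominant class right, while MIoU weights every class equally, which is why segmentation papers usually report both.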

## Abstract

In mineral processing, vision-based online particle size analysis systems depend on high-precision image segmentation to accurately quantify ore particle size distribution, thereby optimizing crushing and sorting operations. However, due to multi-scale variations, severe adhesion, and occlusion within ore particle clusters, existing segmentation models often exhibit undersegmentation and misclassification, leading to blurred boundaries and limited generalization. To address these challenges, this paper proposes a novel semantic segmentation model named VTC-Net. The model employs VGG16 as the backbone encoder, integrates Transformer modules in deeper layers to capture global contextual dependencies, and incorporates a Convolutional Block Attention Module (CBAM) at the fourth stage to enhance focus on critical regions such as adhesion edges. BatchNorm layers are used to stabilize training. Experiments on ore image datasets show that VTC-Net outperforms mainstream models such as UNet and DeepLabV3 in key metrics, including MIoU (89.90%) and pixel accuracy (96.80%). Ablation studies confirm the effectiveness and complementary role of each module. Visual analysis further demonstrates that the model identifies ore contours and adhesion areas more accurately, significantly improving segmentation robustness and precision under complex operational conditions.
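
The two attention mechanisms named in the abstract have standard formulations, sketched here for orientation. These follow the original CBAM and Transformer definitions. VTC-Net's own settings (reduction ratio, kernel size, number of heads, where exactly the blocks sit) are not given in this summary, so the equations should be read as the generic form rather than the authors' exact configuration.

```latex
\begin{aligned}
% CBAM channel attention: shared MLP over average- and max-pooled descriptors
M_c(F) &= \sigma\big(\mathrm{MLP}(\mathrm{AvgPool}(F)) + \mathrm{MLP}(\mathrm{MaxPool}(F))\big),
  & F' &= M_c(F) \otimes F, \\
% CBAM spatial attention: convolution over channel-wise pooled maps
M_s(F') &= \sigma\big(f^{7\times 7}([\mathrm{AvgPool}(F');\ \mathrm{MaxPool}(F')])\big),
  & F'' &= M_s(F') \otimes F', \\
% Transformer scaled dot-product self-attention
\mathrm{Attention}(Q, K, V) &= \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V .
\end{aligned}
```

Here $F$ is the input feature map, $\sigma$ is the sigmoid function, $\otimes$ is element-wise multiplication with broadcasting, $f^{7\times 7}$ is a convolution with a $7\times 7$ kernel (the size used in the original CBAM formulation), and $d_k$ is the key dimension. Channel attention reweights which feature channels matter, spatial attention reweights where in the image to look (for example, adhesion edges), and self-attention lets every position attend to every other, which is how global context is captured.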

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12899373/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12899373/full.md

## References

46 references — full list in the complete paper: https://tomesphere.com/paper/PMC12899373/full.md

---
Source: https://tomesphere.com/paper/PMC12899373