# CAs-Net: A Channel-Aware Speech Network for Uyghur Speech Recognition

**Authors:** Jiang Zhang, Miaomiao Xu, Lianghui Xu, Yajing Ma

PMC · DOI: 10.3390/s25123783 · Sensors (Basel, Switzerland) · 2025-06-17

## TL;DR

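This paper introduces CAs-Net, a channel-aware speech recognition model for low-resource languages such as Uyghur. On a real Uyghur speech dataset it reaches an average Word Error Rate (WER) of 5.72% and is reported to be robust under noisy conditions.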

## Contribution

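CAs-Net is a channel-aware architecture built on two new modules: the Channel Rotation Module (CIM), which models local structural relationships within the channel dimension, and the Multi-Scale Depthwise Convolution Module (MSDCM), which captures multi-scale temporal patterns inside the Transformer framework.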

## Key findings

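- CAs-Net achieved the best performance across multiple subsets of a real Uyghur speech recognition dataset, with an average Word Error Rate (WER) of 5.72%.
- The authors report that the model significantly outperforms existing approaches under low-resource and noisy conditions.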
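
For readers unfamiliar with the metric, WER is conventionally the word-level edit distance (substitutions, deletions, insertions) between a hypothesis and its reference transcript, divided by the number of reference words. The sketch below illustrates that standard definition only; the function name `wer` and the whitespace tokenization are assumptions for illustration, and this summary does not state how the paper tokenizes Uyghur text or averages across subsets.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level Levenshtein distance / reference length.

    Assumption: words are separated by whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# One substitution and one deletion against a four-word reference.
print(wer("a b c d", "a x c"))
```

Under this definition, an average WER of 5.72% corresponds to roughly 5.7 word errors per 100 reference words.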

## Abstract

This paper proposes a Channel-Aware Speech Network (CAs-Net) for low-resource speech recognition tasks, aiming to improve recognition performance for languages such as Uyghur under complex noisy conditions. The proposed model consists of two key components: (1) the Channel Rotation Module (CIM), which reconstructs each frame’s channel vector into a spatial structure and applies a rotation operation to explicitly model the local structural relationships within the channel dimension, thereby enhancing the encoder’s contextual modeling capability; and (2) the Multi-Scale Depthwise Convolution Module (MSDCM), integrated within the Transformer framework, which leverages multi-branch depthwise separable convolutions and a lightweight self-attention mechanism to jointly capture multi-scale temporal patterns, thus improving the model’s perception of compact articulation and complex rhythmic structures. Experiments conducted on a real Uyghur speech recognition dataset demonstrate that CAs-Net achieves the best performance across multiple subsets, with an average Word Error Rate (WER) of 5.72%, significantly outperforming existing approaches. These results validate the robustness and effectiveness of the proposed model under low-resource and noisy conditions.
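
The two modules described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the square grid shape, the 90-degree rotation, the kernel sizes, the fixed averaging kernels (standing in for learned weights), and the branch averaging are all assumptions chosen for illustration, and the function names are hypothetical. The pointwise convolution and the lightweight self-attention that the abstract mentions are omitted.

```python
import numpy as np


def channel_rotation(x, side, k=1):
    """Reshape each frame's channel vector into a (side, side) grid,
    rotate it by k * 90 degrees, and flatten it back.

    x: array of shape (T, C) with C == side * side (assumed square grid).
    """
    x = np.asarray(x, dtype=float)
    T, C = x.shape
    if C != side * side:
        raise ValueError("channel count must equal side * side")
    grid = x.reshape(T, side, side)
    rotated = np.rot90(grid, k=k, axes=(1, 2))  # rotate every frame's grid
    return rotated.reshape(T, C)


def depthwise_conv1d(x, kernels):
    """Convolve each channel along time with its own kernel ('same' length).

    x: (T, C); kernels: (C, K) with K <= T. np.convolve flips the kernel,
    which makes no difference for the symmetric kernels used below.
    """
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    for c in range(x.shape[1]):
        out[:, c] = np.convolve(x[:, c], kernels[c], mode="same")
    return out


def multi_scale_depthwise(x, kernel_sizes=(3, 5, 7)):
    """Average several depthwise branches with different temporal extents."""
    x = np.asarray(x, dtype=float)
    C = x.shape[1]
    branches = []
    for size in kernel_sizes:
        # Placeholder averaging kernels; a trained model would learn these.
        kernels = np.full((C, size), 1.0 / size)
        branches.append(depthwise_conv1d(x, kernels))
    return np.mean(branches, axis=0)


frames = np.arange(8, dtype=float).reshape(2, 4)  # 2 frames, 4 channels
print(channel_rotation(frames, side=2))
print(multi_scale_depthwise(np.ones((20, 4))).shape)
```

In this sketch the rotation only permutes channel positions within each frame, so the tensor shape is preserved, and each depthwise branch covers a different temporal span, which is the sense in which the branches are multi-scale.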

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12196623/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12196623/full.md

## References

35 references — full list in the complete paper: https://tomesphere.com/paper/PMC12196623/full.md

---
Source: https://tomesphere.com/paper/PMC12196623