# A KWS System for Edge-Computing Applications with Analog-Based Feature Extraction and Learned Step Size Quantized Classifier

**Authors:** Yukai Shen, Binyi Wu, Dietmar Straeussnigg, Eric Gutierrez

PMC · DOI: 10.3390/s25082550 · Sensors (Basel, Switzerland) · 2025-04-17

## TL;DR

This paper presents a low-power keyword spotting system for edge devices using analog-based feature extraction and a quantized digital classifier.

## Contribution

The novelty lies in combining a behaviorally modeled 16-channel analog filter bank for feature extraction with a GRU classifier trained using Learned Step Size (LSQ), LUT-aware quantization, targeting ultra-low-power edge computing.
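
To make the quantization idea concrete, here is a minimal sketch of the forward pass of an LSQ-style uniform quantizer. It is not the authors' implementation: the function names `lsq_quantize` and `lsq_init_step` are hypothetical, the step-size initialization is the heuristic commonly cited from the LSQ literature, and the training logic that actually learns the step size (and any LUT-aware handling) is omitted.

```python
import math

def lsq_quantize(values, step, bits=4, signed=True):
    """Forward pass only: scale by the step size, clip, round, rescale.

    Returns (integer codes, dequantized values). In LSQ the step size is a
    trainable parameter; here it is simply passed in (assumption).
    """
    if signed:
        q_neg, q_pos = 2 ** (bits - 1), 2 ** (bits - 1) - 1
    else:
        q_neg, q_pos = 0, 2 ** bits - 1
    codes = []
    for v in values:
        scaled = max(-q_neg, min(q_pos, v / step))  # clip to the code range
        codes.append(int(round(scaled)))            # Python rounds ties to even
    return codes, [c * step for c in codes]

def lsq_init_step(values, bits=4, signed=True):
    """Initial step size heuristic: 2 * mean(|v|) / sqrt(q_pos)."""
    q_pos = 2 ** (bits - 1) - 1 if signed else 2 ** bits - 1
    mean_abs = sum(abs(v) for v in values) / len(values)
    return 2.0 * mean_abs / math.sqrt(q_pos)

# Illustrative values only, not taken from the paper.
weights = [-1.0, -0.26, 0.0, 0.12, 0.9]
codes, dequantized = lsq_quantize(weights, step=0.1, bits=4)
```

With 4-bit signed codes the representable range is -8 to 7, so the two outer weights in the example saturate; a learned step size lets training trade that clipping error against rounding error.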

## Key findings

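- The quantized model (4-bit weights, 8-bit activations) achieves 91.35% accuracy across 12 classes, including 10 GSCDv2 keywords, with less than 1% degradation compared to its full-precision counterpart.
- The model is estimated to require only 34.8 kB of memory and 62,400 MAC operations per inference.
- Classification performance remains strong when Gaussian noise is injected and the filter center frequency and quality factor are perturbed in the test data.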
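
Memory and MAC figures of this kind can be estimated from layer dimensions. The sketch below shows the arithmetic for a single GRU layer under stated assumptions: the helper names are hypothetical, only the three gate matrix products are counted, biases and element-wise operations are ignored, and the hidden size of 64 is an illustrative value, not the paper's configuration. Only the input size of 16 comes from the 16-channel filter bank described in the source.

```python
def gru_macs_per_step(input_size, hidden_size):
    """MACs for one GRU time step: three gates, each with an input matrix
    (hidden x input) and a recurrent matrix (hidden x hidden)."""
    return 3 * hidden_size * (input_size + hidden_size)

def weight_bytes(n_weights, bits):
    """Storage for n_weights values packed at the given bit width."""
    return n_weights * bits / 8

# Hypothetical layer: 16 input channels, 64 hidden units (assumed).
macs = gru_macs_per_step(input_size=16, hidden_size=64)
n_weights = macs  # one weight per MAC in the gate matrices
size_4bit = weight_bytes(n_weights, bits=4)
size_fp32 = weight_bytes(n_weights, bits=32)
```

For this hypothetical layer, 4-bit weights take one eighth of the storage of 32-bit floats, which is the kind of saving that W4A8 quantization targets.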

## Abstract

Edge-computing applications demand ultra-low-power architectures for both feature extraction and classification tasks. In this manuscript, a Keyword Spotting (KWS) system tailored for energy-constrained portable environments is proposed. A 16-channel analog filter bank is employed for audio feature extraction, followed by a digital Gated Recurrent Unit (GRU) classifier. The filter bank is behaviorally modeled, making use of second-order band-pass transfer functions, simulating the analog front-end (AFE) processing. To enable efficient deployment, the GRU classifier is trained using a Learned Step Size (LSQ) and Look-Up Table (LUT)-aware quantization method. The resulting quantized model, with 4-bit weights and 8-bit activation functions (W4A8), achieves 91.35% accuracy across 12 classes, including 10 keywords from the Google Speech Command Dataset v2 (GSCDv2), with less than 1% degradation compared to its full-precision counterpart. The model is estimated to require only 34.8 kB of memory and 62,400 multiply–accumulate (MAC) operations per inference in real-time settings. Furthermore, the robustness of the AFE against noise and analog impairments is evaluated by injecting Gaussian noise and perturbing the filter parameters (center frequency and quality factor) in the test data, respectively. The obtained results confirm a strong classification performance even under degraded circuit-level conditions, supporting the suitability of the proposed system for ultra-low-power, noise-resilient edge applications.
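
The behavioral filter model and the impairment test described above can be illustrated with a short sketch. It assumes the standard unity-peak-gain second-order band-pass form H(s) = (ω0/Q)s / (s² + (ω0/Q)s + ω0²), logarithmically spaced center frequencies, and a multiplicative Gaussian perturbation; the function names, the 100 Hz to 8 kHz range, Q = 2, and the 5% spread are all hypothetical choices, since this summary does not state the paper's values.

```python
import math
import random

def bandpass_gain(f, f0, q):
    """Magnitude of H(s) = (w0/Q)s / (s^2 + (w0/Q)s + w0^2) at s = j*2*pi*f."""
    w, w0 = 2 * math.pi * f, 2 * math.pi * f0
    s = complex(0.0, w)
    return abs((w0 / q) * s / (s * s + (w0 / q) * s + w0 * w0))

def center_frequencies(n=16, f_lo=100.0, f_hi=8000.0):
    """n log-spaced center frequencies (spacing and range are assumptions)."""
    ratio = (f_hi / f_lo) ** (1.0 / (n - 1))
    return [f_lo * ratio ** k for k in range(n)]

def perturb(value, rel_sigma, rng):
    """Scale a nominal parameter by (1 + Gaussian error) to mimic mismatch."""
    return value * (1.0 + rng.gauss(0.0, rel_sigma))

rng = random.Random(0)              # fixed seed for repeatability
bank = center_frequencies()
nominal_f0, nominal_q = bank[8], 2.0
f0_shifted = perturb(nominal_f0, 0.05, rng)
q_shifted = perturb(nominal_q, 0.05, rng)

# Gain seen by a tone at the nominal center frequency, before and after drift.
gain_nominal = bandpass_gain(nominal_f0, nominal_f0, nominal_q)
gain_drifted = bandpass_gain(nominal_f0, f0_shifted, q_shifted)
```

Because this transfer function peaks at unity gain exactly at its center frequency, any drift in the center frequency can only lower the gain seen by a tone at the nominal frequency, which is one way such impairments distort the extracted features.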

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12031143/full.md

## Figures

17 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12031143/full.md

## References

37 references — full list in the complete paper: https://tomesphere.com/paper/PMC12031143/full.md

---
Source: https://tomesphere.com/paper/PMC12031143