A 23 μW Keyword Spotting IC with Ring-Oscillator-Based Time-Domain Feature Extraction
Kwantae Kim, Chang Gao, Rui Graça, Ilya Kiselev, Hoi-Jun Yoo, Tobi Delbruck, Shih-Chii Liu

TL;DR
This paper introduces a low-power, fully time-domain keyword spotting integrated circuit that leverages ring-oscillator-based feature extraction, achieving high accuracy and low latency suitable for edge applications.
Contribution
It presents the first KWS IC using ring-oscillator-based time-domain feature extraction, demonstrating improved scalability and power efficiency in a 65 nm CMOS process.
Findings
Achieves 86% accuracy on Google Speech Command Dataset
Consumes only 23 μW power in total
Operates with 12.4 ms latency
Abstract
This article presents the first keyword spotting (KWS) IC which uses a ring-oscillator-based time-domain processing technique for its analog feature extractor (FEx). Its extensive usage of time-encoding schemes allows the analog audio signal to be processed in a fully time-domain manner except for the voltage-to-time conversion stage of the analog front-end. Benefiting from fundamental building blocks based on digital logic gates, it offers a better technology scalability compared to conventional voltage-domain designs. Fabricated in a 65 nm CMOS process, the prototyped KWS IC occupies 2.03 mm² and dissipates 23 μW power consumption including analog FEx and digital neural network classifier. The 16-channel time-domain FEx achieves 54.89 dB dynamic range for 16 ms frame shift size while consuming 9.3 μW. The measurement result verifies that the proposed IC performs a 12-class…
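To put the reported power figures in perspective, the short sketch below works out the energy spent per 16 ms frame shift and the share of the 23 μW total drawn by the 9.3 μW feature extractor. It uses only the numbers quoted in the abstract; the variable and function names are illustrative, and the "non-FEx" remainder is simply the difference of the two reported figures, not a classifier power reported by the authors.

```python
# Back-of-the-envelope energy budget from the figures quoted in the abstract.
# Names are illustrative; nothing here comes from the authors' own tooling.

TOTAL_POWER_W = 23e-6    # whole KWS IC (analog FEx + digital classifier)
FEX_POWER_W = 9.3e-6     # 16-channel time-domain feature extractor
FRAME_SHIFT_S = 16e-3    # frame shift size


def energy_per_frame(power_w: float, frame_shift_s: float = FRAME_SHIFT_S) -> float:
    """Energy in joules consumed during one frame shift at constant power."""
    return power_w * frame_shift_s


# Fraction of the total power budget used by the feature extractor.
fex_share = FEX_POWER_W / TOTAL_POWER_W

# Assumption: everything that is not the FEx is lumped into one remainder.
non_fex_power_w = TOTAL_POWER_W - FEX_POWER_W

total_energy_j = energy_per_frame(TOTAL_POWER_W)
fex_energy_j = energy_per_frame(FEX_POWER_W)

print(f"FEx share of total power: {fex_share:.1%}")
print(f"Non-FEx power: {non_fex_power_w * 1e6:.1f} uW")
print(f"Energy per frame shift (total): {total_energy_j * 1e9:.1f} nJ")
print(f"Energy per frame shift (FEx): {fex_energy_j * 1e9:.1f} nJ")
```

Under these figures the chip spends roughly 368 nJ per 16 ms frame shift, of which about 149 nJ goes to feature extraction.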
