Memory-efficient Speech Recognition on Smart Devices
Ganesh Venkatesh, Alagappan Valliappan, Jay Mahadeokar, Yuan Shangguan, Christian Fuegen, Michael L. Seltzer, Vikas Chandra

TL;DR
This paper introduces optimized transducer speech recognition models for smart devices that significantly reduce off-chip memory access and model size, enhancing energy efficiency and usability on low-power devices.
Contribution
It presents novel architectural and recurrent cell design improvements that cut off-chip memory accesses by 4.5x and halve model size with minimal accuracy loss.
Findings
Memory access dominates energy consumption in transducer models.
Model architecture influences off-chip memory access more than size alone.
Optimizations reduce memory accesses by 4.5x and model size by 2x.
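A rough way to see why model size alone is a poor proxy for off-chip traffic is the back-of-the-envelope sketch below. It assumes, as a simplification, that nothing is cached on-chip, so a layer's weights are re-read from off-chip memory each time the layer runs. The layer names, parameter counts, step rates, and one-byte weights are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Layer:
    name: str
    params: int            # number of weights in the layer (hypothetical)
    steps_per_second: int  # how often the layer runs per second of audio


def model_size_bytes(layers: List[Layer], bytes_per_param: int = 1) -> int:
    """Total storage needed for the weights."""
    return sum(layer.params for layer in layers) * bytes_per_param


def offchip_bytes_per_second(layers: List[Layer], bytes_per_param: int = 1) -> int:
    """Weight bytes read per second of audio.

    Assumption: no on-chip weight reuse, so every invocation of a layer
    re-reads all of that layer's weights from off-chip memory.
    """
    return sum(layer.params * layer.steps_per_second for layer in layers) * bytes_per_param


# Two hypothetical models with identical parameter counts.
# model_b runs its larger block at a quarter of the step rate.
model_a = [
    Layer("lower_block", 2_000_000, 100),
    Layer("upper_block", 8_000_000, 100),
]
model_b = [
    Layer("lower_block", 2_000_000, 100),
    Layer("upper_block", 8_000_000, 25),
]

if __name__ == "__main__":
    for name, model in (("model_a", model_a), ("model_b", model_b)):
        print(name, model_size_bytes(model), offchip_bytes_per_second(model))
```

Under these assumptions both models occupy the same amount of storage, yet the second reads 2.5x fewer weight bytes per second of audio, because most of its parameters are touched less often. This only illustrates the general effect; it does not reproduce the paper's specific optimizations or its 4.5x figure.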
Abstract
Recurrent transducer models have emerged as a promising solution for speech recognition on current and next-generation smart devices. Transducer models provide competitive accuracy within a reasonable memory footprint, alleviating the memory capacity constraints of these devices. However, these models access parameters from off-chip memory at every input time step, which adversely affects device battery life and limits their usability on low-power devices. We address the memory access concerns of transducer models by optimizing their architecture and designing novel recurrent cells. We demonstrate that i) a model's energy cost is dominated by accessing model weights from off-chip memory, ii) transducer model architecture is pivotal in determining the number of accesses to off-chip memory, and model size alone is not a good proxy, iii) our transducer model optimizations and…
