Streaming Anchor Loss: Augmenting Supervision with Temporal Significance
Utkarsh Oggy Sarawgi, John Berkowitz, Vineet Garg, Arnav Kundu, Minsik Cho, Sai Srujana Buddi, Saurabh Adya, Ahmed Tewfik

TL;DR
This paper introduces Streaming Anchor Loss (SAL), a novel loss function that emphasizes learning from critical frames in streaming models, improving accuracy and latency without extra data or parameters.
Contribution
The paper proposes SAL, a loss that dynamically modulates the frame-wise cross entropy so that training concentrates on important frames, making better use of a fixed learning capacity on resource-constrained platforms.
Findings
SAL improves task accuracy in streaming speech models.
SAL reduces latency without increasing model complexity.
Experimental results confirm SAL's effectiveness across multiple tasks.
Abstract
Streaming neural network models for fast frame-wise responses to various speech and sensory signals are widely adopted on resource-constrained platforms. Hence, increasing the learning capacity of such streaming models (i.e., by adding more parameters) to improve the predictive power may not be viable for real-world tasks. In this work, we propose a new loss, Streaming Anchor Loss (SAL), to better utilize the given learning capacity by encouraging the model to learn more from essential frames. More specifically, our SAL and its focal variations dynamically modulate the frame-wise cross entropy loss based on the importance of the corresponding frames so that a higher loss penalty is assigned for frames within the temporal proximity of semantically critical events. Therefore, our loss ensures that the model training focuses on predicting the relatively rare but task-relevant frames.…
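The weighting idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact form of SAL, so everything here is an assumption made for illustration: the function names, the Gaussian-shaped weight around each anchor frame, and the `width`, `floor`, and `gain` parameters are hypothetical, and the focal variations are not reproduced. The sketch only shows a frame-wise cross entropy whose per-frame penalty grows near a semantically critical frame.

```python
import math


def proximity_weights(num_frames, anchor_frames, width=2.0, floor=1.0, gain=4.0):
    """Hypothetical importance weights that peak at the anchor frames.

    weight[t] = floor + gain * exp(-d_t^2 / (2 * width^2)),
    where d_t is the distance from frame t to the nearest anchor.
    This shape is an assumption, not the paper's definition.
    """
    weights = []
    for t in range(num_frames):
        dist = min(abs(t - a) for a in anchor_frames)
        weights.append(floor + gain * math.exp(-(dist ** 2) / (2.0 * width ** 2)))
    return weights


def frame_cross_entropy(logits, label):
    """Cross entropy for one frame, using a numerically stable log-sum-exp."""
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_norm - logits[label]


def weighted_streaming_loss(frame_logits, frame_labels, weights):
    """Frame-wise cross entropy modulated by per-frame weights.

    Normalized by the sum of the weights so the scale stays comparable
    to an unweighted mean.
    """
    total = sum(
        w * frame_cross_entropy(z, y)
        for z, y, w in zip(frame_logits, frame_labels, weights)
    )
    return total / sum(weights)


if __name__ == "__main__":
    # Ten frames, two classes, one assumed anchor (critical event) at frame 4.
    weights = proximity_weights(10, anchor_frames=[4])
    labels = [0] * 10

    def loss_with_error_at(k):
        logits = [[5.0, 0.0] for _ in range(10)]
        logits[k] = [0.0, 5.0]  # one confidently wrong frame
        return weighted_streaming_loss(logits, labels, weights)

    print("error at anchor frame:", loss_with_error_at(4))
    print("error far from anchor:", loss_with_error_at(0))
```

Under these assumptions, the same prediction error costs more when it falls on or near the anchor frame than when it falls far from it, which is the behaviour the abstract describes as assigning a higher loss penalty to frames in the temporal proximity of critical events.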
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Seismology and Earthquake Studies
