How do Hyenas deal with Human Speech? Speech Recognition and Translation with ConfHyena

Marco Gaido; Sara Papi; Matteo Negri; Luisa Bentivogli

arXiv:2402.13208 · cs.CL · February 21, 2024 · 1 citation

TL;DR

This paper introduces ConfHyena, a Conformer-based speech model whose encoder self-attention is replaced with an adaptation of Hyena. In speech recognition and translation, it reduces training time by 27% at a cost of about 1% in output quality.

Contribution

The paper presents ConfHyena, a Conformer that uses an adaptation of Hyena in place of encoder self-attention. The goal is to avoid the quadratic cost that attention incurs on the long input sequences typical of speech.

Findings

1. ConfHyena reduces training time by 27%.

2. Quality degrades only minimally (~1%) in speech recognition and translation.

3. Results remain statistically comparable to those of the attention-based Conformer.

Abstract

The attention mechanism, a cornerstone of state-of-the-art neural models, faces computational hurdles in processing long sequences due to its quadratic complexity. Consequently, research efforts in the last few years focused on finding more efficient alternatives. Among them, Hyena (Poli et al., 2023) stands out for achieving competitive results in both language modeling and image classification, while offering sub-quadratic memory and computational complexity. Building on these promising results, we propose ConfHyena, a Conformer whose encoder self-attentions are replaced with an adaptation of Hyena for speech processing, where the long input sequences cause high computational costs. Through experiments in automatic speech recognition (for English) and translation (from English into 8 target languages), we show that our best ConfHyena model significantly reduces the training time by…
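The sub-quadratic cost mentioned in the abstract comes largely from the fact that Hyena (Poli et al., 2023) builds on long convolutions, which can be evaluated with the FFT in O(L log L) time instead of the O(L^2) of a direct sum. The following is a minimal sketch of that one ingredient only, written in plain Python. It is not the ConfHyena implementation: it omits Hyena's gating and implicit filter parametrization as well as the speech-specific adaptations described in the paper, and the function names (`fft`, `causal_conv_direct`, `causal_conv_fft`) and the toy inputs are illustrative assumptions.

```python
import cmath


def fft(x, invert=False):
    """Recursive radix-2 FFT (illustrative); len(x) must be a power of two.

    With invert=True the result is the unnormalized inverse transform,
    so the caller divides by len(x).
    """
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], invert)
    odd = fft(x[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def causal_conv_direct(u, h):
    """O(L^2) reference: y[t] = sum over s <= t of h[s] * u[t - s]."""
    length = len(u)
    return [sum(h[s] * u[t - s] for s in range(t + 1)) for t in range(length)]


def causal_conv_fft(u, h):
    """O(L log L) version of the same causal convolution.

    Both signals are zero-padded to a power of two >= 2L so that the
    circular convolution computed by the FFT equals the linear one on
    the first L outputs.
    """
    length = len(u)
    n = 1
    while n < 2 * length:
        n *= 2
    u_hat = fft([complex(v) for v in u] + [0j] * (n - length))
    h_hat = fft([complex(v) for v in h] + [0j] * (n - length))
    y = fft([a * b for a, b in zip(u_hat, h_hat)], invert=True)
    return [(v / n).real for v in y[:length]]


# Toy input and a filter as long as the input (hypothetical values).
u = [1.0, 2.0, 3.0, 4.0, 5.0]
h = [0.5, 0.25, 0.125, 0.0625, 0.03125]

direct = causal_conv_direct(u, h)
fast = causal_conv_fft(u, h)
max_err = max(abs(a - b) for a, b in zip(direct, fast))
print("max abs difference:", max_err)
```

The two functions agree up to floating-point error, while the FFT path scales far better as the sequence length L grows, which is the property that matters for long speech inputs.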

Code & Models

Repositories

hlt-mt/fbk-fairseq
PyTorch · Official

Taxonomy

Topics: Language and cultural evolution