TS3-Codec: Transformer-Based Simple Streaming Single Codec
Haibin Wu, Naoyuki Kanda, Sefik Emre Eskimez, Jinyu Li

TL;DR
TS3-Codec is a transformer-based, convolution-free neural audio codec that, in the streaming setup, matches or exceeds a state-of-the-art convolution-based codec while using only 12% of the computation and 77% of the bitrate.
Contribution
Introduces TS3-Codec, a purely transformer-based, convolution-free neural audio codec that simplifies architecture while maintaining or improving performance.
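To make "convolution-free" and "streaming" concrete, here is a minimal pure-Python sketch of an encoder built only from linear layers and one self-attention layer. It is not the authors' implementation. The frame size, dimensions, single attention head, random weights, and the causal mask (used here as one way to obtain streaming behaviour) are all assumptions, and every function name is hypothetical. Quantization and decoding are left out.

```python
import math
import random


def linear(x, w):
    """Bias-free linear layer: w is a list of output rows, x is one vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def rand_matrix(rows, cols, rng):
    """Random weights stand in for trained parameters (assumption)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def make_params(frame_size, dim, code_dim, seed=0):
    """Hypothetical parameter set; every size here is illustrative only."""
    rng = random.Random(seed)
    return {
        "w_in": rand_matrix(dim, frame_size, rng),
        "wq": rand_matrix(dim, dim, rng),
        "wk": rand_matrix(dim, dim, rng),
        "wv": rand_matrix(dim, dim, rng),
        "w_out": rand_matrix(code_dim, dim, rng),
    }


def frame(wave, size):
    """Split samples into non-overlapping frames: a reshape, not a convolution."""
    return [wave[i:i + size] for i in range(0, len(wave) - size + 1, size)]


def causal_attention(xs, wq, wk, wv):
    """Single-head self-attention in which frame t attends only to frames <= t."""
    qs = [linear(x, wq) for x in xs]
    ks = [linear(x, wk) for x in xs]
    vs = [linear(x, wv) for x in xs]
    dim = len(xs[0])
    out = []
    for t, q in enumerate(qs):
        # Scores are computed for past and current frames only (causal mask).
        scores = [sum(a * b for a, b in zip(q, ks[s])) / math.sqrt(dim)
                  for s in range(t + 1)]
        top = max(scores)  # subtract the max for a numerically stable softmax
        exps = [math.exp(sc - top) for sc in scores]
        total = sum(exps)
        mixed = [sum(e / total * vs[s][j] for s, e in enumerate(exps))
                 for j in range(dim)]
        out.append([xi + mi for xi, mi in zip(xs[t], mixed)])  # residual add
    return out


def encode(wave, params, frame_size):
    """Linear in, one attention layer, linear out: no convolution anywhere."""
    xs = [linear(f, params["w_in"]) for f in frame(wave, frame_size)]
    hs = causal_attention(xs, params["wq"], params["wk"], params["wv"])
    return [linear(h, params["w_out"]) for h in hs]


params = make_params(frame_size=4, dim=8, code_dim=3)
wave = [math.sin(0.3 * n) for n in range(32)]
latents = encode(wave, params, frame_size=4)
print(len(latents), len(latents[0]))  # 8 frames, 3 values each
```

Because each frame attends only to itself and earlier frames, changing later samples leaves the earlier latent vectors untouched, which is the property a streaming encoder needs.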
Findings
In the streaming setup, achieves comparable or superior performance to a codec with a state-of-the-art convolution-based architecture.
Requires only 12% of the computation of that convolution-based baseline.
Uses 77% of the baseline's bitrate, a reduction of about 23%.
Abstract
Neural audio codecs (NACs) have garnered significant attention as key technologies for audio compression as well as audio representation for speech language models. While mainstream NAC models are predominantly convolution-based, the performance of NACs with a purely transformer-based and convolution-free architecture remains unexplored. This paper introduces TS3-Codec, a Transformer-Based Simple Streaming Single Codec. TS3-Codec consists of only a stack of transformer layers with a few linear layers, offering greater simplicity and expressiveness by fully eliminating convolution layers that require careful hyperparameter tuning and large computations. Under the streaming setup, the proposed TS3-Codec achieves comparable or superior performance compared to the codec with state-of-the-art convolution-based architecture while requiring only 12% of the computation and 77% of bitrate.…
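The two ratios in the abstract are fractions of the baseline's cost, not reductions. A short arithmetic sketch converts them into savings; the helper name `savings` is hypothetical, and the only inputs are the 12% and 77% figures quoted above.

```python
def savings(fraction_used):
    """Return (relative reduction, reduction factor) for a cost ratio."""
    return 1.0 - fraction_used, 1.0 / fraction_used


for name, fraction in [("computation", 0.12), ("bitrate", 0.77)]:
    reduction, factor = savings(fraction)
    print(f"{name}: {fraction:.0%} of baseline = {reduction:.0%} less, "
          f"{factor:.1f}x smaller")
# computation: 12% of baseline = 88% less, 8.3x smaller
# bitrate: 77% of baseline = 23% less, 1.3x smaller
```

So the computation saving is large (roughly 8x), while the bitrate saving is a more modest 23%.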
Taxonomy
Topics: Advanced Data Compression Techniques · Digital Filter Design and Implementation
Methods: Softmax · Attention Is All You Need · Convolution
