Re-Parameterization of Lightweight Transformer for On-Device Speech Emotion Recognition

Zixing Zhang; Zhongren Dong; Weixiang Xu; Jing Han

arXiv:2411.09339 · cs.SD · November 15, 2024

Open Access

TL;DR

This paper introduces Transformer Re-parameterization, a method to enhance lightweight Transformer models for on-device speech emotion recognition, enabling better performance on resource-limited IoT devices.

Contribution

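The paper proposes a re-parameterization technique for lightweight Transformers, consisting of a High-Rank Factorization (HRF) process used during training and a deHigh-Rank Factorization (deHRF) process used at inference, which improves performance without increasing inference complexity.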

Findings

01. Consistent performance improvements across three Transformer variants.

02. Results comparable to those of larger models in speech emotion recognition.

03. Enables deployment of advanced models on resource-constrained IoT devices.

Abstract

As machine learning models are increasingly deployed on edge and Internet-of-Things (IoT) devices, running advanced models on such resource-constrained hardware remains challenging. Transformer models, currently a dominant neural architecture, have achieved great success across many domains, but their complexity hinders their deployment on IoT devices with limited computation capability and storage. Although many model compression approaches have been explored, they often suffer from notable performance degradation. To address this issue, we introduce a new method, Transformer Re-parameterization, to boost the performance of lightweight Transformer models. It consists of two processes: the High-Rank Factorization (HRF) process in the training stage and the deHigh-Rank Factorization (deHRF) process in the inference stage. In the former process, we insert an additional…
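The abstract is cut off before it says what is inserted during HRF, so the following is only a sketch of the general re-parameterization idea, not the authors' implementation. It assumes that the extra component is a linear layer with no activation between it and the layer that follows, in which case the two can be folded into a single layer for inference. All names and sizes here (`merge_linear`, `d_hidden`, and so on) are hypothetical and chosen for illustration.

```python
import random


def matvec(w, x):
    """Multiply matrix w (list of rows) by vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m), both lists of rows."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def add(u, v):
    """Element-wise sum of two vectors."""
    return [p + q for p, q in zip(u, v)]


def merge_linear(w1, b1, w2, b2):
    """Fold y = W2 (W1 x + b1) + b2 into a single layer y = W x + b.

    Valid only when nothing non-linear sits between the two layers
    (an assumption of this sketch).
    """
    w = matmul(w2, w1)            # W = W2 W1
    b = add(matvec(w2, b1), b2)   # b = W2 b1 + b2
    return w, b


rng = random.Random(0)
d_in, d_hidden, d_out = 4, 16, 3  # hypothetical sizes, wider in the middle


def random_matrix(rows, cols):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


def random_vector(n):
    return [rng.uniform(-1, 1) for _ in range(n)]


# Training-time form: two stacked linear layers.
w1, b1 = random_matrix(d_hidden, d_in), random_vector(d_hidden)
w2, b2 = random_matrix(d_out, d_hidden), random_vector(d_out)
x = random_vector(d_in)
y_train = add(matvec(w2, add(matvec(w1, x), b1)), b2)

# Inference-time form: one merged layer of the original size.
w, b = merge_linear(w1, b1, w2, b2)
y_infer = add(matvec(w, x), b)

max_diff = max(abs(p - q) for p, q in zip(y_train, y_infer))
print("merged weight shape:", (len(w), len(w[0])))
print("max output difference:", max_diff)
```

Under these assumptions the merged layer has `d_out * d_in` weights regardless of `d_hidden`, which is why a wider training-time structure need not add inference cost. The two forms agree only up to floating-point rounding.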

Taxonomy

Topics: Speech and Audio Processing

Methods: Attention Is All You Need · Absolute Position Encodings · Label Smoothing · Adam · Residual Connection · Softmax · Linear Layer · Dropout · Layer Normalization · Multi-Head Attention