Processing Natural Language on Embedded Devices: How Well Do Transformer Models Perform?
Souvika Sarkar, Mohammad Fakhruddin Babar, Md Mahadi Hassan, Monowar Hasan, and Shubhra Kanti Karmaker Santu

TL;DR
This study evaluates the performance of BERT-based transformer models on various embedded devices, revealing their feasibility for complex NLP tasks without GPUs and providing insights for deployment in resource-constrained environments.
Contribution
The paper provides an empirical analysis of transformer models' performance on embedded hardware, highlighting resource-accuracy trade-offs and practical deployment insights.
Findings
Complex NLP tasks can be executed on embedded systems without GPUs.
Performance varies significantly across hardware configurations.
BERT-based models are deployable on resource-constrained devices.
Abstract
This paper presents a performance study of transformer language models under different hardware configurations and accuracy requirements and derives empirical observations about these resource/accuracy trade-offs. In particular, we study how the most commonly used BERT-based language models (viz., BERT, RoBERTa, DistilBERT, and TinyBERT) perform on embedded systems. We tested them on four off-the-shelf embedded platforms (Raspberry Pi, Jetson, UP2, and UDOO) with 2 GB and 4 GB memory (i.e., a total of eight hardware configurations) and four datasets (i.e., HuRIC, GoEmotion, CoNLL, WNUT17) running various NLP tasks. Our study finds that executing complex NLP tasks (such as "sentiment" classification) on embedded systems is feasible even without any GPUs (e.g., Raspberry Pi with 2 GB of RAM). Our findings can help designers understand the deployability and performance of transformer…
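The evaluation grid described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' benchmarking code: the platform, memory, and model names come from the abstract, while `hardware_configs` and `mean_latency` are hypothetical helpers introduced here, and the stand-in workload is not a real transformer.

```python
from itertools import product
import time

# Names taken from the abstract.
PLATFORMS = ["Raspberry Pi", "Jetson", "UP2", "UDOO"]
MEMORY_GB = [2, 4]
MODELS = ["BERT", "RoBERTa", "DistilBERT", "TinyBERT"]


def hardware_configs():
    """Return every (platform, memory) pair: 4 platforms x 2 memory sizes."""
    return list(product(PLATFORMS, MEMORY_GB))


def mean_latency(fn, inputs, repeats=3):
    """Hypothetical helper: average wall-clock seconds per call of fn."""
    start = time.perf_counter()
    for _ in range(repeats):
        for x in inputs:
            fn(x)
    elapsed = time.perf_counter() - start
    return elapsed / (repeats * len(inputs))


if __name__ == "__main__":
    configs = hardware_configs()
    print(f"{len(configs)} hardware configurations")
    # Stand-in workload (assumption): a real run would call model inference.
    latency = mean_latency(lambda s: s.lower().split(), ["Bring me the cup"])
    print(f"mean latency per input: {latency:.6f} s")
```

In an actual measurement, `fn` would wrap inference for one of the models in `MODELS`, run on each device in turn; whether every model was run on every configuration is not stated in the abstract.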
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Context-Aware Activity Recognition Systems · Advanced Malware Detection Techniques
Methods: Multi-Head Attention · Attention Is All You Need · RoBERTa · Linear Layer · Adam · Attention Dropout · WordPiece · Dense Connections · Dropout · Weight Decay
