Comparative Efficiency Analysis of Lightweight Transformer Models: A Multi-Domain Empirical Benchmark for Enterprise NLP Deployment
Muhammad Shahmeer Khan

TL;DR
This paper compares three lightweight Transformer models (DistilBERT, MiniLM, and ALBERT) across multiple enterprise NLP tasks, evaluating their accuracy and efficiency trade-offs to guide deployment choices.
Contribution
It provides a comprehensive empirical benchmark of three prominent lightweight Transformer models across multiple domains, highlighting their strengths and trade-offs for enterprise NLP applications.
Findings
ALBERT achieves the highest accuracy on several tasks.
MiniLM offers the fastest inference and highest throughput.
DistilBERT provides consistent accuracy with good efficiency.
Abstract
In the rapidly evolving landscape of enterprise natural language processing (NLP), demand has intensified for efficient, lightweight models that can handle multi-domain text automation tasks. This study compares three prominent lightweight Transformer models (DistilBERT, MiniLM, and ALBERT) across three domains: customer sentiment classification, news topic classification, and toxicity and hate speech detection. Using the IMDB, AG News, and Measuring Hate Speech datasets, we evaluate predictive performance (accuracy, precision, recall, and F1-score) and efficiency (model size, inference time, throughput, and memory usage). No single model dominates every performance dimension. ALBERT achieves the highest task-specific accuracy in multiple domains, MiniLM…
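The two metric families named in the abstract can be illustrated with a minimal sketch. This is not the paper's evaluation code: the function names, the choice of macro averaging, and the single warm-up batch are assumptions made for illustration, and `predict_fn` is a hypothetical stand-in for any model's batch prediction call. Model size and memory usage are not covered here.

```python
import time
from statistics import mean


def classification_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1.

    Macro averaging is an assumption; the source does not say which
    averaging scheme was used.
    """
    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        "precision": mean(precisions),
        "recall": mean(recalls),
        "f1": mean(f1s),
    }


def efficiency_metrics(predict_fn, batches, warmup=1):
    """Mean per-batch latency (seconds) and throughput (samples/second).

    predict_fn is a hypothetical callable that takes one batch.
    """
    # Untimed warm-up calls so one-off setup cost is not measured.
    for batch in batches[:warmup]:
        predict_fn(batch)
    latencies, n_samples = [], 0
    for batch in batches:
        start = time.perf_counter()
        predict_fn(batch)
        latencies.append(time.perf_counter() - start)
        n_samples += len(batch)
    total = sum(latencies)
    return {
        "mean_latency_s": mean(latencies),
        "throughput_sps": n_samples / total if total > 0 else float("inf"),
    }


if __name__ == "__main__":
    print(classification_metrics([0, 0, 1, 1], [0, 1, 1, 1]))
    print(efficiency_metrics(lambda batch: [0] * len(batch), [[1, 2], [3, 4]]))
```

Reporting latency and throughput separately matters because a model with low single-batch latency is not necessarily the one with the highest throughput at larger batch sizes.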
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Topic Modeling · Sentiment Analysis and Opinion Mining
