Cost-effective Deployment of BERT Models in Serverless Environment
Katarína Benešová, Andrej Švec, Marek Šuppa

TL;DR
This paper demonstrates that BERT models can be effectively deployed in serverless environments using knowledge distillation and fine-tuning, achieving acceptable latency and cost-effectiveness for real-world NLP tasks.
Contribution
It introduces a method for deploying domain-tuned BERT models in serverless setups without infrastructure overhead, using knowledge distillation and fine-tuning.
Findings
Achieves acceptable latency for production use
Cost-effective for small-to-medium deployments
No additional infrastructure required
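The cost finding can be made concrete with a small break-even calculation. The sketch below assumes a pay-per-use billing model (a charge per GB-second of compute plus a flat per-request fee), which is an assumption about the platform and is not stated on this page. All function and parameter names are hypothetical, and no real prices are implied.

```python
def serverless_monthly_cost(requests, seconds_per_request, memory_gb,
                            price_per_gb_second, price_per_request):
    """Monthly cost under an assumed pay-per-use billing model."""
    # Compute charge: billed duration times allocated memory.
    compute = requests * seconds_per_request * memory_gb * price_per_gb_second
    # Flat fee charged for every invocation.
    return compute + requests * price_per_request


def break_even_requests(server_monthly_cost, seconds_per_request, memory_gb,
                        price_per_gb_second, price_per_request):
    """Monthly request volume at which an always-on server costs the same."""
    cost_per_request = (seconds_per_request * memory_gb * price_per_gb_second
                        + price_per_request)
    return server_monthly_cost / cost_per_request
```

Below the break-even volume the serverless option is cheaper under these assumptions; above it, a fixed-price server wins. This is one way to read "cost-effective for small-to-medium deployments".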
Abstract
In this study we demonstrate the viability of deploying BERT-style models to serverless environments in a production setting. Since the freely available pre-trained models are too large to be deployed in this way, we utilize knowledge distillation and fine-tune the models on proprietary datasets for two real-world tasks: sentiment analysis and semantic textual similarity. As a result, we obtain models that are tuned for a specific domain and deployable in serverless environments. The subsequent performance analysis shows that this solution results in latency levels acceptable for production use and that it is also a cost-effective approach for small-to-medium size deployments of BERT models, all without any infrastructure overhead.
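The abstract does not spell out the distillation objective. As an illustration only, the sketch below shows the commonly used formulation: a temperature-softened KL term between teacher and student outputs, mixed with the usual cross-entropy on the true label. The `temperature` and `alpha` values and all function names are assumptions, not the authors' settings.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of logits divided by a temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_z = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_z for x in scaled]


def distillation_loss(student_logits, teacher_logits, true_label,
                      temperature=2.0, alpha=0.5):
    """Generic distillation loss for one example (hypothetical defaults)."""
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)
    # Soft-target term: KL(teacher || student) on softened distributions.
    soft = sum(math.exp(lt) * (lt - ls)
               for lt, ls in zip(log_p_teacher, log_p_student))
    # Hard-target term: cross-entropy against the ground-truth label.
    hard = -log_softmax(student_logits)[true_label]
    # The T^2 factor keeps the soft-term gradients on a comparable scale.
    return alpha * temperature ** 2 * soft + (1 - alpha) * hard
```

In practice the smaller student model is trained by minimizing a loss of this kind over batches, with the large pre-trained model serving as the frozen teacher.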
Taxonomy
Topics: Topic Modeling · Network Security and Intrusion Detection · Web Data Mining and Analysis
Methods: Linear Layer · Knowledge Distillation · Residual Connection · Layer Normalization · WordPiece · Adam · Weight Decay · Dropout · Linear Warmup With Linear Decay
