Transformer Architecture for NetsDB

Subodh Kamble; Kunal Sunil Kasodekar

arXiv:2405.04807·cs.CV·May 10, 2024

TL;DR

This paper presents an end-to-end implementation of the Transformer encoder for NetsDB, enabling efficient deployment and inference of transformer models within relational database systems.

Contribution

It introduces a complete Transformer encoder implementation integrated with NetsDB, optimized for distributed processing and model serving.

Findings

1. The implementation achieves competitive inference times.

2. It demonstrates reduced model size compared to other frameworks.

3. The approach facilitates efficient deployment of transformer models in database systems.

Abstract

Transformer models have become the backbone of current state-of-the-art models in the language, vision, and multimodal domains. At their core, these models use multi-head self-attention to selectively aggregate context, generating dynamic contextual embeddings and modeling long-range dependencies. Zhou et al. \cite{zhou2022serving} proposed using relational databases to deploy large-scale deep learning models and created an open-source implementation of that approach called NetsDB. We build on their work by creating an end-to-end implementation of the encoder part of the Transformer for model serving in NetsDB. Specifically, we construct a two-block encoder that includes Multi-Head Attention and its accompanying self-attention mechanism, Layer-Norm, Dropout, FeedForward Layers, and the necessary residual…
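
The components named in the abstract can be sketched as follows. This is a minimal, dependency-free illustration in plain Python, not the NetsDB implementation. The function names (`encoder_block`, `init_block`, `encode`), the toy dimensions, the random weights, the omission of biases and learned Layer-Norm parameters, and the post-norm ordering taken from the original Transformer paper are all assumptions made for illustration. Dropout is left out because it acts as the identity at inference time, which is the model-serving setting discussed here.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def add(a, b):
    """Element-wise sum, used for the residual connections."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def softmax(row):
    """Numerically stable softmax over one row of attention scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def layer_norm(x, eps=1e-5):
    """Normalize each token vector to zero mean and unit variance (no gain or bias)."""
    out = []
    for row in x:
        mean = sum(row) / len(row)
        var = sum((v - mean) ** 2 for v in row) / len(row)
        out.append([(v - mean) / math.sqrt(var + eps) for v in row])
    return out


def multi_head_attention(x, p, num_heads):
    """Scaled dot-product self-attention, computed independently per head."""
    q, k, v = matmul(x, p["wq"]), matmul(x, p["wk"]), matmul(x, p["wv"])
    d_head = len(q[0]) // num_heads
    heads = []
    for h in range(num_heads):
        lo, hi = h * d_head, (h + 1) * d_head
        qh = [r[lo:hi] for r in q]
        kh = [r[lo:hi] for r in k]
        vh = [r[lo:hi] for r in v]
        kh_t = [list(c) for c in zip(*kh)]  # transpose keys for Q K^T
        scores = matmul(qh, kh_t)
        weights = [softmax([s / math.sqrt(d_head) for s in row]) for row in scores]
        heads.append(matmul(weights, vh))
    # Concatenate the per-head outputs for each token, then project.
    concat = [sum((head[i] for head in heads), []) for i in range(len(x))]
    return matmul(concat, p["wo"])


def feed_forward(x, p):
    """Position-wise feed-forward layer: Linear, ReLU, Linear (biases omitted)."""
    hidden = [[max(0.0, v) for v in row] for row in matmul(x, p["w1"])]
    return matmul(hidden, p["w2"])


def encoder_block(x, p, num_heads):
    """One encoder block: attention and feed-forward, each with residual + Layer-Norm."""
    x = layer_norm(add(x, multi_head_attention(x, p, num_heads)))
    return layer_norm(add(x, feed_forward(x, p)))


def rand_matrix(rows, cols, rng):
    scale = 1.0 / math.sqrt(rows)
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def init_block(d_model, d_ff, rng):
    """Random weights standing in for a trained model (hypothetical helper)."""
    p = {name: rand_matrix(d_model, d_model, rng) for name in ("wq", "wk", "wv", "wo")}
    p["w1"] = rand_matrix(d_model, d_ff, rng)
    p["w2"] = rand_matrix(d_ff, d_model, rng)
    return p


def encode(x, blocks, num_heads):
    for p in blocks:
        x = encoder_block(x, p, num_heads)
    return x


rng = random.Random(0)
d_model, d_ff, num_heads, seq_len = 8, 16, 2, 4
blocks = [init_block(d_model, d_ff, rng) for _ in range(2)]  # two-block encoder
tokens = [[rng.gauss(0.0, 1.0) for _ in range(d_model)] for _ in range(seq_len)]
output = encode(tokens, blocks, num_heads)
print(len(output), len(output[0]))  # 4 8
```

Every step above reduces to matrix multiplications plus row-wise operations, which is what makes the encoder a plausible candidate for a system that stores and processes tensors inside a database.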

Code & Models

Repositories

kn0wthing/netsdb (Official)

Taxonomy

Topics: Advanced Optical Network Technologies · Cloud Computing and Resource Management · Software-Defined Networks and 5G

Methods: Attention Is All You Need · Softmax · Linear Layer · Multi-Head Attention · Dropout