TranSQL+: Serving Large Language Models with SQL on Low-Resource Hardware

Wenbo Sun; Qiming Guo; Wenlu Wang; Rihan Hai

arXiv:2502.02818·cs.DB·September 23, 2025

Open Access

TL;DR

This paper presents TranSQL+, a method to run large language models efficiently on low-resource hardware by translating their computation graphs into SQL queries, enabling faster inference without external libraries.

Contribution

Introducing TranSQL+, a novel template-based code generator that converts LLM graphs into SQL for resource-efficient inference on relational databases.

Findings

01 Achieves up to 20x lower prefill latency than DeepSpeed Inference and Llama.cpp

02 Attains up to 4x higher decoding speed than the same baselines

03 Effective in low-memory, CPU-only configurations

Abstract

Deploying Large Language Models (LLMs) on resource-constrained devices remains challenging due to limited memory, lack of GPUs, and the complexity of existing runtimes. In this paper, we introduce TranSQL+, a template-based code generator that translates LLM computation graphs into pure SQL queries for execution in relational databases. Without relying on external libraries, TranSQL+ leverages mature database features, such as vectorized execution and out-of-core processing, for efficient inference. We further propose a row-to-column (ROW2COL) optimization that improves join efficiency in matrix operations. Evaluated on Llama3-8B and DeepSeekMoE models, TranSQL+ achieves up to 20x lower prefill latency and 4x higher decoding speed compared to DeepSpeed Inference and Llama.cpp in low-memory and CPU-only configurations. Our results highlight relational databases as a practical…
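
To make the idea of expressing matrix operations as joins concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. It is not the TranSQL+ generator and does not implement ROW2COL. The (i, j, val) table layout, the template text, and the helper names load_matrix and matmul_sql are all assumptions made for illustration. The sketch fills a SQL template with table names, then computes a matrix product as a join on the shared dimension followed by a grouped sum.

```python
import sqlite3

# Hypothetical template (an assumption, not taken from the paper):
# C = A @ B, with each matrix stored as one row per entry (i, j, val).
MATMUL_TEMPLATE = """
SELECT a.i AS i, b.j AS j, SUM(a.val * b.val) AS val
FROM {left} AS a
JOIN {right} AS b ON a.j = b.i
GROUP BY a.i, b.j
ORDER BY a.i, b.j
"""


def load_matrix(conn, name, rows):
    """Store a dense list-of-lists matrix as (i, j, val) rows in table `name`."""
    conn.execute(f"CREATE TABLE {name} (i INTEGER, j INTEGER, val REAL)")
    conn.executemany(
        f"INSERT INTO {name} VALUES (?, ?, ?)",
        [(i, j, float(v)) for i, row in enumerate(rows) for j, v in enumerate(row)],
    )


def matmul_sql(conn, left, right, n_rows, n_cols):
    """Fill the template with table names, run it, and return a dense result."""
    result = [[0.0] * n_cols for _ in range(n_rows)]
    query = MATMUL_TEMPLATE.format(left=left, right=right)
    for i, j, val in conn.execute(query):
        result[i][j] = val
    return result


conn = sqlite3.connect(":memory:")
load_matrix(conn, "x", [[1, 2], [3, 4]])  # illustrative activations
load_matrix(conn, "w", [[5, 6], [7, 8]])  # illustrative weights
product = matmul_sql(conn, "x", "w", 2, 2)
print(product)  # [[19.0, 22.0], [43.0, 50.0]]
```

SQLite is used here only because it ships with Python. The abstract's references to vectorized execution and out-of-core processing concern database features that this toy example does not exercise.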

Taxonomy

Topics: Mathematics, Computing, and Information Processing · Library Science and Information Systems · Digital Rights Management and Security

Methods: Softmax · Attention Is All You Need