Large Language Model-Enhanced Relational Operators: Taxonomy, Benchmark, and Analysis

Yunxiang Su; Tianjing Zeng; Zhongjun Ding; Yin Lin; Rong Zhu; Zhewei Wei; Bolin Ding; Jingren Zhou

arXiv:2603.02537·cs.DB·April 29, 2026

TL;DR

This paper introduces a unified taxonomy and a comprehensive benchmark for Large Language Model-Enhanced Relational Operators (LROs), analyzing their design, implementation, and performance across diverse datasets.

Contribution

The paper establishes a unified taxonomy for LROs, builds the LROBench benchmark suite, and reports empirical insights and best practices for designing effective LRO systems.

Findings

01

LROs can be categorized into Select, Match, Impute, Cluster, and Order.

02

LROBench includes 290 single-LRO and 60 multi-LRO queries across 27 databases.

03

Empirical evaluation reveals key design choices and performance trade-offs.

Abstract

With the development of large language models (LLMs), numerous studies have integrated LLMs through operator-like components to enhance relational data processing tasks, e.g., filters with semantic predicates, knowledge-augmented table imputation, reasoning-driven entity matching, and more challenging semantic query processing. These components invoke LLMs while preserving a relational input/output interface; we refer to them as LLM-Enhanced Relational Operators (LROs). From an operator perspective, however, existing LROs suffer from fragmented definitions, varied implementation strategies, and inadequate evaluation benchmarks. To this end, we first establish a unified LRO taxonomy that aligns existing LROs and categorizes them into Select, Match, Impute, Cluster, and Order, along with their operands and implementation variants. Second, we design LROBench, a…
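To make the idea of "invoking an LLM while preserving a relational input/output interface" concrete, here is a minimal sketch of a Select-style operator. It is an illustration only, not the paper's or LROBench's implementation: the names `llm_select` and `keyword_judge` are hypothetical, and the judge is a keyword stub standing in for a real LLM call.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]  # one tuple of a relation, keyed by column name


def llm_select(
    rows: List[Row],
    predicate: str,
    judge: Callable[[str, Row], bool],
) -> List[Row]:
    """Relation in, relation out: keep the rows that the judge says
    satisfy the natural-language predicate. The schema is unchanged."""
    return [row for row in rows if judge(predicate, row)]


def keyword_judge(predicate: str, row: Row) -> bool:
    # Hypothetical stand-in for an LLM call. A real operator would send
    # the predicate and the serialized row to a model and parse a yes/no
    # answer; this stub ignores the predicate and matches keywords.
    text = " ".join(row.values()).lower()
    return any(word in text for word in ("excellent", "great"))


reviews = [
    {"id": "1", "text": "Excellent battery life"},
    {"id": "2", "text": "Stopped working after a week"},
    {"id": "3", "text": "Great value for the price"},
]

positive = llm_select(reviews, "the review is positive", keyword_judge)
print([row["id"] for row in positive])
```

Because the judge is passed in as a function, the same operator shell could wrap different implementation variants (per-row prompts, batched prompts, cheaper proxy models) without changing its relational interface.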

Code & Models

Repositories

LROBench/LROBench (GitHub)
