OmniTQA: A Cost-Aware System for Hybrid Query Processing over Semi-Structured Data
Nima Shahbazi, Seiji Maekawa, Nikita Bhutani, Estevam Hruschka

TL;DR
OmniTQA is a cost-aware hybrid query system that efficiently combines LLM-based semantic reasoning with traditional database operations for semi-structured data.
Contribution
It introduces a novel framework integrating semantic reasoning as a first-class operator with data-aware planning and dynamic task routing.
Findings
Outperforms existing methods in accuracy and cost efficiency.
Achieves significant improvements on complex, large, and multi-relation queries.
Demonstrates scalability and effectiveness across diverse benchmarks.
Abstract
While recent advances in large language models have significantly improved Text-to-SQL and table question answering systems, most existing approaches assume that all query-relevant information is explicitly represented in structured schemas. In practice, many enterprise databases contain hybrid schemas where structured attributes coexist with free-form textual fields, requiring systems to reason over both types of information. To address this challenge, we introduce OmniTQA, a cost-aware hybrid query processing framework that operates over both structured and semi-structured data. OmniTQA treats semantic reasoning as a first-class query operator, seamlessly integrating LLM-based semantic operations with classical relational operators into an executable directed acyclic graph. To manage the high latency and cost of LLM inference, it extends classical query optimization with data-aware…
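To illustrate what treating semantic reasoning as an operator alongside relational ones can mean in practice, here is a minimal sketch. It is not the paper's planner or API. The names `Operator`, `run_plan`, `order_by_cost`, and `fake_llm_mentions_refund`, the per-row cost figures, and the sample rows are all hypothetical, and the plan is a linear chain (the simplest case of a DAG). The LLM call is replaced by a keyword check so the example runs offline. It shows only one idea: running the cheap relational filter first reduces how many rows reach the expensive semantic operator.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]


@dataclass
class Operator:
    """A filter operator with a hypothetical per-row cost (arbitrary units)."""
    name: str
    predicate: Callable[[Row], bool]
    cost_per_row: float


def fake_llm_mentions_refund(text: str) -> bool:
    # Stand-in for an LLM judgment over free-form text (assumption: a real
    # system would call a model here; a keyword check keeps this runnable).
    return "refund" in text.lower()


def run_plan(rows: List[Row], plan: List[Operator]) -> Tuple[List[Row], float]:
    """Apply operators in order; cost = sum of cost_per_row * rows each one sees."""
    total_cost = 0.0
    for op in plan:
        total_cost += op.cost_per_row * len(rows)
        rows = [r for r in rows if op.predicate(r)]
    return rows, total_cost


def order_by_cost(plan: List[Operator]) -> List[Operator]:
    # Simplistic heuristic: cheapest operator first. A real optimizer would
    # also weigh selectivity; this is only an illustration.
    return sorted(plan, key=lambda op: op.cost_per_row)


# Hybrid rows: a structured attribute (region) plus a free-text field (note).
rows: List[Row] = [
    {"id": 1, "region": "EU", "note": "Customer asked for a refund"},
    {"id": 2, "region": "US", "note": "Refund requested twice"},
    {"id": 3, "region": "EU", "note": "Happy with delivery"},
    {"id": 4, "region": "US", "note": "No issues"},
]

semantic_op = Operator(
    "semantic_filter", lambda r: fake_llm_mentions_refund(str(r["note"])), 1.0
)
relational_op = Operator("region_filter", lambda r: r["region"] == "EU", 0.001)

naive_plan = [semantic_op, relational_op]   # semantic operator sees every row
cheap_plan = order_by_cost(naive_plan)      # relational filter runs first

naive_rows, naive_cost = run_plan(rows, naive_plan)
cheap_rows, cheap_cost = run_plan(rows, cheap_plan)

print([r["id"] for r in cheap_rows], naive_cost, cheap_cost)
```

Both plans return the same rows, but the reordered plan sends only the two EU rows to the semantic operator instead of all four, so its total cost under these assumed unit costs is lower.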
