Optimizing Context-Enhanced Relational Joins
Viktor Sanca, Manos Chatzakis, Anastasia Ailamaki

TL;DR
This paper introduces a novel embedding operator for relational databases that integrates context-rich vector data processing with traditional relational joins, enabling optimized hybrid data analysis.
Contribution
It proposes a new context-enhanced relational join operator compatible with relational algebra, facilitating efficient hybrid processing of relational and vector data.
Findings
Order-of-magnitude improvement in execution time
Successful integration of vector embeddings with relational joins
Demonstrated optimization benefits through example with string embeddings
Abstract
Collecting data, extracting value, and combining insights from relational and context-rich multi-modal sources in data processing pipelines presents a challenge for traditional relational DBMS. While relational operators allow declarative and optimizable query specification, they are limited to data transformations unsuitable for capturing or analyzing context. On the other hand, representation learning models can map context-rich data into embeddings, allowing machine-automated context processing but requiring imperative data transformation integration with the analytical query. To bridge this dichotomy, we present a context-enhanced relational join and introduce an embedding operator composable with relational operators. This enables hybrid relational and context-rich vector data processing, with algebraic equivalences compatible with relational algebra and corresponding logical and…
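To make the idea in the abstract concrete, here is a minimal sketch of an embedding step composed with a similarity join over two small tables. It is an illustration under stated assumptions, not the authors' operator: the names `embed`, `cosine`, and `context_join`, the character-trigram "embedding" (a stand-in for a learned representation model), the 0.6 threshold, and the sample rows are all hypothetical. The last lines check one equivalence that also holds in plain relational algebra: a selection on a relational column can be pushed below the join without changing the result, which reduces the number of rows that must be embedded and compared.

```python
import math
from collections import Counter


def embed(text, n=3):
    """Toy embedding: bag of character n-grams.

    Hypothetical stand-in for a learned model; not the paper's method.
    """
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def context_join(left, right, left_col, right_col, threshold):
    """Join rows whose embedded columns are similar enough.

    Each side is embedded once up front, so the model cost is paid
    per row rather than per compared pair.
    """
    left_vecs = [(row, embed(row[left_col])) for row in left]
    right_vecs = [(row, embed(row[right_col])) for row in right]
    return [
        {**l_row, **r_row}
        for l_row, l_vec in left_vecs
        for r_row, r_vec in right_vecs
        if cosine(l_vec, r_vec) >= threshold
    ]


# Hypothetical sample data.
products = [
    {"pid": 1, "name": "wireless mouse", "price": 25},
    {"pid": 2, "name": "mechanical keyboard", "price": 90},
]
reviews = [
    {"rid": 10, "text": "wireless mouse"},
    {"rid": 11, "text": "mechanical keybord"},  # misspelled on purpose
    {"rid": 12, "text": "garden hose"},
]

joined = context_join(products, reviews, "name", "text", 0.6)

# Plan A: filter on the relational column after the join.
late = [row for row in joined if row["price"] > 50]

# Plan B: push the same filter below the join, so fewer rows are embedded.
expensive = [p for p in products if p["price"] > 50]
early = context_join(expensive, reviews, "name", "text", 0.6)

print(late == early)
```

Both plans return the same rows, but Plan B embeds and compares only the products that survive the price filter. This is the kind of rewrite a query optimizer could apply once the embedding step is expressed as an operator rather than as imperative glue code.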
Taxonomy
Topics: Data Quality and Management · Semantic Web and Ontologies · Advanced Database Systems and Queries
