HyST: LLM-Powered Hybrid Retrieval over Semi-Structured Tabular Data
Jiyoon Myung, Jihyeon Park, Joohyung Han

TL;DR
HyST is a hybrid retrieval framework that combines LLM-driven structured filtering with semantic search to improve retrieval accuracy over semi-structured tabular data in recommendation systems.
Contribution
HyST introduces a novel hybrid retrieval approach that integrates LLM-extracted attribute constraints with embedding search, enhancing retrieval precision for complex user queries.
Findings
HyST outperforms traditional baselines on a semi-structured benchmark.
Structured filtering significantly improves retrieval precision.
The framework is scalable and effective for real-world queries.
Abstract
User queries in real-world recommendation systems often combine structured constraints (e.g., category, attributes) with unstructured preferences (e.g., product descriptions or reviews). We introduce HyST (Hybrid retrieval over Semi-structured Tabular data), a hybrid retrieval framework that combines LLM-powered structured filtering with semantic embedding search to support complex information needs over semi-structured tabular data. HyST extracts attribute-level constraints from natural language using large language models (LLMs) and applies them as metadata filters, while processing the remaining unstructured query components via embedding-based retrieval. Experiments on a semi-structured benchmark show that HyST consistently outperforms traditional baselines, highlighting the importance of structured filtering in improving retrieval precision, offering a scalable and accurate solution…
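The filter-then-rank flow described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy table, the keyword-based `extract_constraints` (a hypothetical stand-in for the LLM extraction step), and the bag-of-words `embed` function (a stand-in for a real embedding model) are all assumptions made so the example runs without external services.

```python
import math
from collections import Counter

# Toy semi-structured table: structured columns plus one free-text column.
# All rows and values are illustrative, not taken from the paper.
ROWS = [
    {"id": 1, "category": "laptop", "brand": "acme",
     "description": "lightweight with long battery life"},
    {"id": 2, "category": "laptop", "brand": "zenix",
     "description": "heavy gaming machine with loud fans"},
    {"id": 3, "category": "phone", "brand": "acme",
     "description": "lightweight phone with long battery life"},
]

def extract_constraints(query):
    """Hypothetical stand-in for the LLM step: split a query into
    attribute constraints and the remaining free text.
    A keyword lookup is used here only so the sketch is self-contained."""
    constraints, remaining = {}, []
    for token in query.lower().split():
        if token in {"laptop", "phone"}:
            constraints["category"] = token
        elif token in {"acme", "zenix"}:
            constraints["brand"] = token
        else:
            remaining.append(token)
    return constraints, " ".join(remaining)

def embed(text):
    """Assumed toy embedding: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def hybrid_search(query, rows, top_k=2):
    # 1) Extract structured constraints and the unstructured remainder.
    constraints, free_text = extract_constraints(query)
    # 2) Apply the constraints as an exact-match metadata filter.
    candidates = [r for r in rows
                  if all(r.get(k) == v for k, v in constraints.items())]
    # 3) Rank the surviving rows by similarity to the free-text remainder.
    q = embed(free_text)
    ranked = sorted(candidates,
                    key=lambda r: cosine(q, embed(r["description"])),
                    reverse=True)
    return ranked[:top_k]

strict = hybrid_search("acme laptop with long battery life", ROWS)
loose = hybrid_search("laptop long battery life", ROWS)
```

In this sketch, row 3 matches the free-text preference closely but is never ranked for either query, because the `category` filter removes it before similarity is computed. That is the intuition behind the abstract's claim that structured filtering improves precision.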
