HELIOS: Harmonizing Early Fusion, Late Fusion, and LLM Reasoning for Multi-Granular Table-Text Retrieval
Sungho Park, Joohyung Yun, Jongwuk Lee, Wook-Shin Han

TL;DR
HELIOS is a multi-granular retrieval framework that combines edge-based subgraph retrieval, node expansion, and LLM refinement to improve table-text retrieval accuracy and reasoning in open-domain QA.
Contribution
The paper proposes HELIOS, a hybrid approach that integrates early and late fusion techniques with LLM reasoning, addressing limitations of existing methods in multi-hop and complex reasoning tasks.
Findings
Improves recall by up to 42.6% over state-of-the-art models on OTT-QA.
Improves nDCG by up to 39.9% over state-of-the-art models on OTT-QA.
Supports advanced reasoning such as multi-hop reasoning and column-wise aggregation.
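For readers unfamiliar with the two metrics named in the findings, the following is a minimal sketch of recall@k and nDCG@k under binary relevance. The function names and the binary-relevance assumption are illustrative; this summary does not state the paper's exact evaluation protocol, cutoffs, or relevance grading.

```python
import math


def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant items that appear in the top-k results."""
    if not relevant_ids:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)


def ndcg_at_k(ranked_ids, relevant_ids, k):
    """Binary-relevance nDCG: DCG of the ranking divided by the ideal DCG."""
    # Each relevant item at 0-based position `rank` contributes 1 / log2(rank + 2).
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, doc_id in enumerate(ranked_ids[:k])
        if doc_id in relevant_ids
    )
    # The ideal ranking places all relevant items first.
    ideal_hits = min(k, len(relevant_ids))
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0
```

Recall only checks whether relevant items appear in the top k, while nDCG also rewards placing them near the top of the ranking, which is why the two are usually reported together.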
Abstract
Table-text retrieval aims to retrieve relevant tables and text to support open-domain question answering. Existing studies use either early or late fusion, but face limitations. Early fusion pre-aligns a table row with its associated passages, forming "stars," which often include irrelevant contexts and miss query-dependent relationships. Late fusion retrieves individual nodes, dynamically aligning them, but it risks missing relevant contexts. Both approaches also struggle with advanced reasoning tasks, such as column-wise aggregation and multi-hop reasoning. To address these issues, we propose HELIOS, which combines the strengths of both approaches. First, the edge-based bipartite subgraph retrieval identifies finer-grained edges between table segments and passages, effectively avoiding the inclusion of irrelevant contexts. Then, the query-relevant node expansion identifies the most…
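The idea of retrieving edges rather than whole stars can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the `Edge` class, the token-overlap scorer, and the sample data are hypothetical and are not the paper's retriever, which this summary does not specify. The sketch shows only the granularity difference: each segment-passage link is scored on its own, so a passage attached to a relevant row is not pulled in unless its edge ranks well.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """One link in a bipartite graph between a table segment and a passage."""
    segment_id: str   # hypothetical id of a table segment (e.g. one row)
    passage_id: str   # hypothetical id of a passage linked to that segment
    text: str         # segment text concatenated with passage text


def overlap_score(query, text):
    """Toy relevance score (an assumption): share of query tokens found in text."""
    query_tokens = set(query.lower().split())
    text_tokens = set(text.lower().split())
    if not query_tokens:
        return 0.0
    return len(query_tokens & text_tokens) / len(query_tokens)


def retrieve_edges(query, edges, k):
    """Score every edge independently and keep the k highest-scoring ones."""
    ranked = sorted(edges, key=lambda e: overlap_score(query, e.text), reverse=True)
    return ranked[:k]


def to_bipartite_subgraph(edges):
    """Collect the segment nodes and passage nodes touched by the given edges."""
    segments = {e.segment_id for e in edges}
    passages = {e.passage_id for e in edges}
    return segments, passages


# Hypothetical data: row1 links to two passages, only one of which helps the query.
EDGES = [
    Edge("row1", "p_alice", "player alice team lions born in oslo"),
    Edge("row1", "p_lions", "player alice team lions founded 1950 stadium north"),
    Edge("row2", "p_bob", "player bob team tigers born in rome"),
]

top_edges = retrieve_edges("where was alice born", EDGES, k=1)
subgraph = to_bipartite_subgraph(top_edges)
```

A star-based (early fusion) index would return `row1` together with both of its passages, while the edge-level ranking above can keep `p_alice` and leave out `p_lions`. A real system would replace `overlap_score` with a learned retriever.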
Taxonomy
Topics: Topic Modeling · Information Retrieval and Search Behavior · Advanced Graph Neural Networks
