Autofocus Retrieval: An Effective Pipeline for Multi-Hop Question Answering With Semi-Structured Knowledge

Derian Boer; Stephen Roth; Stefan Kramer

arXiv:2505.09246·cs.IR·May 15, 2026

TL;DR

Autofocus-Retriever is a modular framework that enhances multi-hop question answering by integrating structured and unstructured data retrieval, achieving state-of-the-art results across multiple benchmarks.

Contribution

It introduces a novel, hybrid retrieval pipeline leveraging large language models and incremental scope expansion for improved multi-hop QA performance.

Findings

01

Achieves the best zero- and one-shot results across all three STaRK QA benchmarks.

02

Average first-hit rate surpasses that of the second-best method by 32.1%.

03

Component analysis confirms the effectiveness of hybrid retrieval and LLM reranking.

Abstract

In many real-world settings, machine learning models and interactive systems have access to both structured knowledge, e.g., knowledge graphs or tables, and unstructured content, e.g., natural language documents. Yet most rely on only one of the two. Semi-Structured Knowledge Bases (SKBs) bridge this gap by linking unstructured content to nodes within structured data. In this work, we present Autofocus-Retriever (AF-Retriever), a modular framework for SKB-based, multi-hop question answering. It combines structural and textual retrieval through novel integration steps and optimizations, achieving the best zero- and one-shot results across all three STaRK QA benchmarks, which span diverse domains and evaluation metrics. AF-Retriever's average first-hit rate surpasses the second-best method by 32.1%. Its performance is driven by (1) leveraging exchangeable large language models (LLMs) to extract…
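
To make the SKB idea concrete, the following is a minimal Python sketch of a knowledge base whose graph nodes each carry a piece of free text, together with a toy lookup that first filters structurally (neighbors of an anchor node with a given type) and then ranks textually (keyword overlap). It is not the AF-Retriever pipeline: the names `Node`, `build_toy_skb`, and `hybrid_lookup`, the example entities, and the keyword-count scoring are all assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One entity in the structured part of a toy SKB, with free text attached."""
    node_id: str
    node_type: str
    text: str  # unstructured content linked to this node
    neighbors: set[str] = field(default_factory=set)  # ids of adjacent nodes


def build_toy_skb() -> dict[str, Node]:
    """Build a tiny, made-up SKB; every entity here is hypothetical."""
    skb = {
        "p1": Node("p1", "paper", "A study of graph retrieval for question answering."),
        "p2": Node("p2", "paper", "Protein folding with deep learning."),
        "a1": Node("a1", "author", "Researcher working on information retrieval."),
    }
    # Undirected author-paper edges form the structured part.
    for src, dst in [("a1", "p1"), ("a1", "p2")]:
        skb[src].neighbors.add(dst)
        skb[dst].neighbors.add(src)
    return skb


def hybrid_lookup(skb: dict[str, Node], anchor_id: str,
                  target_type: str, keywords: list[str]) -> list[str]:
    """Structural step: keep neighbors of the anchor that have the requested type.
    Textual step: rank them by how many keywords occur in their attached text."""
    candidates = [skb[n] for n in skb[anchor_id].neighbors
                  if skb[n].node_type == target_type]
    scored = [(sum(kw.lower() in c.text.lower() for kw in keywords), c.node_id)
              for c in candidates]
    # Highest score first, ties broken by node id; drop candidates with no match.
    return [nid for score, nid in sorted(scored, key=lambda s: (-s[0], s[1]))
            if score > 0]
```

A question such as "which paper by author a1 is about retrieval?" would map to `hybrid_lookup(build_toy_skb(), "a1", "paper", ["retrieval"])`. The graph edge handles the "by author a1" hop, while the attached text handles the topical constraint, which is the kind of query neither a purely structural nor a purely textual index answers on its own.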

Code & Models

Repositories

kramerlab/AF-Retriever (GitHub)