Regime-Conditional Retrieval: Theory and a Transferable Router for Two-Hop QA
Andre Bacellar

TL;DR
This paper formalizes the two regimes of two-hop QA retrieval (Q-dominant and B-dominant), supports the split with three theorems, and proposes RegimeRouter, a lightweight binary router that improves retrieval performance across multiple datasets.
Contribution
It provides a formal theory for regime-conditional retrieval in two-hop QA and develops a transferable router that enhances retrieval accuracy.
Findings
RegimeRouter improves R@5 by up to 5.6 percentage points on benchmark datasets.
Two surface-text predicates characterize the regime: P1 is decisive for routing, and P2 qualifies the B-dominant case.
The bridge advantage depends on the relation-bearing sentence, not the entity name alone: removing it causes an 8.6-14.1 pp performance drop (p < 0.001).
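For readers unfamiliar with the metric, here is a minimal sketch of how R@5 and a percentage-point gain can be computed. It assumes R@k is the fraction of gold passages found in the top k results, averaged over queries; the paper may use a different variant (for example, requiring all gold passages). The function names and the toy data are hypothetical and are not results from the paper.

```python
from typing import Sequence


def recall_at_k(ranked_ids: Sequence[str], gold_ids: Sequence[str], k: int = 5) -> float:
    """Fraction of gold passages that appear in the top-k ranked list (assumed definition)."""
    gold = set(gold_ids)
    if not gold:
        raise ValueError("gold_ids must not be empty")
    top_k = set(ranked_ids[:k])
    return len(gold & top_k) / len(gold)


def mean_recall_at_k(runs: Sequence[tuple], k: int = 5) -> float:
    """Average recall@k over (ranked_ids, gold_ids) pairs, one pair per query."""
    if not runs:
        raise ValueError("runs must not be empty")
    return sum(recall_at_k(ranked, gold, k) for ranked, gold in runs) / len(runs)


def gain_in_pp(baseline: float, improved: float) -> float:
    """Difference between two recall values, expressed in percentage points."""
    return (improved - baseline) * 100.0


# Toy data only: two queries with two gold passages each.
baseline_runs = [
    (["a", "b", "c", "d", "e", "f"], ["b", "f"]),  # one of two gold passages in top 5
    (["g", "h", "i", "j", "k", "m"], ["g", "m"]),  # one of two gold passages in top 5
]
routed_runs = [
    (["a", "b", "c", "d", "e", "f"], ["b", "f"]),  # unchanged
    (["g", "m", "h", "i", "j", "k"], ["g", "m"]),  # both gold passages now in top 5
]
```

With this toy data, `gain_in_pp(mean_recall_at_k(baseline_runs), mean_recall_at_k(routed_runs))` gives the improvement in percentage points, which is the unit used for the reported gain.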
Abstract
Two-hop QA retrieval splits queries into two regimes determined by whether the hop-2 entity is explicitly named in the question (Q-dominant) or only in the bridge passage (B-dominant). We formalize this split with three theorems: (T1) per-query AUC is a monotone function of the cosine separation margin, with R^2 >= 0.90 for six of eight type-encoder pairs; (T2) regime is characterized by two surface-text predicates, with P1 decisive for routing and P2 qualifying the B-dominant case, holding across three encoders and three datasets; and (T3) bridge advantage requires the relation-bearing sentence, not entity name alone, with removal causing an 8.6-14.1 pp performance drop (p < 0.001). Building on this theory, we propose RegimeRouter, a lightweight binary router that selects between question-only and question-plus-relation-sentence retrieval using five text features derived directly from…
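The routing decision described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract is truncated before it names the five text features, so the classifier is passed in as a callable, and `toy_is_b_dominant` is a hypothetical one-rule stand-in that only loosely mirrors the idea of checking whether an entity is named in the question.

```python
from typing import Callable

# A regime classifier takes (question, relation_sentence) and returns True for B-dominant.
RegimeClassifier = Callable[[str, str], bool]


def build_query(question: str, relation_sentence: str, is_b_dominant: RegimeClassifier) -> str:
    """Pick the retrieval query: question only, or question plus relation sentence."""
    if is_b_dominant(question, relation_sentence):
        return f"{question} {relation_sentence}"
    return question


def toy_is_b_dominant(question: str, relation_sentence: str) -> bool:
    """Hypothetical stand-in rule, NOT the paper's five features.

    Returns True when the relation sentence contains a capitalized token
    (skipping the sentence-initial word) that the question does not mention.
    """
    question_tokens = {tok.strip(".,?!").lower() for tok in question.split()}
    for tok in relation_sentence.split()[1:]:
        clean = tok.strip(".,?!")
        if clean[:1].isupper() and clean.lower() not in question_tokens:
            return True
    return False


# Example inputs made up for illustration.
SENTENCE = "Inception was directed by Christopher Nolan."
B_QUESTION = "Where was the director of Inception born?"
Q_QUESTION = "Was Christopher Nolan the director of Inception born in London?"
```

Here `build_query(B_QUESTION, SENTENCE, toy_is_b_dominant)` appends the relation sentence because "Christopher" is absent from the question, while `build_query(Q_QUESTION, SENTENCE, toy_is_b_dominant)` returns the question unchanged. Passing the classifier as a parameter keeps the routing step separate from whatever features a trained router uses.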
