Semantic Bridge: Universal Multi-Hop Question Generation via AMR-Driven Graph Synthesis
Linqing Chen, Hanmeng Zhong, Wentao Wu, Weilei Wang

TL;DR
Semantic Bridge introduces a universal framework leveraging AMR-driven graph synthesis to controllably generate complex multi-hop reasoning questions from diverse sources, significantly enhancing training data for large language models.
Contribution
It presents the first universal, controllable multi-hop question generation framework using semantic graph weaving and AMR analysis, applicable across domains and languages.
Findings
Achieves up to 9.5% better round-trip quality in question generation.
Outperforms baselines with 18.3%-25.4% gains across datasets and languages.
Generated questions outperform human annotations with fewer resources.
Abstract
Large language model (LLM) training faces a critical bottleneck: the scarcity of high-quality, reasoning-intensive question-answer pairs, especially from sparse, domain-specific sources like PubMed papers or legal documents. Existing methods rely on surface patterns, fundamentally failing to generate controllable, complex multi-hop reasoning questions that test genuine understanding, which is essential for advancing LLM training paradigms. We present Semantic Bridge, the first universal framework for controllably generating sophisticated multi-hop reasoning questions from arbitrary sources. Our breakthrough innovation is semantic graph weaving: three complementary bridging mechanisms (entity bridging for role-varying shared entities, predicate chain bridging for temporal/causal/logical sequences, and causal bridging for explicit reasoning chains) that systematically construct…
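To make the idea of entity bridging concrete, here is a minimal Python sketch. It is not the authors' implementation: it assumes the AMR parse has already been reduced to simple subject/predicate/object triples, and the `Triple` type, the `entity_bridges` function, and the placeholder entity names are all hypothetical. It only illustrates what "role-varying shared entities" could look like: the same entity appearing as an object in one sentence and as a subject in another, which gives a two-hop link a question could be built on.

```python
from collections import defaultdict
from itertools import combinations
from typing import NamedTuple


class Triple(NamedTuple):
    """Hypothetical stand-in for one relation extracted from an AMR graph."""
    sentence_id: int
    subject: str
    predicate: str
    obj: str


def entity_bridges(triples):
    """Find entities that occur in different roles in different sentences.

    Returns a list of (entity, first_triple, second_triple) tuples.
    This is an illustrative assumption, not the paper's algorithm.
    """
    occurrences = defaultdict(list)  # entity -> [(role, triple), ...]
    for t in triples:
        occurrences[t.subject].append(("subject", t))
        occurrences[t.obj].append(("object", t))

    bridges = []
    for entity, seen in occurrences.items():
        for (role_a, ta), (role_b, tb) in combinations(seen, 2):
            # A bridge needs a role change across two distinct sentences.
            if role_a != role_b and ta.sentence_id != tb.sentence_id:
                bridges.append((entity, ta, tb))
    return bridges


# Toy, hand-written triples with placeholder names (not real data).
triples = [
    Triple(0, "drug_A", "inhibits", "enzyme_B"),
    Triple(1, "enzyme_B", "produces", "compound_C"),
    Triple(2, "drug_A", "treats", "condition_D"),
]
bridges = entity_bridges(triples)
for entity, first, second in bridges:
    print(f"{first.subject} -{first.predicate}-> {entity} "
          f"-{second.predicate}-> {second.obj}")
```

In this toy input only `enzyme_B` changes role across sentences, so it is the single bridge found; `drug_A` is a subject both times and is skipped. A question generator could then ask about the endpoint of the chain without naming the bridge entity.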
