# KCS: Diversify Multi-hop Question Generation with Knowledge Composition Sampling

**Authors:** Yangfan Wang, Jie Liu, Chen Tang, Lian Yan, Jingchi Jiang

arXiv: 2508.20567 · 2026-04-14

## TL;DR

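KCS (Knowledge Composition Sampling) is a framework that diversifies multi-hop question generation by sampling varied knowledge compositions from a given context. It improves knowledge composition selection accuracy by 3.9% over competitive baselines, and using it for data augmentation improves results on HotpotQA and 2WikiMultihopQA.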

## Contribution

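KCS models knowledge composition selection as a sentence-level conditional prediction task, trains it with a probabilistic contrastive loss that predicts the next most relevant piece of knowledge, and applies a stochastic decoding strategy at inference to balance accuracy and diversity.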
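
As a rough illustration of what a contrastive objective for next-sentence selection can look like, the following is a generic InfoNCE-style loss. It is not taken from the paper, whose probabilistic variant may differ. All symbols are assumptions introduced here: $h_t$ is a representation of the context and the sentences selected so far, $s^{+}$ is the gold next sentence, $\mathcal{C}$ is the set of candidate sentences (including $s^{+}$), $\mathrm{sim}$ is a similarity function, and $\tau > 0$ is a temperature.

```latex
\mathcal{L}_t
  = -\log
    \frac{\exp\bigl(\mathrm{sim}(h_t, s^{+}) / \tau\bigr)}
         {\sum_{s \in \mathcal{C}} \exp\bigl(\mathrm{sim}(h_t, s) / \tau\bigr)}
```

Minimizing $\mathcal{L}_t$ raises the score of the gold sentence relative to the other candidates, which is the behavior the phrase "predict the next most relevant piece of knowledge" suggests.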

## Key findings

- KCS improves the overall accuracy of knowledge composition selection by 3.9% over competitive baselines.
- Using KCS for data augmentation yields improvements on the HotpotQA and 2WikiMultihopQA datasets.
- A stochastic decoding strategy at inference balances accuracy and diversity in the sampled knowledge compositions.

## Abstract

Multi-hop question answering faces substantial challenges due to data sparsity, which increases the likelihood of language models learning spurious patterns. To address this issue, prior research has focused on diversifying question generation through content planning and varied expression. However, these approaches often emphasize generating simple questions and neglect the integration of essential knowledge, such as relevant sentences within documents. This paper introduces Knowledge Composition Sampling (KCS), a framework designed to expand the diversity of generated multi-hop questions by sampling varied knowledge compositions within a given context. KCS models knowledge composition selection as a sentence-level conditional prediction task and uses a probabilistic contrastive loss to predict the next most relevant piece of knowledge. During inference, we employ a stochastic decoding strategy to balance accuracy and diversity. Compared to competitive baselines, KCS improves the overall accuracy of knowledge composition selection by 3.9%, and its application to data augmentation yields improvements on the HotpotQA and 2WikiMultihopQA datasets. Our code is available at: https://github.com/yangfanww/kcs.
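
The sampling idea in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the word-overlap scorer, and the choice of temperature plus nucleus (top-p) sampling are all hypothetical stand-ins for the trained selector and the paper's decoding strategy.

```python
import math
import random


def softmax(scores, temperature=1.0):
    """Turn raw scores into probabilities; temperature must be > 0."""
    m = max(scores)
    exps = [math.exp((s - m) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_overlap_scorer(sentences):
    """Hypothetical scorer: word overlap with already selected sentences.

    A real system would use a trained model here; this is only a toy.
    """
    tokens = [set(s.lower().split()) for s in sentences]

    def score(selected, candidate):
        if not selected:
            return 0.0  # nothing selected yet: all candidates score equally
        return float(sum(len(tokens[candidate] & tokens[j]) for j in selected))

    return score


def sample_composition(sentences, score_fn, max_hops=2,
                       temperature=1.0, top_p=0.9, rng=None):
    """Sample one knowledge composition as an ordered list of sentence indices.

    Each step scores the remaining sentences conditioned on those already
    selected, keeps the smallest high-probability set whose mass reaches
    top_p, and samples the next sentence from that set.
    """
    rng = rng or random.Random()
    selected = []
    remaining = list(range(len(sentences)))
    for _ in range(max_hops):
        if not remaining:
            break
        scores = [score_fn(selected, i) for i in remaining]
        probs = softmax(scores, temperature)
        ranked = sorted(zip(remaining, probs), key=lambda x: x[1], reverse=True)
        nucleus, mass = [], 0.0
        for idx, p in ranked:
            nucleus.append((idx, p))
            mass += p
            if mass >= top_p:
                break
        choice = rng.choices([i for i, _ in nucleus],
                             weights=[p for _, p in nucleus], k=1)[0]
        selected.append(choice)
        remaining.remove(choice)
    return selected


# Usage: draw several compositions from the same context.
context = [
    "Marie Curie was born in Warsaw.",
    "Warsaw is the capital of Poland.",
    "Poland joined the European Union in 2004.",
    "The Eiffel Tower is in Paris.",
]
scorer = make_overlap_scorer(context)
rng = random.Random(0)
for _ in range(3):
    print(sample_composition(context, scorer, max_hops=2, rng=rng))
```

In this sketch, lowering `top_p` or `temperature` pushes the sampler toward the highest-scoring sentence at each step (favoring accuracy), while raising them spreads probability over more candidates (favoring diversity).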

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2508.20567/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/2508.20567/full.md

## References

25 references — full list in the complete paper: https://tomesphere.com/paper/2508.20567/full.md

---
Source: https://tomesphere.com/paper/2508.20567