Why These Documents? Explainable Generative Retrieval with Hierarchical Category Paths

Sangam Lee; Ryang Heo; SeongKu Kang; Susik Yoon; Jinyoung Yeo; Dongha Lee

arXiv:2411.05572 · cs.IR · April 14, 2026

TL;DR

HyPE is a generative retrieval method that generates a hierarchical semantic category path before decoding the document identifier, which makes its retrieval decisions explainable and improves retrieval performance.

Contribution

The paper proposes HyPE, a novel hierarchical category path-enhanced generative retrieval approach that provides explanations and boosts retrieval accuracy.

Findings

01

HyPE achieves high explainability in document retrieval.

02

HyPE improves retrieval performance over baseline models.

03

HyPE effectively utilizes external semantic hierarchies for training.

Abstract

Generative retrieval directly decodes a document identifier (i.e., docid) in response to a query, making it impossible to provide users with an explanation that answers "why is this document retrieved?". To address this limitation, we propose Hierarchical Category Path-Enhanced Generative Retrieval (HyPE), which enhances explainability by first generating hierarchical category paths step-by-step and then decoding the docid. By leveraging hierarchical category paths, which progress from broader to more specific semantic categories, HyPE can provide a detailed explanation for its retrieval decision. For training, HyPE constructs category paths with an external high-quality semantic hierarchy, leverages an LLM to select appropriate candidate paths for each document, and optimizes the generative retrieval model with the path-augmented dataset. During inference, HyPE utilizes a path-aware ranking strategy to…
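
As a rough illustration of the "category path first, then docid" output order described in the abstract, the toy Python sketch below walks a small hierarchy from a leaf category up to its root and serializes the broad-to-specific path in front of a docid. The hierarchy, the separator strings, the docid, and the function names are all hypothetical, chosen only for illustration; they are not taken from the paper or its repository, and the actual target format used by HyPE may differ.

```python
from typing import Dict, List, Optional

# Hypothetical toy hierarchy: child category -> parent category (None marks a root).
PARENT: Dict[str, Optional[str]] = {
    "Science": None,
    "Physics": "Science",
    "Quantum mechanics": "Physics",
}


def category_path(leaf: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the path from the broadest category down to `leaf`."""
    path: List[str] = []
    node: Optional[str] = leaf
    while node is not None:
        path.append(node)
        node = parent[node]  # climb one level toward the root
    return path[::-1]  # reverse so the path runs broad -> specific


def build_target(path: List[str], docid: str, sep: str = " > ") -> str:
    """Serialize the path followed by the docid (layout is an assumption)."""
    return f"{sep.join(path)} || {docid}"


path = category_path("Quantum mechanics", PARENT)
target = build_target(path, "doc-42")  # "doc-42" is a made-up identifier
```

In this sketch the path portion of `target` plays the role of the human-readable explanation, and the trailing docid is what a retrieval system would ultimately return.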

Code & Models

Repositories

augustinlib/hype-why-these-documents (GitHub)