Generative Retrieval for Book search
Yubao Tang, Ruqing Zhang, Jiafeng Guo, Maarten de Rijke, Shihao Liu, Shuaiqing Wang, Dawei Yin, Xueqi Cheng

TL;DR
This paper introduces GBS, a generative retrieval framework for book search that leverages data augmentation and outline-oriented encoding to improve retrieval accuracy on complex, hierarchical book data.
Contribution
The paper proposes a novel GBS framework with data augmentation and outline-oriented encoding to enhance generative retrieval for books.
Findings
GBS achieves 9.8% higher MRR@20 than baseline methods.
Outline-oriented encoding improves handling of long, hierarchical book information.
Data augmentation strategies enhance model training with diverse pseudo-queries.
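For readers unfamiliar with the metric cited in the first finding, MRR@20 (mean reciprocal rank, truncated at rank 20) averages, over all queries, the reciprocal of the rank at which the first relevant item appears; a query contributes 0 if no relevant item appears in the top 20. The sketch below is a generic illustration of that definition, not the paper's evaluation code. The function name `mrr_at_k` and the book identifiers are hypothetical.

```python
from typing import Hashable, Sequence, Set


def mrr_at_k(
    rankings: Sequence[Sequence[Hashable]],
    relevant: Sequence[Set[Hashable]],
    k: int = 20,
) -> float:
    """Mean reciprocal rank truncated at k.

    rankings[i] is the ranked list of retrieved ids for query i.
    relevant[i] is the set of ids judged relevant for query i.
    """
    if len(rankings) != len(relevant):
        raise ValueError("rankings and relevant must have the same length")
    if not rankings:
        return 0.0

    total = 0.0
    for ranked, rel in zip(rankings, relevant):
        # Only the first relevant hit within the top k counts.
        for rank, item_id in enumerate(ranked[:k], start=1):
            if item_id in rel:
                total += 1.0 / rank
                break
    return total / len(rankings)


# Hypothetical example: query 1 hits at rank 2, query 2 has no hit.
score = mrr_at_k([["b1", "b2"], ["b3", "b4"]], [{"b2"}, {"b9"}], k=20)
```

Here `score` is (1/2 + 0) / 2 = 0.25. The summary does not say whether the 9.8% improvement is relative or absolute, so this sketch makes no assumption about that.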
Abstract
In book search, relevant book information should be returned in response to a query. Books contain complex, multi-faceted information such as metadata, outlines, and main text, where the outline provides hierarchical information between chapters and sections. Generative retrieval (GR) is a new retrieval paradigm that consolidates corpus information into a single model to generate identifiers of documents that are relevant to a given query. How can GR be applied to book search? Directly applying GR to book search is a challenge due to the unique characteristics of book search: The model needs to retain the complex, multi-faceted information of the book, which increases the demand for labeled data. Splitting book information and treating it as a collection of separate segments for learning might result in a loss of hierarchical information. We propose an effective Generative retrieval…
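To make the idea of "generating identifiers" concrete: generative retrieval systems commonly restrict decoding to a prefix tree (trie) of valid identifiers, so the model can only emit an identifier that exists in the corpus. The sketch below shows that general technique with greedy decoding. It is not the GBS method; the identifier format, the `score` callback (a stand-in for a model's next-token scores), and all names are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Sequence

END = "<eos>"  # hypothetical end-of-identifier marker


def build_trie(identifiers: Sequence[Sequence[str]]) -> Dict:
    """Build a nested-dict prefix tree over tokenized identifiers."""
    root: Dict = {}
    for tokens in identifiers:
        node = root
        for tok in list(tokens) + [END]:
            node = node.setdefault(tok, {})
    return root


def constrained_greedy_decode(
    trie: Dict,
    score: Callable[[List[str], str], float],
    max_len: int = 32,
) -> List[str]:
    """Greedily pick the best-scoring token among those the trie allows.

    max_len should exceed the longest identifier; otherwise the result
    may be a truncated prefix rather than a complete identifier.
    """
    prefix: List[str] = []
    node = trie
    for _ in range(max_len):
        if not node:  # empty trie: nothing to decode
            break
        best = max(node, key=lambda tok: score(prefix, tok))
        if best == END:
            break
        prefix.append(best)
        node = node[best]
    return prefix


# Toy identifiers and a toy scorer standing in for a trained model.
ids = [["book", "12", "ch", "3"], ["book", "12", "ch", "4"], ["book", "7"]]
toy_scores = {"book": 1.0, "12": 0.9, "7": 0.2, "ch": 0.5, "4": 0.8, "3": 0.1, END: 0.0}
decoded = constrained_greedy_decode(
    build_trie(ids), lambda prefix, tok: toy_scores.get(tok, -1.0)
)
```

With these toy scores, `decoded` is `["book", "12", "ch", "4"]`. Practical systems typically use beam search rather than greedy decoding so that several candidate identifiers can be ranked.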
Taxonomy
Topics: Information Retrieval and Search Behavior · Semantic Web and Ontologies · Natural Language Processing Techniques
Methods: Softmax · Attention Is All You Need
