Enhancing patent retrieval using automated patent summarization
Eleni Kamateri, Renukswamy Chikkamath, Michail Salampasis, Linda Andersson, and Markus Endres

TL;DR
This paper uses extractive and abstractive summarization to generate concise patent summaries that serve as surrogate queries, significantly improving prior-art retrieval over queries built from entire patent sections.
Contribution
It applies recent extractive and abstractive summarization methods to patent retrieval and shows that the resulting summaries work as surrogate queries that improve retrieval effectiveness.
Findings
Summarization-based queries outperform conventional queries built from entire patent sections in prior-art retrieval effectiveness.
Automated summaries reduce query length while maintaining or improving accuracy.
Summarization methods are effective across three benchmark patent datasets.
Abstract
Effective query formulation is a key challenge in long-document Information Retrieval (IR). This challenge is particularly acute in domain-specific contexts like patent retrieval, where documents are lengthy, linguistically complex, and encompass multiple interrelated technical topics. In this work, we present the application of recent extractive and abstractive summarization methods for generating concise, purpose-specific summaries of patent documents. We further assess the utility of these automatically generated summaries as surrogate queries across three benchmark patent datasets and compare their retrieval performance against conventional approaches that use entire patent sections. Experimental results show that summarization-based queries significantly improve prior-art retrieval effectiveness, highlighting their potential as an efficient alternative to traditional query…
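The surrogate-query idea in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the summarizers and retrieval model used in the paper are not named on this page, so the sketch swaps in a simple frequency-based extractive heuristic and a plain BM25 scorer. The function names (`extractive_summary`, `bm25_rank`), the parameter defaults, and the toy patent text and corpus are all hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extractive_summary(text, max_sentences=2):
    """Keep the sentences whose tokens are, on average, most frequent in the text.

    Assumption: a stand-in heuristic, not one of the summarizers from the paper.
    """
    sentences = split_sentences(text)
    tf = Counter(tokenize(text))

    def score(sentence):
        tokens = tokenize(sentence)
        return sum(tf[t] for t in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])  # restore original sentence order
    return " ".join(sentences[i] for i in keep)


def bm25_rank(query, corpus, k1=1.2, b=0.75):
    """Rank document ids in `corpus` (id -> text) by BM25 score for `query`."""
    docs = {doc_id: tokenize(text) for doc_id, text in corpus.items()}
    n = len(docs)
    avgdl = sum(len(tokens) for tokens in docs.values()) / n
    df = Counter(term for tokens in docs.values() for term in set(tokens))
    query_terms = set(tokenize(query))

    scores = {}
    for doc_id, tokens in docs.items():
        tf = Counter(tokens)
        total = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avgdl)
            total += idf * tf[term] * (k1 + 1) / norm
        scores[doc_id] = total
    return sorted(scores, key=scores.get, reverse=True)


# Toy data (hypothetical): one query patent and a three-document collection.
patent_text = (
    "A battery pack includes a cooling plate. "
    "The cooling plate circulates coolant between battery cells. "
    "The housing is painted blue. "
    "Fasteners secure the lid."
)
corpus = {
    "D1": "Liquid coolant flows through a plate placed between battery cells to remove heat.",
    "D2": "A paint composition for coating metal housing in blue color.",
    "D3": "A bicycle frame made of carbon fiber tubes.",
}

summary = extractive_summary(patent_text, max_sentences=2)  # surrogate query
ranking = bm25_rank(summary, corpus)
print(summary)
print(ranking)
```

The summary is shorter than the full text and drops the sentences that are off-topic for the invention, which is the intuition behind using it as a query in place of an entire patent section. Whether this helps on real collections depends on the summarizer and the retrieval model, which this sketch does not reproduce.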
Taxonomy
Topics: Intellectual Property and Patents · Biomedical Text Mining and Ontologies · Advanced Text Analysis Techniques
