TL;DR
This paper introduces FiNE-Patents, a dataset and method for fine-grained, feature-level patent novelty prediction using large language models, improving interpretability and robustness over traditional binary classification.
Contribution
The work presents a new dataset and a shift from claim-level binary classification to feature-level passage retrieval and reasoning for patent novelty assessment.
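To make the shift from claim-level to feature-level assessment concrete, here is a minimal sketch of a simple similarity baseline. It is not the paper's method: it uses a toy bag-of-words cosine similarity as a stand-in for an embedding model, and the function names, stopword list, example texts, and the 0.3 threshold are all hypothetical. The aggregation rule (a claim is treated as novel if at least one feature has no matching passage) is likewise an assumption made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny stopword list; a real system would use a proper one.
STOPWORDS = {"a", "an", "the", "of", "is"}


def tokenize(text):
    """Lowercase, split on non-alphanumerics, drop stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_passages(features, passages):
    """For each claim feature, return (index of best passage, similarity)."""
    passage_vecs = [Counter(tokenize(p)) for p in passages]
    results = []
    for feature in features:
        fvec = Counter(tokenize(feature))
        scores = [cosine(fvec, pv) for pv in passage_vecs]
        best = max(range(len(passages)), key=scores.__getitem__)
        results.append((best, scores[best]))
    return results


def claim_is_novel(results, threshold=0.3):
    """Assumed rule: novel if any feature lacks a sufficiently similar passage."""
    return any(score < threshold for _, score in results)


# Invented example: one claim split into features, one prior art document
# split into passages.
features = [
    "a housing made of aluminium",
    "a temperature sensor mounted inside the housing",
    "a wireless transmitter",
]
passages = [
    "The device comprises an aluminium housing.",
    "A temperature sensor is mounted inside the housing of the device.",
    "The cover is fastened with screws.",
]

results = retrieve_passages(features, passages)
for feature, (idx, score) in zip(features, results):
    print(f"{feature!r} -> passage {idx} (similarity {score:.2f})")
print("novel:", claim_is_novel(results))
```

Unlike a single claim-level label, the per-feature output shows which feature is unmatched (here, the wireless transmitter), which is the kind of interpretability the feature-level formulation aims for.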
Findings
LLMs outperform embedding-based baselines in passage retrieval.
LLMs are more robust against spurious correlations in novelty prediction.
The dataset and code are publicly released for further research.
Abstract
Novelty assessment is a critical yet complex task in the examination process for patent acceptance, requiring examiners to determine whether an invention is disclosed in a prior art document. The process involves intricate matching between specific features of a patent claim and passages in the prior art. While prior work has approached novelty prediction primarily as a binary classification task at the claim level, we argue that this formulation is susceptible to spurious correlations and lacks the granularity required for practical application. In this work, we introduce FiNE-Patents (Fine-grained Novelty Examination of Patents), a novel dataset comprising 3,658 first patent claims annotated with fine-grained, feature-level prior art references extracted from European Search Opinion (ESOP) documents. We propose shifting the evaluation paradigm from simple binary classification to a…
