Is It Novel and Why? Fine-Grained Patent Novelty Prediction Based on Passage Retrieval

Valentin Knappich; Anna Hätty; Simon Razniewski; Annemarie Friedrich

arXiv:2605.02392 · cs.CL · May 5, 2026

TL;DR

This paper introduces FiNE-Patents, a dataset of 3,658 first patent claims annotated with feature-level prior art references, and uses large language models for fine-grained novelty prediction, aiming for better interpretability and robustness than claim-level binary classification.

Contribution

The work presents a new dataset and a shift from claim-level binary classification to feature-level passage retrieval and reasoning for patent novelty assessment.
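
To make the feature-level framing concrete, here is a minimal sketch of how per-feature retrieval results could be represented and rolled up into a claim-level decision. It is an illustration under stated assumptions, not the authors' implementation: the names `FeatureAssessment`, `claim_is_novel`, and `undisclosed_features` are hypothetical, and the aggregation rule (a claim counts as novel if at least one feature has no disclosing passage in the prior art document) is an assumption based on the general novelty standard rather than on anything stated in this summary.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FeatureAssessment:
    """One feature of a claim plus the prior-art passages judged to disclose it.

    Hypothetical structure for illustration; not the FiNE-Patents schema.
    """
    feature: str
    disclosing_passages: List[str] = field(default_factory=list)

    @property
    def disclosed(self) -> bool:
        # A feature counts as disclosed if at least one passage was retrieved for it.
        return len(self.disclosing_passages) > 0


def undisclosed_features(assessments: List[FeatureAssessment]) -> List[str]:
    """Return the features for which no disclosing passage was found."""
    return [a.feature for a in assessments if not a.disclosed]


def claim_is_novel(assessments: List[FeatureAssessment]) -> bool:
    """Assumed rule: the claim is novel if any feature is undisclosed."""
    return len(undisclosed_features(assessments)) > 0


# Made-up example claim split into two features.
example = [
    FeatureAssessment("a housing", ["paragraph 12"]),
    FeatureAssessment("a sensor mounted inside the housing", []),
]

print(claim_is_novel(example))
print(undisclosed_features(example))
```

Unlike a single binary label, this representation also answers the "why": the list of undisclosed features points to the parts of the claim that the prior art document does not cover.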

Findings

01. LLMs outperform embedding-based baselines in passage retrieval.

02. LLMs are more robust against spurious correlations in novelty prediction.

03. The dataset and code are publicly released for further research.

Abstract

Novelty assessment is a critical yet complex task in the examination process for patent acceptance, requiring examiners to determine whether an invention is disclosed in a prior art document. The process involves intricate matching between specific features of a patent claim and passages in the prior art. While prior work has approached novelty prediction primarily as a binary classification task at the claim level, we argue that this formulation is susceptible to spurious correlations and lacks the granularity required for practical application. In this work, we introduce FiNE-Patents (Fine-grained Novelty Examination of Patents), a novel dataset comprising 3,658 first patent claims annotated with fine-grained, feature-level prior art references extracted from European Search Opinion (ESOP) documents. We propose shifting the evaluation paradigm from simple binary classification to a…
