PEDANTIC: A Dataset for the Automatic Examination of Definiteness in Patent Claims
Valentin Knappich, Annemarie Friedrich, Anna Hätty, Simon Razniewski

TL;DR
PEDANTIC is a new dataset of 14,000 US patent claims annotated with reasons for indefiniteness, created using an automated pipeline with LLMs, to improve automatic patent definiteness examination.
Contribution
The paper introduces PEDANTIC, the first large-scale annotated dataset for patent definiteness, generated via an automated pipeline with validation, facilitating research in automatic patent examination.
Findings
LLMs often struggle to outperform simple models in definiteness prediction.
The pipeline produces high-quality annotations, as validated by a human study.
LLMs can identify reasons for indefiniteness, but this does not always improve prediction accuracy.
Abstract
Patent claims define the scope of protection for an invention. If there are ambiguities in a claim, it is rejected by the patent office. In the US, this is referred to as indefiniteness (35 U.S.C. § 112(b)) and is among the most frequent reasons for patent application rejection. The development of automatic methods for patent definiteness examination has the potential to make patent drafting and examination more efficient, but no annotated dataset has been published to date. We introduce PEDANTIC (Patent Definiteness Examination Corpus), a novel dataset of 14k US patent claims from patent applications relating to Natural Language Processing (NLP), annotated with reasons for indefiniteness. We construct PEDANTIC using a fully automatic pipeline that retrieves office action documents from the USPTO and uses Large Language Models (LLMs) to extract the reasons for indefiniteness. A human…
Taxonomy
Topics: Intellectual Property and Patents · Explainable Artificial Intelligence (XAI) · Law, AI, and Intellectual Property
