Natural Language Processing in the Patent Domain: A Survey
Lekang Jiang, Stephan Goetz

TL;DR
This survey reviews NLP techniques, especially large language models, applied to patent documents, highlighting the unique challenges patents pose and the potential of these techniques for patent analysis and generation tasks.
Contribution
It provides a comprehensive overview of patent-specific NLP challenges, tasks, and the application of LLMs, serving as a guide for future research in this specialized area.
Findings
Identifies key structural and linguistic features of patents.
Maps NLP tasks to patent analysis and generation.
Highlights under-explored potential of LLMs in patents.
Abstract
Patents, which encapsulate crucial technical and legal information in text form and referenced drawings, present a rich domain for natural language processing (NLP) applications. As NLP technologies evolve, large language models (LLMs) have demonstrated outstanding capabilities in general text processing and generation tasks. However, the application of LLMs in the patent domain remains under-explored and under-developed due to the complexity of patents, particularly their language and legal framework. Understanding the unique characteristics of patent documents and related research in the patent domain becomes essential for researchers to apply these tools effectively. Therefore, this paper aims to equip NLP researchers with the essential knowledge to navigate this complex domain efficiently. We introduce the relevant fundamental aspects of patents to provide solid background…
Taxonomy
Topics: Intellectual Property and Patents
Methods
Focus
