Patent Sentiment Analysis to Highlight Patent Paragraphs
Renukswamy Chikkamath, Vishvapalsinhji Ramsinh Parmar, Christoph Hewel, and Markus Endres

TL;DR
This paper introduces a 150k-sample dataset and baseline machine learning models for automatically highlighting patent paragraphs, with the aim of speeding up the semantic annotation step of patent analysis.
Contribution
The work presents a novel 150k-sample dataset, baseline models, and open-source tools for automated patent paragraph highlighting, with future plans for deep learning enhancements.
Findings
Developed a 150k-sample patent dataset
Created baseline machine learning models for highlighting
Open-sourced dataset and code for community use
Abstract
Identifying distinct semantic annotations in a patent document is an interesting research problem. Text annotation helps patent practitioners, such as examiners and patent attorneys, quickly identify the key arguments of an invention and mark up a patent text in a timely manner. In manual patent analysis, it is common practice to mark paragraphs according to their semantic content to improve readability. This semantic annotation process is laborious and time-consuming. To alleviate this problem, we propose a novel dataset for training Machine Learning algorithms to automate the highlighting process. The contributions of this work are: i) we developed a multi-class, novel dataset of 150k samples by traversing USPTO patents over a decade, ii) articulated statistics and distributions of data using imperative exploratory data analysis, iii)…
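To make the task concrete, the following is a minimal sketch of a multi-class paragraph classifier of the kind a baseline might use. It is not the authors' method: the abstract shown here is truncated before it names any model, so multinomial Naive Bayes is swapped in purely for illustration. The label names (`advantage`, `problem`, `other`), the class `NaiveBayesParagraphClassifier`, and the six training paragraphs are all hypothetical and are not taken from the dataset.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesParagraphClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing.

    Hypothetical illustration only; not the model used in the paper.
    """

    def fit(self, paragraphs, labels):
        self.class_counts = Counter(labels)        # paragraphs per class
        self.word_counts = defaultdict(Counter)    # token counts per class
        self.vocab = set()
        for text, label in zip(paragraphs, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for this sketch (not from the 150k dataset).
TRAIN = [
    ("The invention reduces power consumption and improves reliability.", "advantage"),
    ("This arrangement advantageously lowers manufacturing cost.", "advantage"),
    ("Conventional devices suffer from overheating and frequent failure.", "problem"),
    ("A drawback of known systems is their high latency.", "problem"),
    ("FIG. 2 is a block diagram of the apparatus.", "other"),
    ("The housing is connected to the base by a hinge.", "other"),
]

texts, labels = zip(*TRAIN)
model = NaiveBayesParagraphClassifier().fit(texts, labels)

print(model.predict("Known devices suffer from a drawback: overheating."))
print(model.predict("The design improves reliability and lowers cost."))
```

With a real labelled corpus, the predicted label for each paragraph could drive the highlight colour shown to the examiner. A word-count model like this ignores word order and context, so it is only a starting point for comparison.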
Taxonomy
Topics: Machine Learning in Materials Science · Intellectual Property and Patents · Biomedical Text Mining and Ontologies
