# Automating the search for a patent's prior art with a full text similarity search

**Authors:** Lea Helmers, Franziska Horn, Franziska Biegler, Tim Oppermann, Klaus-Robert Müller

arXiv: 1901.03136 · 2019-03-06

## TL;DR

This paper presents an automated method that uses machine learning and NLP to compare the full text of a patent application against existing patents, speeding up the prior art search and improving the quality of its results.

## Contribution

It introduces a full-text similarity search for patents, which the authors report to be faster than the current semi-automatic keyword-based search and to return higher-quality prior art results.

## Key findings

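- The automated full-text comparison accelerates the prior art search, which currently relies on semi-automatically composed keyword queries.
- Several feature extraction and document comparison approaches are evaluated for detecting similar inventions.
- The quality of the current search process is assessed using a domain expert's ratings, and the automated approach is found to improve the quality of the retrieved prior art.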

## Abstract

More than ever, technical inventions are the symbol of our society's advance. Patents guarantee their creators protection against infringement. For an invention being patentable, its novelty and inventiveness have to be assessed. Therefore, a search for published work that describes similar inventions to a given patent application needs to be performed. Currently, this so-called search for prior art is executed with semi-automatically composed keyword queries, which is not only time consuming, but also prone to errors. In particular, errors may systematically arise by the fact that different keywords for the same technical concepts may exist across disciplines. In this paper, a novel approach is proposed, where the full text of a given patent application is compared to existing patents using machine learning and natural language processing techniques to automatically detect inventions that are similar to the one described in the submitted document. Various state-of-the-art approaches for feature extraction and document comparison are evaluated. In addition to that, the quality of the current search process is assessed based on ratings of a domain expert. The evaluation results show that our automated approach, besides accelerating the search process, also improves the search results for prior art with respect to their quality.
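To make the idea of a full-text similarity search concrete, here is a minimal sketch in Python. It is not the authors' implementation: this summary does not say which feature extraction or comparison methods the paper uses, so tf-idf bag-of-words features and cosine similarity are assumed here purely for illustration. The function names (`tokenize`, `tfidf_vectors`, `cosine`, `rank_prior_art`), the toy patent texts, and the ids `P1` to `P3` are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse tf-idf vector (dict: term -> weight) per document."""
    token_lists = [tokenize(d) for d in docs]
    n_docs = len(docs)
    # Document frequency: number of documents in which each term occurs.
    df = Counter(term for tokens in token_lists for term in set(tokens))
    vectors = []
    for tokens in token_lists:
        tf = Counter(tokens)
        # Smoothed idf (an assumed weighting): terms that occur in every
        # document still keep a small positive weight.
        vectors.append(
            {t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_prior_art(application, corpus):
    """Rank existing patents (dict: id -> full text) by similarity to the application."""
    ids = list(corpus)
    vectors = tfidf_vectors([application] + [corpus[i] for i in ids])
    query, candidates = vectors[0], vectors[1:]
    scores = [(i, cosine(query, vec)) for i, vec in zip(ids, candidates)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy data, invented for this example only.
corpus = {
    "P1": "A rotor blade for a wind turbine with a reinforced composite shell.",
    "P2": "A pharmaceutical composition comprising an antibody for treating tumors.",
    "P3": "Wind turbine blade comprising a composite spar and a lightning receptor.",
}
application = "Composite blade for a wind turbine rotor with improved shell stiffness."

ranking = rank_prior_art(application, corpus)
for patent_id, score in ranking:
    print(f"{patent_id}: {score:.3f}")
```

In this sketch the whole application text acts as the query, so no keywords have to be chosen by hand. The sketch omits stop-word removal and any handling of synonyms across disciplines, which the abstract names as a source of systematic errors in keyword queries; plain tf-idf shares that weakness, so a real system would likely need richer features.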

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1901.03136/full.md

## Figures

11 figures with captions in the complete paper: https://tomesphere.com/paper/1901.03136/full.md

## References

73 references — full list in the complete paper: https://tomesphere.com/paper/1901.03136/full.md

---
Source: https://tomesphere.com/paper/1901.03136