Introduction of a novel word embedding approach based on technology labels extracted from patent data

Mark Standke; Abdullah Kiwan; Annalena Lange; Silvan Berg

arXiv:2102.00425 · cs.CL · September 4, 2024

Open Access

TL;DR

This paper introduces a new word embedding method leveraging patent technology labels to generate accurate, language-independent vectors for technical terms, addressing the challenge of diverse patent language.

Contribution

The paper presents a word embedding approach based on statistical analysis of human-labeled patent data. It is designed for patent terminology and aims to make finding synonyms for patent searches easier.

Findings

01

First qualitative results are presented to illustrate the approach.

02

The resulting algorithm is a development of the former EQMania UG (eqmania.com).

03

The algorithm can be tested online at eqalice.com until April 2021.

Abstract

Patent language is becoming increasingly diverse, which makes finding synonyms for patent searches ever more challenging. Moreover, most approaches for dealing with diverse patent language rely on manual search and human intuition. This paper introduces a word embedding approach that uses statistical analysis of human-labeled data to produce accurate, language-independent word vectors for technical terms. It focuses on explaining the idea behind the statistical analysis and presents first qualitative results. The resulting algorithm is a development of the former EQMania UG (eqmania.com) and can be tested at eqalice.com until April 2021.
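
The abstract does not spell out the statistical analysis, so the following is only a minimal sketch of one way word vectors could be derived from human-assigned technology labels, not the authors' algorithm. It assumes each patent comes with a set of extracted terms and a set of labels, represents every term by the normalized distribution of labels it co-occurs with, and compares terms by cosine similarity. The toy data, the label codes, and the function names `label_vectors`, `cosine`, and `nearest` are all hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical toy data: (terms found in a patent, labels assigned to it by humans).
# The label codes are illustrative assumptions, not taken from the paper.
PATENTS = [
    ({"screw", "bolt"}, {"F16B"}),
    ({"bolt", "nut"}, {"F16B"}),
    ({"screw", "thread"}, {"F16B", "B23G"}),
    ({"antenna", "receiver"}, {"H01Q"}),
    ({"antenna", "aerial"}, {"H01Q"}),
    ({"aerial", "receiver"}, {"H01Q", "H04B"}),
]


def label_vectors(patents):
    """Map each term to a normalized distribution over the labels it co-occurs with."""
    counts = defaultdict(Counter)
    for terms, labels in patents:
        for term in terms:
            counts[term].update(labels)  # one count per (term, label) co-occurrence
    vectors = {}
    for term, label_counts in counts.items():
        total = sum(label_counts.values())
        vectors[term] = {label: n / total for label, n in label_counts.items()}
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(label, 0.0) for label, weight in u.items())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def nearest(term, vectors, k=3):
    """Return the k terms whose label vectors are most similar to the given term."""
    scores = [
        (other, cosine(vectors[term], vec))
        for other, vec in vectors.items()
        if other != term
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:k]


if __name__ == "__main__":
    vecs = label_vectors(PATENTS)
    for candidate, score in nearest("antenna", vecs, k=2):
        print(f"{candidate}: {score:.3f}")
```

Because the vector dimensions in this sketch are labels rather than context words, terms from patents written in different languages would land in the same space as long as the patents share one labeling scheme, which is one plausible reading of the "language-independent" claim.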

Taxonomy

Topics: Intellectual Property and Patents · Biomedical Text Mining and Ontologies