Word Embeddings for Chemical Patent Natural Language Processing

Camilo Thorne; Saber Akhondi

arXiv:2010.12912·cs.CL·October 27, 2020·1 cites

Open Access

TL;DR

This paper evaluates word embeddings trained on chemical patents against known biomedical embeddings, showing that the patent embeddings perform better both intrinsically and extrinsically, and that contextualized embeddings can yield predictive models of reasonable performance from a relatively small gold standard.

Contribution

The paper introduces word embeddings specific to chemical patents and assesses their performance, finding that they outperform biomedical embeddings and that contextualized embeddings are effective in this domain.

Findings

01

Chemical patent embeddings outperform biomedical embeddings.

02

Contextualized embeddings yield predictive models of reasonable performance from a relatively small gold standard.

03

Embeddings show strong intrinsic and extrinsic performance.

Abstract

We evaluate chemical patent word embeddings against known biomedical embeddings and show that they outperform the latter extrinsically and intrinsically. We also show that using contextualized embeddings can induce predictive models of reasonable performance for this domain over a relatively small gold standard.

Taxonomy

Topics: Biomedical Text Mining and Ontologies · Intellectual Property and Patents · Computational Drug Discovery Methods