Word Embeddings for Chemical Patent Natural Language Processing
Camilo Thorne, Saber Akhondi

TL;DR
This paper evaluates chemical patent word embeddings against known biomedical embeddings and finds that they perform better in both intrinsic and extrinsic evaluation. It also shows that contextualized embeddings can yield predictive models of reasonable performance in this domain, even with a relatively small gold standard.
Contribution
The paper compares chemical patent word embeddings with known biomedical embeddings, both intrinsically and extrinsically, and shows that contextualized embeddings can support predictive models of reasonable performance over a relatively small gold standard.
Findings
Chemical patent embeddings outperform biomedical embeddings.
Contextualized embeddings yield predictive models of reasonable performance over a relatively small gold standard.
The advantage of chemical patent embeddings holds in both intrinsic and extrinsic evaluation.
Abstract
We evaluate chemical patent word embeddings against known biomedical embeddings and show that they outperform the latter extrinsically and intrinsically. We also show that using contextualized embeddings can induce predictive models of reasonable performance for this domain over a relatively small gold standard.
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Intellectual Property and Patents · Computational Drug Discovery Methods
