TL;DR
This paper introduces PatentSBERTa, a hybrid deep NLP model that uses SBERT embeddings for patent similarity measurement and classification, achieving 54% accuracy and an F1 score above 66% in multi-label patent classification.
Contribution
The study presents a novel hybrid framework combining SBERT-based embeddings with a simple KNN classifier for efficient patent similarity assessment and classification, outperforming existing methods.
Findings
Achieved 54% accuracy and >66% F1 score in multi-label patent classification.
Validated that patent-to-patent (p2p) similarity effectively captures technological features.
Demonstrated the framework's usefulness for semantic patent search and technology intelligence.
Abstract
This study provides an efficient approach for using text data to calculate patent-to-patent (p2p) technological similarity, and presents a hybrid framework for leveraging the resulting p2p similarity for applications such as semantic search and automated patent classification. We create embeddings using Sentence-BERT (SBERT) based on patent claims. We leverage SBERT's efficiency in creating embedding distance measures to map p2p similarity in large sets of patent data. We deploy our framework for classification with a simple Nearest Neighbors (KNN) model that predicts Cooperative Patent Classification (CPC) of a patent based on the class assignment of the K patents with the highest p2p similarity. We thereby validate that the p2p similarity captures their technological features in terms of CPC overlap, and at the same time demonstrate the usefulness of this approach for automatic patent…
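The classification step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the toy vectors stand in for SBERT embeddings of patent claims, and the function names (`cosine_similarity_matrix`, `predict_cpc`), the `min_share` voting threshold, and the example CPC labels are all assumptions introduced here. The abstract does not state how the neighbors' class assignments are aggregated, so the majority-share rule below is only one plausible choice.

```python
import numpy as np
from collections import Counter


def cosine_similarity_matrix(queries, corpus):
    """Cosine similarity between each query row and each corpus row."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    c = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    return q @ c.T


def predict_cpc(query_emb, corpus_emb, corpus_labels, k=3, min_share=0.5):
    """Predict a set of CPC labels from the K most similar patents.

    Assumption: a label is kept when at least `min_share` of the K
    neighbors carry it. The source does not specify this rule.
    """
    sims = cosine_similarity_matrix(query_emb[None, :], corpus_emb)[0]
    top_k = np.argsort(-sims)[:k]  # indices of the K highest similarities
    votes = Counter(label for i in top_k for label in corpus_labels[i])
    return sorted(label for label, n in votes.items() if n / k >= min_share)


# Toy 3-dimensional vectors standing in for SBERT claim embeddings.
corpus_emb = np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.1],
    [0.0, 1.0, 0.0],
    [0.0, 0.9, 0.2],
])
# Hypothetical multi-label CPC assignments for the five corpus patents.
corpus_labels = [
    {"H04L"},
    {"H04L", "G06F"},
    {"G06F"},
    {"A61K"},
    {"A61K", "C07D"},
]
query_emb = np.array([0.95, 0.05, 0.0])

predicted = predict_cpc(query_emb, corpus_emb, corpus_labels, k=3)
print(predicted)
```

In practice the embeddings would come from an SBERT model applied to claim text, and for large patent sets an approximate nearest-neighbor index would likely replace the dense similarity matrix used here.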
Taxonomy
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Weight Decay · WordPiece · Linear Warmup With Linear Decay · Attention Dropout · Byte Pair Encoding · BERT
