Patent Representation Learning via Self-supervision
You Zuo (ALMAnaCH), Kim Gerdes (LISN), Eric Villemonte de La Clergerie (ALMAnaCH), Benoît Sagot (ALMAnaCH)

TL;DR
This paper introduces a self-supervised contrastive learning framework for patent embeddings that leverages intra-document section-based views, improving retrieval and classification without relying on annotations.
Contribution
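It proposes section-based augmentation for patent embedding learning: different sections of the same patent serve as contrastive views, which addresses the overly uniform embeddings produced by SimCSE-style dropout augmentation and exploits patent structure for better representations.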
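As a rough illustration of section-based views, the sketch below turns each patent into one positive pair made of two of its own sections. The field names (`abstract`, `claims`, `background`), the choice of which section acts as anchor, and the helper `make_section_pairs` are assumptions for illustration only; the source does not specify how pairs are assembled.

```python
from typing import Dict, List, Tuple


def make_section_pairs(
    patents: List[Dict[str, str]],
    anchor: str = "abstract",  # assumed anchor section, not confirmed by the source
    view: str = "claims",      # assumed second view, not confirmed by the source
) -> List[Tuple[str, str]]:
    """Return (anchor_text, view_text) pairs, one per patent.

    Patents missing either section (or with an empty one) are skipped,
    since they cannot provide two views of the same document.
    """
    pairs = []
    for doc in patents:
        anchor_text = doc.get(anchor, "").strip()
        view_text = doc.get(view, "").strip()
        if anchor_text and view_text:
            pairs.append((anchor_text, view_text))
    return pairs


# Toy records; real patents would carry full section text.
toy_patents = [
    {"abstract": "A battery cell with a ceramic separator.",
     "claims": "1. A battery cell comprising a ceramic separator.",
     "background": "Lithium cells can overheat."},
    {"abstract": "A drone rotor guard.",
     "claims": "",  # empty section: this patent is skipped for the claims view
     "background": "Exposed rotors are hazardous."},
]

claim_pairs = make_section_pairs(toy_patents, view="claims")
background_pairs = make_section_pairs(toy_patents, view="background")
```

Unlike dropout augmentation, where both views come from the same input text, the two texts in each pair here differ in wording and structure while describing the same invention.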
Findings
Matches or surpasses citation- and IPC-supervised baselines in prior-art retrieval and classification.
Section-specific embeddings improve task performance, with claims aiding retrieval and background aiding classification.
The method is scalable and avoids reliance on brittle annotations.
Abstract
This paper presents a simple yet effective contrastive learning framework for learning patent embeddings by leveraging multiple views from within the same document. We first identify a patent-specific failure mode of SimCSE-style dropout augmentation: it produces overly uniform embeddings that lose semantic cohesion. To remedy this, we propose section-based augmentation, where different sections of a patent (e.g., abstract, claims, background) serve as complementary views. This design introduces natural semantic and structural diversity, mitigating over-dispersion and yielding embeddings that better preserve both global structure and local continuity. On large-scale benchmarks, our fully self-supervised method matches or surpasses citation- and IPC-supervised baselines in prior-art retrieval and classification, while avoiding reliance on brittle or incomplete annotations. Our analysis…
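The contrastive objective described in the abstract can be sketched with a standard in-batch InfoNCE loss, where the embedding of one section is pulled toward the embedding of another section of the same patent and pushed away from the other patents in the batch. This is a minimal pure-Python sketch under assumptions: the source does not state the exact loss, the similarity function, or the temperature, so `info_nce`, cosine similarity, and the value `0.05` are illustrative choices rather than the authors' implementation.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(
    anchors: List[Sequence[float]],
    views: List[Sequence[float]],
    temperature: float = 0.05,  # assumed value, not taken from the source
) -> float:
    """Mean in-batch InfoNCE loss.

    views[i] is the positive for anchors[i] (another section of the same
    patent); every other view in the batch acts as a negative.
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, view) / temperature for view in views]
        # Numerically stable log-sum-exp over all candidates in the batch.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)


# Toy 2-d embeddings: each row stands for one patent section.
section_a = [[1.0, 0.0], [0.0, 1.0]]
section_b_aligned = [[0.9, 0.1], [0.1, 0.9]]  # same patents, same order
section_b_swapped = [[0.1, 0.9], [0.9, 0.1]]  # positives paired with the wrong patent

loss_aligned = info_nce(section_a, section_b_aligned)
loss_swapped = info_nce(section_a, section_b_swapped)
```

In this toy batch the loss is low when each section embedding sits closest to its own patent's other section, and high when the pairing is swapped.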
Code & Models
- 🤗 ZoeYou/PatentMap-V0-SecPair-Claim (model) · 450 dl · ♡ 4
- 🤗 ZoeYou/patentmapv0-models (model) · ♡ 2
- 🤗 ZoeYou/PatentMap-V0-SecPair-Summary (model)
- 🤗 ZoeYou/PatentMap-V0-SecPair-Background (model)
- 🤗 ZoeYou/PatentMap-V0-SecPair-Drawing (model) · 3 dl
- 🤗 ZoeYou/PatentMap-V0-SecPair-Description (model)
- 🤗 ZoeYou/PatentMap-V0-SecPair-ClaimSummary (model)
- 🤗 ZoeYou/PatentMap-V0-SecPair-ClaimBackground (model)
- 🤗 ZoeYou/PatentMap-V0-SecPair-ClaimDrawing (model)
- 🤗 ZoeYou/PatentMap-V0-SecPair-ClaimDescription (model) · 3 dl
Taxonomy
Topics: Intellectual Property and Patents · Advanced Graph Neural Networks · Machine Learning in Materials Science
