DesignCLIP: Multimodal Learning with CLIP for Design Patent Understanding

Zhu Wang; Homaira Huda Shomee; Sathya N. Ravi; Sourav Medya

arXiv:2508.15297 · cs.CV · August 22, 2025

TL;DR

DesignCLIP leverages CLIP-based multimodal learning to improve design patent classification and retrieval by incorporating detailed captions and multi-view image analysis, outperforming existing models.

Contribution

This work introduces DesignCLIP, a novel multimodal framework using CLIP for design patent understanding, with class-aware classification and contrastive learning tailored for patent data.
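
For readers unfamiliar with the contrastive objective that CLIP-style models are trained with, the following is a minimal sketch of the standard symmetric image-text loss. It shows generic CLIP, not DesignCLIP's class-aware variant, whose details are not given on this page. The function names, the toy embeddings, and the fixed temperature of 0.07 (CLIP's usual initial value) are illustrative assumptions.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def log_softmax(row):
    """Numerically stable log-softmax over a list of logits."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]

def clip_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric cross-entropy over an image-text similarity matrix.

    Assumption: image_embs[k] and text_embs[k] form the k-th matched pair;
    every other combination in the batch is treated as a negative.
    """
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    n = len(imgs)
    # logits[i][j] = cosine similarity of image i and text j, divided by temperature
    logits = [[sum(a * b for a, b in zip(i, t)) / temperature for t in txts]
              for i in imgs]
    # Image-to-text direction: each row should peak on its diagonal entry.
    loss_i2t = -sum(log_softmax(logits[k])[k] for k in range(n)) / n
    # Text-to-image direction: same idea over the columns.
    cols = [[logits[r][c] for r in range(n)] for c in range(n)]
    loss_t2i = -sum(log_softmax(cols[k])[k] for k in range(n)) / n
    return (loss_i2t + loss_t2i) / 2

# Toy batch of two pairs: matched pairs give a lower loss than swapped ones.
matched = clip_contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = clip_contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In practice the embeddings would come from CLIP's image and text encoders, and the temperature would be a learned parameter rather than a constant.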

Findings

01 Outperforms baseline and SOTA models in patent tasks

02 Effective in patent classification and retrieval

03 Enhances multimodal patent analysis

Abstract

In the field of design patent analysis, traditional tasks such as patent classification and patent image retrieval depend heavily on image data. However, patent images -- typically consisting of sketches with abstract and structural elements of an invention -- often fall short in conveying comprehensive visual context and semantic information. This inadequacy can lead to ambiguities in evaluation during prior art searches. Recent advancements in vision-language models, such as CLIP, offer promising opportunities for more reliable and accurate AI-driven patent analysis. In this work, we leverage CLIP models to develop a unified framework, DesignCLIP, for design patent applications with a large-scale dataset of U.S. design patents. To address the unique characteristics of patent data, DesignCLIP incorporates class-aware classification and contrastive learning, utilizing generated…

Taxonomy

Topics: Intellectual Property and Patents · Machine Learning in Materials Science · Advanced Graph Neural Networks