Content-Based Search for Deep Generative Models

Daohan Lu; Sheng-Yu Wang; Nupur Kumari; Rohan Agarwal; Mia Tang; David Bau; Jun-Yan Zhu

arXiv:2210.03116 · cs.CV · October 25, 2023

TL;DR

This paper introduces content-based search for deep generative models: given a query in one of several modalities (image, sketch, or text), retrieve the models most likely to generate matching content. It casts retrieval as an optimization over models, adds a contrastive learning framework that adapts features to each query modality, and outperforms baselines on a new benchmark.

Contribution

The paper proposes a novel content-based model search framework for generative models, including a probabilistic formulation and a contrastive learning approach for multi-modal queries.

Findings

01. Outperforms baseline methods on the Generative Model Zoo benchmark.
02. Effective retrieval across image, sketch, and text modalities.
03. Introduces a new benchmark dataset for model retrieval tasks.

Abstract

The growing proliferation of customized and pretrained generative models has made it infeasible for a user to be fully cognizant of every model in existence. To address this need, we introduce the task of content-based model search: given a query and a large set of generative models, finding the models that best match the query. As each generative model produces a distribution of images, we formulate the search task as an optimization problem to select the model with the highest probability of generating similar content as the query. We introduce a formulation to approximate this probability given the query from different modalities, e.g., image, sketch, and text. Furthermore, we propose a contrastive learning framework for model retrieval, which learns to adapt features for various query modalities. We demonstrate that our method outperforms several baselines on Generative Model Zoo, a…
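The retrieval idea in the abstract (score every model by how likely it is to produce content similar to the query, then return the best-scoring models) can be illustrated with a minimal sketch. This is not the paper's formulation or its contrastive framework: it substitutes mean cosine similarity between a query feature and features of each model's samples as a crude stand-in for the probability estimate. The function names, the feature dimension, and the synthetic "models" below are all hypothetical; in practice the features would come from a shared embedding network applied to the query and to images sampled from each generative model.

```python
import numpy as np

def normalize(x):
    # Scale vectors to unit length along the last axis.
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def score_model(query_feat, sample_feats):
    # Stand-in score (an assumption, not the paper's estimator):
    # mean cosine similarity between the query and the model's samples.
    q = normalize(query_feat)
    s = normalize(sample_feats)
    return float(np.mean(s @ q))

def rank_models(query_feat, model_samples):
    # Score every model and sort from best to worst match.
    scores = {name: score_model(query_feat, feats)
              for name, feats in model_samples.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Synthetic stand-ins: each hypothetical "model" emits features
# clustered around its own center in a 16-dimensional space.
rng = np.random.default_rng(0)
dim = 16
centers = {name: rng.normal(size=dim)
           for name in ("faces", "cars", "churches")}
model_samples = {name: c + 0.1 * rng.normal(size=(200, dim))
                 for name, c in centers.items()}

# A query feature drawn near the "cars" center.
query = centers["cars"] + 0.1 * rng.normal(size=dim)

ranking = rank_models(query, model_samples)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Because the query is drawn next to the "cars" cluster, that model ranks first. A real system would precompute and cache per-model sample features (or summary statistics of them) so that each query costs one pass over the model collection rather than fresh sampling.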

Code & Models

Repositories

generative-intelligence-lab/modelverse
PyTorch · Official
