Implicitly Guided Design with PropEn: Match your Data to Follow the Gradient

Nataša Tagasovska; Vladimir Gligorijević; Kyunghyun Cho; Andreas Loukas

arXiv:2405.18075·cs.LG·May 29, 2024

TL;DR

PropEn is a framework that uses implicit guidance through sample matching to optimize designs in data-scarce scientific domains, without requiring large datasets or a trained discriminator.

Contribution

The paper introduces PropEn, a domain-agnostic, matching-based generative framework that approximates gradients for property optimization without requiring extensive data or a discriminator model.

Findings

01. PropEn effectively guides design optimization in scientific applications.

02. It outperforms baseline methods in toy and real-world tasks.

03. Protein design results are validated with wet lab experiments.

Abstract

Across scientific domains, generating new models or optimizing existing ones while meeting specific criteria is crucial. Traditional machine learning frameworks for guided design use a generative model and a surrogate model (discriminator), requiring large datasets. However, real-world scientific applications often have limited data and complex landscapes, making data-hungry models inefficient or impractical. We propose a new framework, PropEn, inspired by "matching", which enables implicit guidance without training a discriminator. By matching each sample with a similar one that has a better property value, we create a larger training dataset that inherently indicates the direction of improvement. Matching, combined with an encoder-decoder architecture, forms a domain-agnostic generative framework for property enhancement. We show that training with a matched dataset approximates the…
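The matching step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `match_pairs`, the thresholds `max_dist` and `min_gain`, the use of Euclidean distance as the similarity measure, and the toy property are all assumptions made for illustration. The sketch pairs each sample with every nearby sample that has a strictly better property value, so that each (input, target) pair points in a direction of improvement.

```python
import numpy as np


def match_pairs(X, y, max_dist, min_gain=0.0):
    """Return index pairs (i, j) where sample j is close to sample i
    and has a property value better than y[i] by more than min_gain.

    Hypothetical helper: the thresholds and the Euclidean metric are
    illustrative assumptions, not taken from the paper.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Pairwise Euclidean distances, shape (n, n).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # gain[i, j] = y[j] - y[i]; positive means j improves on i.
    gain = y[None, :] - y[:, None]
    mask = (dist <= max_dist) & (gain > min_gain)
    src, dst = np.nonzero(mask)
    return np.stack([src, dst], axis=1)


# Toy data (assumed): 2-D points whose property peaks at the origin.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = -np.sum(X ** 2, axis=1)

pairs = match_pairs(X, y, max_dist=0.5)
inputs, targets = X[pairs[:, 0]], X[pairs[:, 1]]
```

Under these assumptions, `inputs` and `targets` would form the matched dataset on which an encoder-decoder is trained to map each sample to a similar, improved one. Because one sample can be matched with several neighbours, the number of pairs can exceed the number of original samples.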

Videos

Implicitly Guided Design with PropEn: Match your Data to Follow the Gradient · SlidesLive

Taxonomy

Topics: Machine Learning and Data Classification · Manufacturing Process and Optimization