Controllable Data Generation Via Iterative Data-Property Mutual Mappings
Bo Pan, Muran Qin, Shiyu Wang, Yifei Zhang, Liang Zhao

TL;DR
This paper introduces a semi-supervised framework for controllable data generation using VAEs, enabling precise property control, disentanglement, and out-of-distribution property management with efficient training.
Contribution
It proposes a novel iterative mutual mapping approach to enhance VAE-based generators for controllability and disentanglement, applicable to both seen and unseen data.
Findings
Improved property control accuracy
Enhanced disentanglement of latent variables
Faster training convergence
Abstract
Deep generative models have been widely used for their ability to generate realistic data samples in various areas, such as images, molecules, text, and speech. One major goal of data generation is controllability, namely to generate new data with desired properties. Despite growing interest in the area of controllable generation, significant challenges still remain, including 1) disentangling desired properties with unrelated latent variables, 2) out-of-distribution property control, and 3) objective optimization for out-of-distribution property control. To address these challenges, in this paper, we propose a general framework to enhance VAE-based data generators with property controllability and ensure disentanglement. Our proposed objective can be optimized on both data seen and unseen in the training set. We propose a training procedure to train the objective in a semi-supervised…
Peer Reviews
Decision: ICLR 2024 Conference Withdrawn Submission
Disentanglement within the framework appears to be somewhat effective, demonstrating the model's ability to separate and control specific properties or features in the generated data in certain instances. However, the degree to which this disentanglement consistently holds or the specific conditions under which it succeeds remain topics of interest for further investigation. Despite the framework's potential, it is notably challenging to identify and pinpoint any consistently meaningful strengt
The significance of this work becomes apparent when considering it in the context of state-of-the-art generative models. There is a clear need for this investigation to ascertain how the proposed framework contributes to the field and whether it offers substantial improvements or unique capabilities compared to existing models. The presentation of the work, unfortunately, poses challenges in terms of its clarity and comprehensibility. It is evident that the way the research findings and methodo
Strengths: 1. The concept of generating output with out-of-distribution properties has a broad range of applications and can be utilized to create previously unseen data for various downstream tasks. 2. The paper is well-written and easy to follow. It provides a thorough assessment of the limitations in prior works and demonstrates how the proposed method tackles these shortcomings. 3. The authors introduced a novel loss function to guarantee disentanglement among the desired properties.
Major Comments: 1. The authors propose generating the latent vector (w) associated with a given property from both the sampled latent factor (z) and the desired property (y). However, the rationale for incorporating (z) in this mapping function remains unclear, particularly given that prior work [1] in the field has used only y. The authors should explain why both z and y are considered in the mapping function and elucidate their impact on the
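The reviewer's question about z can be made concrete with a toy comparison. The sketch below is only an illustration, not the authors' architecture: the random linear maps, the dimensions, and the names `map_y_only` and `map_z_and_y` are all hypothetical stand-ins for learned networks. It shows the one structural difference at issue: a mapping w = f(y) gives a single w per property value, while w = f(z, y) lets the same y yield different w for different z.

```python
import random

def make_linear(in_dim, out_dim, rng):
    """Random weight rows; a hypothetical stand-in for a learned mapping network."""
    return [[rng.gauss(0.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]

def apply_linear(weights, vec):
    """Matrix-vector product using plain Python lists."""
    return [sum(w_i * v_i for w_i, v_i in zip(row, vec)) for row in weights]

rng = random.Random(0)
Z_DIM, Y_DIM, W_DIM = 4, 1, 2  # assumed toy dimensions

W_y = make_linear(Y_DIM, W_DIM, rng)           # property-only mapping: w = f(y)
W_zy = make_linear(Z_DIM + Y_DIM, W_DIM, rng)  # joint mapping: w = f(z, y)

def map_y_only(y):
    return apply_linear(W_y, y)

def map_z_and_y(z, y):
    # Concatenate z and y before applying the joint mapping.
    return apply_linear(W_zy, z + y)

y = [0.7]                                       # one desired property value
z_a = [rng.gauss(0.0, 1.0) for _ in range(Z_DIM)]
z_b = [rng.gauss(0.0, 1.0) for _ in range(Z_DIM)]

# With y alone, the property code w is fully determined by y.
same_w_without_z = map_y_only(y) == map_y_only(y)
# With z included, two samples sharing the same y get different w.
same_w_with_z = map_z_and_y(z_a, y) == map_z_and_y(z_b, y)
```

Whether this extra dependence on z helps or harms disentanglement is exactly what the reviewer asks the authors to justify; the toy example takes no position on that.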
S1. **Innovative Approach:** The framework enhances VAE-based generators with better property controllability and ensures superior disentanglement, offering a new perspective on data generation. S2. **Versatility:** The proposed framework is shown to be applicable in different VAE models and multiple domains, from images to molecules, suggesting its broad utility. S3. **Supporting Out-of-Distribution Property Control:** Through designing new objective functions and optimization strategies, t
W1. **Increased Complexity:** The proposed approach, while comprehensive, seems to add multiple components and constraints to the training process. This complexity might make it difficult for practitioners to adapt and integrate the framework into existing pipelines. Furthermore, the merger of control parameters with generative models could complicate model design, training, and deployment. W2. **Scalability Concerns:** The iterative training procedure, while innovative, might raise questions
- This paper is overall well-written and easy to follow. Figure 2 provides much information for understanding the overall proposed framework.
- The proposed method has a plug-and-play property for any VAE framework.
- The experimental results show that the proposed method consistently improves various VAE frameworks.
- This paper has a severe problem in deriving the training objective $\mathcal{L}_2$. Here, $\mathcal{L}_2 = \mathbb{E}_q[-\log p(y|x)] = \int - q(z, w|x, y) \log p(y|x) \, dz \, dw = - \log p(Y=y|X=x)$, where $x \in \mathcal{X}_1$ and $y \in \mathcal{Y}_1$ are real samples, not generated ones. The term $-\log p(Y=y|X=x)$ is then just a constant, since it does not depend on any of the parameters $\theta, \phi, \gamma$.
- For the above reason, the proposed generative model does not maximize the
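The reviewer's argument about $\mathcal{L}_2$ can be written out step by step. This restates the review's reasoning under its own reading, namely that $p(y|x)$ is the true conditional of the real pair $(x, y)$ and carries no trainable parameters; it is not a derivation taken from the paper. Because $\log p(y|x)$ does not depend on $z$ or $w$, it factors out of the integral, and $q(z, w|x, y)$ integrates to one:

$$
\begin{aligned}
\mathcal{L}_2 &= \mathbb{E}_{q(z, w|x, y)}\left[-\log p(y|x)\right] \\
&= -\log p(y|x) \int q(z, w|x, y) \, dz \, dw \\
&= -\log p(Y=y|X=x).
\end{aligned}
$$

Under that reading, the gradient of $\mathcal{L}_2$ with respect to $\theta, \phi, \gamma$ is zero, which is the basis for the reviewer's objection.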
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · AI in cancer detection · Machine Learning and Data Classification
