Controlling the Latent Space of GANs through Reinforcement Learning: A Case Study on Task-based Image-to-Image Translation

Mahyar Abbasian; Taha Rajabzadeh; Ahmadreza Moradipari; Seyed Amir Hossein Aqajari; Hongsheng Lu; Amir Rahmani

arXiv:2307.13978·cs.LG·July 27, 2023

TL;DR

This paper introduces a novel method that combines reinforcement learning with GANs to control the generation process, demonstrated through task-based image-to-image translation on the MNIST dataset.

Contribution

It integrates an actor-critic RL agent with a latent-space GAN (l-GAN) so that generation can be steered toward outputs that satisfy a specified task.

Findings

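01. The RL agent learns to navigate the l-GAN latent space toward outputs that match a specified task.

02. The method performs task-based image-to-image translation on MNIST, with arithmetic addition as the illustrative task.

03. The results suggest that RL can improve control over the outputs of generative networks.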

Abstract

Generative Adversarial Networks (GANs) have emerged as a formidable AI tool to generate realistic outputs based on training datasets. However, exerting control over the generation process of GANs remains a significant hurdle. In this paper, we propose a novel methodology to address this issue by integrating a reinforcement learning (RL) agent with a latent-space GAN (l-GAN), thereby facilitating the generation of desired outputs. More specifically, we have developed an actor-critic RL agent with a meticulously designed reward policy, enabling it to navigate the latent space of the l-GAN and generate outputs based on specified tasks. To substantiate the efficacy of our approach, we have conducted a series of experiments on the MNIST dataset, including arithmetic addition as an illustrative task. The outcomes of these experiments serve to…
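The idea of rewarding an agent for steering a frozen generator toward a task-defined output can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the generator and classifier are hypothetical stand-ins (`toy_generator`, `toy_classifier`), the reward is assumed to be the classifier's probability for the target digit, the addition task is restricted to single-digit sums, and the actor-critic agent is replaced by naive random-search hill climbing to keep the example short.

```python
import math
import random

LATENT_DIM = 8      # hypothetical latent size, not from the paper
NUM_CLASSES = 10    # MNIST digits 0-9

_rng = random.Random(0)
# Hypothetical stand-in: one latent "prototype" per digit class.
PROTOTYPES = [[_rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]
              for _ in range(NUM_CLASSES)]


def toy_generator(z):
    """Stand-in for a frozen generator: latent vector -> 'image' features."""
    return [math.tanh(x) for x in z]


def toy_classifier(image):
    """Stand-in digit classifier: softmax over negative squared distances."""
    logits = []
    for proto in PROTOTYPES:
        ref = toy_generator(proto)
        logits.append(-sum((a - b) ** 2 for a, b in zip(image, ref)))
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - peak) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


def addition_target(a, b):
    """Task label for addition; this sketch only handles single-digit sums."""
    if not (0 <= a and 0 <= b and a + b < NUM_CLASSES):
        raise ValueError("sketch assumes 0 <= a, b and a + b <= 9")
    return a + b


def reward(z, target):
    """Assumed reward: probability that the generated image shows `target`."""
    return toy_classifier(toy_generator(z))[target]


def hill_climb(target, steps=200, step_size=0.3, seed=1):
    """Naive random search standing in for the RL agent's policy."""
    rng = random.Random(seed)
    z = [0.0] * LATENT_DIM
    best = reward(z, target)
    for _ in range(steps):
        candidate = [x + rng.gauss(0.0, step_size) for x in z]
        r = reward(candidate, target)
        if r > best:  # keep only moves that raise the reward
            z, best = candidate, r
    return z, best


target = addition_target(3, 4)
start_reward = reward([0.0] * LATENT_DIM, target)
z_final, final_reward = hill_climb(target)
print(f"target={target} start={start_reward:.3f} final={final_reward:.3f}")
```

In this sketch the generator stays fixed and only the latent vector moves, which mirrors the separation between a trained generative model and a controller acting on its latent space. A learned policy would replace `hill_climb` and map task inputs to latent-space actions directly.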

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning