Example of Photorealistic Human Faces Generated by a GAN (source: T. Karras 2018)
This blog series is still under development.
What are Generative Adversarial Nets (GANs)?
Generative adversarial nets (GANs) are an emerging class of deep learning algorithms that generate highly realistic images (Brownlee 2019). For example, we can use GANs to generate pictures of people who have never existed (Karras, et al. 2017), or to make someone look younger or older (Antipov, Baccouche and Dugelay 2017). We can also use GANs to turn a low-res video into a high-res video (Acharya, et al. 2018). GANs are on the path to transforming image editing and, more broadly, media and entertainment. In recent research, GANs have been widely adopted in various I2I translation tasks and have produced many interesting applications.
About This Series: I2I translation
In this series on image-to-image (I2I) translation, I am going to explore work on the I2I translation problem, including GANs, VAEs and cycle consistency, and examine three proposed frameworks that have laid the groundwork for the current direction of development:
- CGAN (Isola, et al. 2018) introduces a generic framework for I2I translation and extends GANs to condition on an input image and produce a corresponding output image;
- The UNIT framework (Liu, Breuel and Kautz 2018) proposes an unsupervised framework built on a VAE-GAN, which makes a shared latent space assumption and shares weights between the networks of the two domains;
- StarGAN (Choi, et al. 2018) extends the previous work to multi-domain I2I translation by introducing a shared generator for the sake of scalable and inter-convertible translation.
I hope this series shares what I have learnt in my self-study of an image processing course, explains the working principles of the proposed frameworks and compares their objective functions. I will first cover the background research that supports the study of these papers. Next, I will focus on the objective function and the architecture of each framework. From a preliminary study, GANs and VAEs are the key underlying frameworks in this research. It is crucial to understand this background, which forms the foundation of the later research and is the source of inspiration for the proposed frameworks. The design of the objective function is believed to be the key to achieving the targeted results.
Are All Computer Vision Problems Indeed I2I Translation Problems?
The I2I translation problem is to convert images from a source domain to a target domain. Many computer vision problems can be posed as I2I translation problems. For example, super-resolution maps a low-res domain to a high-res domain, image segmentation maps realistic photos to colour segments (Qui, et al. 2020), and colorization maps grey-scale images to the colour image domain (Fu, Hsu and Yang 2017). Another example is changing the species of a cat in an image from one to another. In fact, some I2I frameworks, such as pix2pix proposed in CGAN, are available off the shelf in modern machine learning toolchains (Pix2Pix | TensorFlow Core n.d.).
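To make the "source domain to target domain" framing concrete, here is a minimal sketch of how paired training examples for two of the tasks above could be built from a single colour photo. It is only an illustration under my own assumptions: the helper names (`to_greyscale`, `downsample`, `make_pair`) are hypothetical and do not come from pix2pix or TensorFlow, images are plain nested lists rather than tensors, and block averaging is just a crude stand-in for a real low-res image.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]       # (R, G, B), each channel in 0..255
ColourImage = List[List[Pixel]]    # rows of RGB pixels
GreyImage = List[List[int]]        # rows of intensities in 0..255


def to_greyscale(image: ColourImage) -> GreyImage:
    """Collapse RGB to one channel using the ITU-R BT.601 luma weights."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in image]


def downsample(image: ColourImage, factor: int = 2) -> ColourImage:
    """Average non-overlapping factor x factor blocks (assumed low-res proxy)."""
    height, width = len(image), len(image[0])
    out = []
    # Rows and columns that do not fill a whole block are dropped.
    for top in range(0, height - height % factor, factor):
        row = []
        for left in range(0, width - width % factor, factor):
            block = [image[top + dy][left + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(tuple(sum(p[c] for p in block) // len(block)
                             for c in range(3)))
        out.append(row)
    return out


def make_pair(image: ColourImage, task: str):
    """Return a (source, target) pair for a named I2I task (hypothetical names)."""
    if task == "colorization":
        # Source domain: grey-scale; target domain: the original colour image.
        return to_greyscale(image), image
    if task == "super_resolution":
        # Source domain: low-res; target domain: the original high-res image.
        return downsample(image), image
    raise ValueError(f"unknown task: {task}")
```

In both cases the target is the untouched photo and only the source changes, which is why a single generic I2I framework can, in principle, be trained on either task by swapping the dataset of pairs.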