CycleGAN

Paper

https://arxiv.org/abs/1703.10593

Introducation

Both Pixel2pixel and CycleGAN are models to transfer one class of images to the onther class. For example, transfering horse to zebra, coloring black/white photo.

Pixel2pixelhas a major drawback:

  • The images from two classes must be one-to-one paired.
  • The user need to pair them by themself.
  • Sometime, it’s simple - transfer color photos to grey photos.
  • But, in most situations, it’s not possible - no way to pair horse and zebra photos.

CycleGAN is the upgraded version of pixel2pixel model. CycleGAN does not need to pair images of two classes, its input is just two set of unpaired images of two classes.

What it can/cannot do

What the CycleGAN/Pixel2pixel does is that tranfer the texture of object to the other class, and keep the object in the same shape of the original object.

  • So, you can tranfer horse to zebra (same shape but different texture)
  • But, you cannont tranfer cat to dog (same texture but different shape)

Ideas

The idea is easy to understand.

  • There are two classes of images, $A$ and $B$.
  • We create two generators $G_A$ and $G_B$. $G_A$ will transfer $A$ to $B$, and similarly, $G_B: B \rightarrow A$.
  • And also, we have two discriminators $D_A$ and $D_B$. $D_A/D_B$ can check if the input image is a real image of $A/B$ or not.

The trainning target for $D_A$ and $D_B$:

  • $D_A(a) = 1.0, D_B(b) = 1.0$
  • $D_A(G_B(b)) = 0.0, D_B(G_A(a)) = 0.0$

The trainning target for $G_A$ and $G_B$:

  • $D_A(G_B(b)) = 1.0, D_B(G_A(a)) = 1.0$
  • $G_A(b) = b, G_B(a) = a$.
  • $G_B(G_A(a)) = a, G_A(G_B(b)) = b$

The rest parts are picking which loss function for each target, and whhich optimitier you want to use.

Code

my mini impl