Lecture 22 | (1/2) Generative Ad
https://www.youtube.com/watch?v=1irBjosv_kE&list=PLp-0K3kfddPzNdZPX4p0lVi6AcDXBofuf&index=22
(old)
change the facial expressions; understand the img
1 img compared with another img -> L1/L2 loss ...
100 imgs compared with another 100 imgs -> discriminator
it's differentiable
from D's perspective:
E log D(x) is the log likelihood of real data
E log (1 - D(G(z))) is the log likelihood of the fake data being judged as fake
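A minimal sketch of the two terms above, estimated from a batch of discriminator outputs. The function name and the sample probabilities are made up for illustration; a real setup would get them from a trained D.

```python
import math

def discriminator_objective(d_real, d_fake):
    """Batch estimate of E log D(x) + E log(1 - D(G(z))).

    d_real: D's outputs (probabilities in (0, 1)) on real samples.
    d_fake: D's outputs on generated samples G(z).
    D tries to make this value as large as possible.
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

# Hypothetical outputs: a D that separates real from fake well ...
confident = discriminator_objective([0.9, 0.8], [0.1, 0.2])
# ... and a D that outputs 0.5 everywhere (cannot tell them apart).
confused = discriminator_objective([0.5, 0.5], [0.5, 0.5])
```

When D outputs 0.5 everywhere the value is -2 log 2, which is lower than the value for the confident D; G is trained to push D toward that point.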
middle point
after more training process
all 5 -> wrong; not confident enough -> wrong
should not include label information in training GAN
given Y; add Y to the discriminator and generator
single GAN capable of
instead of generating random numbers, conditional GAN can generate numbers with the same label
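One common way to "add Y" is to append a one-hot label to the inputs of both networks. The sketch below only builds those inputs; the helper names, the noise size, and the 10-class setting are assumptions for illustration, not the lecture's exact architecture.

```python
import random

def one_hot(label, num_classes):
    """Encode an integer label as a one-hot list."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def generator_input(label, noise_dim=4, num_classes=10, rng=random):
    """Input to G: random noise z concatenated with the label Y."""
    z = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    return z + one_hot(label, num_classes)

def discriminator_input(x, label, num_classes=10):
    """Input to D: the (flattened) sample x concatenated with the same label Y."""
    return list(x) + one_hot(label, num_classes)

g_in = generator_input(3)                    # noise changes, label stays 3
d_in = discriminator_input([0.1, 0.2], 3)    # toy 2-pixel "image" with label 3
```

Because D also sees Y, a generated digit that looks real but does not match the requested label can still be rejected.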
4 GANs for different scales: 1 GAN and 3 conditional GANs; the output of the previous GAN is the condition of the current GAN
make problem simpler (steps), rather than e2e, then it's simpler to train
this should be just the generator part
GAN just provides a loss function. Any way of producing the image is acceptable.
GAN for classification
V -> vanilla GAN loss
V_I -> info GAN's loss
I -> mutual information -> do the GAN, but also be able to recover
going back and have a L2 loss -> a little similar to VAE
info GAN -> L2 between original encoding and the new encoding
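Written out, the relation between V, V_I and I looks like the following. Here λ, Q and μ_Q are notation from the usual InfoGAN formulation, not from these notes, and the last line assumes a continuous code c with a fixed-variance Gaussian Q; under that assumption the mutual-information term reduces to the L2 loss between the original code and the recovered code.

```latex
\min_{G,Q}\max_{D}\; V_I(D,G,Q) = V(D,G) - \lambda\, L_I(G,Q)

L_I(G,Q) = \mathbb{E}_{c\sim p(c),\,x\sim G(z,c)}\big[\log Q(c\mid x)\big] + H(c)
         \;\le\; I\big(c;\,G(z,c)\big)

% assumption: Q(c|x) = N(\mu_Q(x), \sigma^2 I) with fixed \sigma
-\log Q(c\mid x) = \frac{1}{2\sigma^2}\,\lVert c-\mu_Q(x)\rVert_2^2 + \text{const}
```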
hidden -> output -> hidden
demand the decoding to have a meaning
what is partial encoder???
each hidden layer element has a meaning
eg. one dim -> rotation
another -> bold
but why??? why restrain the L2 loss of c can make the hidden layer meaningful???
VAE: encode -> decode, and the encoding obeys a Gaussian
AAE:
force the hidden layer z's distribution q(z|x) to be a gaussian?
then we can resample from a Gaussian in the hidden layer, and decode it to generate images
AAE has a GAN way (discriminator) to decrease the KL divergence of the prior p(z) and the q(z|x)
AAE -> approaching the prior sharply
VAE -> try but not follow the prior exactly
(discriminator vs KL divergence)
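For the KL side of this comparison, a small sketch of the closed-form term a VAE typically uses, assuming a diagonal Gaussian q(z|x) and a standard normal prior p(z). The function name is hypothetical; an AAE would drop this term and use a discriminator on z instead.

```python
import math

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dims."""
    return 0.5 * sum(
        m * m + math.exp(lv) - 1.0 - lv
        for m, lv in zip(mu, log_var)
    )

# Encoding that already matches the prior: no penalty.
kl_match = kl_to_standard_normal([0.0, 0.0], [0.0, 0.0])
# Mean shifted by 1 in a single dimension: penalty of 0.5.
kl_shift = kl_to_standard_normal([1.0], [0.0])
```

This penalty is computed per sample and only pulls each q(z|x) toward the prior, which may be one reason the VAE is described here as trying but not following the prior exactly.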
discriminator now receives two sources of information: one is X (high dimensional data) and one is Z (latent space value)
translate between images
why is a GAN required to do this?
why not just an AE?
because there's no horses and zebras in the same position!!!
there's just a distribution of horse pictures and zebras.
why are there so few people?????