Lecture 22 | (1/2) Generative Ad

2019-10-29 Ysgc

https://www.youtube.com/watch?v=1irBjosv_kE&list=PLp-0K3kfddPzNdZPX4p0lVi6AcDXBofuf&index=22
(old)

change the facial expressions; understand the img

1 img compared with another img -> L1/L2 loss ...
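A minimal sketch of the per-image comparison mentioned above, assuming images are given as flat lists of pixel values and that the mean is taken over pixels (the function names are made up for illustration):

```python
def l1_loss(img_a, img_b):
    """Mean absolute pixel difference between two equally sized images."""
    return sum(abs(a - b) for a, b in zip(img_a, img_b)) / len(img_a)

def l2_loss(img_a, img_b):
    """Mean squared pixel difference between two equally sized images."""
    return sum((a - b) ** 2 for a, b in zip(img_a, img_b)) / len(img_a)

# Two tiny 2-pixel "images" that differ by 2.0 in the first pixel only.
img_a = [0.0, 0.0]
img_b = [2.0, 0.0]
l1 = l1_loss(img_a, img_b)  # averages |2.0| and |0.0|
l2 = l2_loss(img_a, img_b)  # averages 2.0**2 and 0.0**2
```

Both losses need a specific target image for each output, which is what a discriminator over whole sets of images does away with.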

100 imgs compared with another 100 imgs -> discriminator

it's differentiable

from D's perspective:
E log D(x) is the log likelihood of the real data being classified as real
E log (1 - D(G(z))) is the log likelihood of the fake data being classified as fake
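The two terms can be estimated on a batch as a rough sketch. It assumes D outputs a probability in (0, 1) for each sample; `discriminator_value` and the `eps` clipping are illustrative choices, not something taken from the lecture:

```python
import math

def discriminator_value(d_real, d_fake, eps=1e-12):
    """Batch estimate of E[log D(x)] + E[log(1 - D(G(z)))].

    d_real: D's outputs on real samples x.
    d_fake: D's outputs on generated samples G(z).
    """
    def clip(p):
        # Keep probabilities away from 0 and 1 so log() stays finite.
        return min(max(p, eps), 1.0 - eps)

    real_term = sum(math.log(clip(p)) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - clip(p)) for p in d_fake) / len(d_fake)
    return real_term + fake_term

# D wants this value to be large (its maximum is 0).
confident = discriminator_value([0.99, 0.98], [0.01, 0.02])
# A D that outputs 0.5 for everything cannot tell real from fake.
confused = discriminator_value([0.5, 0.5], [0.5, 0.5])
```

D is trained to push this value up, while G is trained to push the second term down.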

middle point

after more training process

all 5 -> wrong; not confident enough -> wrong

should not include label information in training GAN

given Y; add Y to the discriminator and generator

single GAN capable of

instead of generating random numbers, conditional GAN can generate numbers with the same label
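One common way to "add Y" is to concatenate a one-hot label to the inputs of both networks. The sketch below only builds those inputs. The dimensions, the one-hot encoding and the function names are assumptions for illustration, not details from the lecture:

```python
import random

NUM_CLASSES = 10  # assumed: digit labels 0..9
NOISE_DIM = 4     # assumed, kept tiny for illustration

def one_hot(label, num_classes=NUM_CLASSES):
    """Encode an integer label as a one-hot list."""
    vec = [0.0] * num_classes
    vec[label] = 1.0
    return vec

def generator_input(label, rng):
    """Noise z concatenated with the label: G sees (z, y)."""
    z = [rng.gauss(0.0, 1.0) for _ in range(NOISE_DIM)]
    return z + one_hot(label)

def discriminator_input(x, label):
    """Sample x concatenated with the label: D sees (x, y)."""
    return list(x) + one_hot(label)

rng = random.Random(0)
g_in = generator_input(5, rng)                  # ask G for a "5"
d_in = discriminator_input([0.1, 0.9, 0.3], 5)  # a 3-pixel toy sample
```

Because D also sees the label, a generated sample that looks real but does not match y can still be rejected.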

4 GANs for different scales: 1 GAN and 3 conditional GANs; the output of the previous GAN is the condition of the current GAN

break the problem into simpler steps rather than going end-to-end; then it's simpler to train

this should be only the generator part

GAN just provides a loss function. Any way of producing the image is acceptable.

GAN for classification

V -> vanilla GAN loss
V_I -> info GAN's loss

I -> mutual information -> train the GAN as usual, but also be able to recover the code c from the generated output
going back and have a L2 loss -> a little similar to VAE
info GAN -> L2 between original encoding and the new encoding
hidden -> output -> hidden
demand the decoding to have a meaning
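For reference, the InfoGAN objective as it is usually written, with V the vanilla GAN value function, c the latent code, and λ a weighting hyperparameter. The auxiliary network Q and the fixed variance σ² are notation assumed here, not taken from the lecture slides:

```latex
\min_{G} \max_{D} \; V_I(D, G) = V(D, G) - \lambda \, I\big(c;\, G(z, c)\big)

% I is intractable, so it is replaced by a variational lower bound
% using an auxiliary network Q(c | x) that tries to recover c:
I\big(c;\, G(z, c)\big) \;\ge\; L_I(G, Q)
  = \mathbb{E}_{c \sim p(c),\, x \sim G(z, c)}\big[\log Q(c \mid x)\big] + H(c)

% If Q(c | x) is Gaussian with mean \hat{c}(x) and fixed variance \sigma^2:
\log Q(c \mid x) = -\frac{\lVert c - \hat{c}(x) \rVert^2}{2\sigma^2} + \text{const}
```

Under that Gaussian assumption, maximizing the bound amounts to minimizing the L2 distance between the original code and the recovered code, which matches the "hidden -> output -> hidden" picture.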

what is partial encoder???

each hidden layer element has a meaning

eg. one dim -> rotation
another -> bold

but why??? why would constraining the L2 loss on c make the hidden layer meaningful???

VAE: encode -> decode, and the encoding obeys a Gaussian
AAE:

force the hidden layer z's distribution q(z|x) to be a Gaussian?
then we can sample from a Gaussian in the hidden layer, and decode it to generate images

AAE uses a GAN-style discriminator, instead of a KL divergence term, to push q(z|x) toward the prior p(z)

AAE -> matches the prior closely
VAE -> tries to, but does not follow the prior exactly
(discriminator vs KL divergence)
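For the KL side of this comparison, here is a small sketch of the closed-form term a VAE typically penalizes. It assumes a diagonal Gaussian encoder q(z|x) = N(mu, sigma^2) and a standard normal prior p(z), which is a common choice but is not confirmed by these notes; the function name is made up:

```python
import math

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over dimensions.

    Per dimension: 0.5 * (mu^2 + sigma^2 - 1 - log(sigma^2)).
    """
    return 0.5 * sum(
        m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, log_var)
    )

# The encoding already equals the prior: no penalty.
kl_match = kl_to_standard_normal([0.0, 0.0], [0.0, 0.0])
# The mean is shifted by 1 in one dimension: penalty of 0.5.
kl_shifted = kl_to_standard_normal([1.0], [0.0])
```

An AAE drops this analytic term and instead trains a discriminator to tell encoder outputs from samples of p(z), so the prior only has to be something one can sample from.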

discriminator now receives two sources of information: one is X (high dimensional data) and one is Z (latent space value)

translate between images

why is a GAN required to do this?
why not just an AE?
because there are no horses and zebras in the same position!!!

there's just a distribution of horse pictures and a distribution of zebra pictures.

why are there so few people?????
