VAE | WAE
https://juejin.im/post/598972735188256de4693951
https://cloud.tencent.com/developer/article/1096650
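What we want is to build a generative model, not merely a fuzzy structure that "memorizes" image data. So far, apart from encoding latent vectors from existing images, we have no way of creating such vectors ourselves, so we cannot generate any image from scratch.
The fix is to add a constraint on the encoder network that forces the latent vectors it produces to roughly follow a unit Gaussian distribution.
The loss has two terms: the generation loss, a mean squared error that measures how accurately the network reconstructs the image; and the latent loss, a KL divergence that measures how closely the latent variables match a unit Gaussian.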
generation_loss = mean(square(generated_image - real_image))
latent_loss=KL-Divergence(latent_variable, unit_gaussian)
loss = generation_loss + latent_loss
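To optimize the KL divergence we use a simple reparameterization trick: instead of having the encoder output a vector of real values directly, it outputs a vector of means and a vector of standard deviations.
Note: z_mean and z_stddev are two vectors generated by the encoder network.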
latent_loss = 0.5* tf.reduce_sum(tf.square(z_mean)+tf.square(z_stddev) - tf.log(tf.square(z_stddev))-1,1)
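As a quick sanity check of the closed-form expression above, here is a minimal NumPy sketch of the same formula. The function name `kl_to_unit_gaussian` and the sample values are illustrative assumptions, not part of the original notes.

```python
import numpy as np

def kl_to_unit_gaussian(z_mean, z_stddev):
    """KL( N(z_mean, diag(z_stddev^2)) || N(0, I) ), one value per row.

    Hypothetical helper: mirrors the latent_loss formula term by term.
    """
    z_mean = np.asarray(z_mean, dtype=np.float64)
    z_stddev = np.asarray(z_stddev, dtype=np.float64)
    var = np.square(z_stddev)
    # 0.5 * sum(mean^2 + std^2 - log(std^2) - 1) over the latent dimensions
    return 0.5 * np.sum(np.square(z_mean) + var - np.log(var) - 1.0, axis=1)

# Row 0 is already a unit Gaussian; row 1 has its first mean shifted by 1.
z_mean = np.array([[0.0, 0.0], [1.0, 0.0]])
z_stddev = np.array([[1.0, 1.0], [1.0, 1.0]])
kl = kl_to_unit_gaussian(z_mean, z_stddev)
print(kl)  # row 0 gives 0 (no penalty), row 1 gives 0.5
```

The penalty is zero exactly when the encoder outputs zero mean and unit standard deviation, and it grows as the codings drift away from the unit Gaussian.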
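When computing the decoder's loss, we only need to sample from a standard normal distribution, scale the sample by the standard deviation vector, and add the mean vector to obtain our latent vector: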
samples = tf.random_normal([batchsize,n_z],0,1,dtype=tf.float32)
sampled_z = z_mean + (z_stddev * samples)
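The sampling step above can be checked outside TensorFlow. The following is a small NumPy sketch, assuming made-up values for `z_mean` and `z_stddev`, which shows that the shifted and scaled noise has the requested mean and standard deviation.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

# Hypothetical encoder outputs for a 2-dimensional latent space.
z_mean = np.array([2.0, -1.0])
z_stddev = np.array([0.5, 3.0])

# Draw standard normal noise, then shift and scale it (reparameterization).
n_samples = 100_000
samples = rng.standard_normal((n_samples, z_mean.size))
sampled_z = z_mean + z_stddev * samples

empirical_mean = sampled_z.mean(axis=0)
empirical_std = sampled_z.std(axis=0)
print(empirical_mean, empirical_std)  # close to z_mean and z_stddev
```

Because the randomness lives entirely in `samples`, `sampled_z` is a deterministic function of `z_mean` and `z_stddev`, which is what lets gradients flow back into the encoder.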
Full code:
import sys

import tensorflow as tf  # TensorFlow 1.x API (tf.contrib, tf.placeholder, tf.Session)
from tensorflow.contrib.layers import fully_connected
from tensorflow.examples.tutorials.mnist import input_data

# Assumption: MNIST is loaded with the TF 1.x tutorial helper; the path is arbitrary.
mnist = input_data.read_data_sets("/tmp/data/")

tf.reset_default_graph()

n_inputs = 28 * 28
n_hidden1 = 500
n_hidden2 = 500
n_hidden3 = 20  # codings
n_hidden4 = n_hidden2
n_hidden5 = n_hidden1
n_outputs = n_inputs
learning_rate = 0.001

with tf.contrib.framework.arg_scope(
        [fully_connected],
        activation_fn=tf.nn.relu,
        weights_initializer=tf.contrib.layers.variance_scaling_initializer()):
    X = tf.placeholder(tf.float32, [None, n_inputs])
    # Encoder
    hidden1 = fully_connected(X, n_hidden1)
    hidden2 = fully_connected(hidden1, n_hidden2)
    # Mean and log-variance (gamma) of the codings
    hidden3_mean = fully_connected(hidden2, n_hidden3, activation_fn=None)
    hidden3_gamma = fully_connected(hidden2, n_hidden3, activation_fn=None)
    hidden3_sigma = tf.exp(0.5 * hidden3_gamma)
    # Reparameterization: codings = mean + sigma * standard normal noise
    noise = tf.random_normal(tf.shape(hidden3_sigma), dtype=tf.float32)
    hidden3 = hidden3_mean + hidden3_sigma * noise
    # Decoder
    hidden4 = fully_connected(hidden3, n_hidden4)
    hidden5 = fully_connected(hidden4, n_hidden5)
    logits = fully_connected(hidden5, n_outputs, activation_fn=None)
    outputs = tf.sigmoid(logits)

# Summed (not averaged) so that it is on the same scale as latent_loss,
# which is also a sum; averaging here would let the KL term dominate.
reconstruction_loss = tf.reduce_sum(
    tf.nn.sigmoid_cross_entropy_with_logits(labels=X, logits=logits))
latent_loss = 0.5 * tf.reduce_sum(
    tf.exp(hidden3_gamma) + tf.square(hidden3_mean) - 1 - hidden3_gamma)
cost = reconstruction_loss + latent_loss

optimizer = tf.train.AdamOptimizer(learning_rate)
training_op = optimizer.minimize(cost)
init = tf.global_variables_initializer()
saver = tf.train.Saver()

n_epochs = 50
batch_size = 150

with tf.Session() as sess:
    init.run()
    for epoch in range(n_epochs):
        n_batches = mnist.train.num_examples // batch_size
        for iteration in range(n_batches):
            print("\r{}%".format(100 * iteration // n_batches), end="")
            sys.stdout.flush()
            X_batch, y_batch = mnist.train.next_batch(batch_size)
            sess.run(training_op, feed_dict={X: X_batch})
        # Report the losses on the last batch of the epoch
        loss_val, reconstruction_loss_val, latent_loss_val = sess.run(
            [cost, reconstruction_loss, latent_loss], feed_dict={X: X_batch})
        print("\r{}".format(epoch), "Train total cost:", loss_val,
              "\tReconstruction loss:", reconstruction_loss_val,
              "\tLatent loss:", latent_loss_val)
        saver.save(sess, "./my_model_variational_variant.ckpt")
EM distance (Earth Mover's distance)
https://blog.csdn.net/zhangping1987/article/details/25368183
https://hk.saowen.com/a/e14e3319b2b52da88b7a0fc074b93c153ff17b67e33e9ab07ff298d64d573f7a
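Code:
import numpy as np
import cv2

# p and q are two float32 matrices (signatures): in each row the first column
# is the weight, and the remaining three columns are the feature values
# (histogram or counts).
p = np.asarray([[0.4, 100, 40, 22],
                [0.3, 211, 20, 2],
                [0.2, 32, 190, 150],
                [0.1, 2, 100, 100]], np.float32)
q = np.array([[0.5, 0, 0, 0],
              [0.3, 50, 100, 80],
              [0.2, 255, 255, 255]], np.float32)
# cv2.EMD replaces the legacy cv.fromarray / cv.CalcEMD2 calls of the old `cv`
# module; it returns the distance, a lower bound, and the flow matrix.
emd, lower_bound, flow = cv2.EMD(p, q, cv2.DIST_L2)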
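To build intuition for what the EM distance measures, here is a minimal NumPy sketch of the one-dimensional special case, assuming both distributions sit on the same equally spaced bins. In that case the distance equals the summed absolute gap between the two cumulative distributions. The helper name `emd_1d` is hypothetical, and this is not the general solver that OpenCV uses.

```python
import numpy as np

def emd_1d(p, q, bin_width=1.0):
    """Earth Mover's distance between two histograms on the same 1-D grid.

    Assumes equally spaced bins and equal total mass (hypothetical helper).
    """
    p = np.asarray(p, dtype=np.float64)
    q = np.asarray(q, dtype=np.float64)
    if p.shape != q.shape:
        raise ValueError("p and q must have the same number of bins")
    if not np.isclose(p.sum(), q.sum()):
        raise ValueError("p and q must carry the same total mass")
    # The running surplus of p over q is the mass that must cross each bin edge.
    cdf_gap = np.cumsum(p - q)
    return bin_width * np.sum(np.abs(cdf_gap))

# All mass moves two bins to the right: cost 1.0 * 2 = 2.
d_far = emd_1d([1.0, 0.0, 0.0], [0.0, 0.0, 1.0])
# Every unit of mass moves one bin to the right: cost 1.
d_shift = emd_1d([0.5, 0.5, 0.0], [0.0, 0.5, 0.5])
print(d_far, d_shift)
```

Unlike the KL divergence, the result stays finite and grows smoothly with how far the mass has to travel, even when the two histograms do not overlap at all.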
||f||_L ≤ 1 means that f is a 1-Lipschitz function. A function f is K-Lipschitz if, for some K > 0, ||f(x1) – f(x2)|| ≤ K ||x1 – x2|| holds for all x1 and x2.
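The Lipschitz condition can be probed numerically. The sketch below, which assumes scalar inputs and a hypothetical helper named `lipschitz_lower_bound`, samples random pairs and reports the largest observed ratio |f(x1) – f(x2)| / |x1 – x2|. This is only a lower bound on the smallest valid K, not a proof that a function is K-Lipschitz.

```python
import numpy as np

def lipschitz_lower_bound(f, x1, x2):
    """Largest observed |f(x1) - f(x2)| / |x1 - x2| over the sampled pairs.

    Hypothetical helper for scalar functions; pairs that are almost
    identical are skipped to keep the ratio numerically stable.
    """
    x1 = np.asarray(x1, dtype=np.float64)
    x2 = np.asarray(x2, dtype=np.float64)
    gap = np.abs(x1 - x2)
    keep = gap > 1e-3
    return np.max(np.abs(f(x1[keep]) - f(x2[keep])) / gap[keep])

rng = np.random.default_rng(0)
x1 = rng.uniform(-3.0, 3.0, size=1000)
x2 = rng.uniform(-3.0, 3.0, size=1000)

# tanh has slope at most 1 everywhere, so it is 1-Lipschitz.
k_tanh = lipschitz_lower_bound(np.tanh, x1, x2)
# f(x) = 3x has slope exactly 3, so it is 3-Lipschitz but not 1-Lipschitz.
k_triple = lipschitz_lower_bound(lambda x: 3.0 * x, x1, x2)
print(k_tanh, k_triple)
```

An observed ratio above 1 is enough to show that a function violates the 1-Lipschitz constraint, whereas a ratio below 1 on the samples is merely consistent with it.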
WGAN
http://www.10tiao.com/html/511/201702/2651993903/4.html