TensorFlow | Pitfalls I Ran Into

2018-07-03 shawn233

1 What is the logits argument when computing a loss?

Computing the loss involves two commonly used TensorFlow functions:

tf.nn.sigmoid_cross_entropy_with_logits(labels, logits)
tf.nn.softmax_cross_entropy_with_logits_v2(labels, logits) 
# At the time of writing, softmax_cross_entropy_with_logits was explicitly marked as deprecated and not recommended.

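When I first used TensorFlow, I naively assumed that logits meant the prediction (which I usually call y_hat). It does not. The logits argument is the raw, unnormalized output of the output layer, that is, the value before the final sigmoid or softmax is applied. The loss functions above apply that final step internally. The prediction is usually obtained from the logits as:

y_hat = tf.nn.sigmoid(logits)
y_hat = tf.nn.softmax(logits)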
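The difference between a logit and a prediction can be checked with a small pure-Python sketch. This is not TensorFlow code: the helper names below are made up for illustration, and the stable formula max(x, 0) - x*z + log(1 + exp(-|x|)) is the one the TensorFlow documentation describes for the sigmoid case. It shows that feeding the raw logit gives the textbook cross-entropy, while feeding y_hat by mistake gives a different number.

```python
import math

def sigmoid(x):
    """Logistic function: maps a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_xent_from_logit(label, logit):
    """Stable form of the loss: max(x, 0) - x*z + log(1 + exp(-|x|)).

    Hypothetical stand-in for the TensorFlow op, for one scalar example.
    """
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))

def xent_from_prediction(label, y_hat):
    """Textbook binary cross-entropy on an already-squashed prediction."""
    return -label * math.log(y_hat) - (1.0 - label) * math.log(1.0 - y_hat)

logit = 2.0            # raw output of the last layer
label = 1.0
y_hat = sigmoid(logit) # prediction derived from the logit

# Correct usage: pass the logit; this matches the textbook loss on y_hat.
loss_from_logit = sigmoid_xent_from_logit(label, logit)
loss_from_prediction = xent_from_prediction(label, y_hat)

# The mistake described above: passing y_hat where a logit is expected,
# so the sigmoid is effectively applied twice.
loss_if_y_hat_passed = sigmoid_xent_from_logit(label, y_hat)

print(loss_from_logit, loss_from_prediction, loss_if_y_hat_passed)
```

The first two printed values agree, while the third is noticeably larger.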

2 With sigmoid as the output-layer activation, the loss never drops to 0

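This is most likely because your labels are not exactly 0 or 1, but values strictly between 0 and 1. By the definition of cross-entropy, the loss is 0 only when the prediction y_hat and the label are both 0 or both 1. For a soft label p with 0 < p < 1, the loss is smallest at y_hat = p, where it equals the entropy of the label, -p*log(p) - (1-p)*log(1-p), which is strictly positive. So whenever the labels are not 0 or 1, the loss cannot reach 0.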
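A quick numeric check of this argument, as a pure-Python sketch. The label value 0.7 and the grid search are arbitrary choices for illustration, not part of the original experiment.

```python
import math

def binary_cross_entropy(label, y_hat):
    """Binary cross-entropy between a (possibly soft) label and a prediction."""
    return -label * math.log(y_hat) - (1.0 - label) * math.log(1.0 - y_hat)

soft_label = 0.7

# Scan predictions on a grid in (0, 1) and keep the one with the lowest loss.
candidates = [i / 1000 for i in range(1, 1000)]
best_y_hat = min(candidates, key=lambda y: binary_cross_entropy(soft_label, y))
min_loss = binary_cross_entropy(soft_label, best_y_hat)

# Entropy of the label itself: the floor that the loss cannot go below.
label_entropy = binary_cross_entropy(soft_label, soft_label)

# With a hard label the loss can get arbitrarily close to 0.
hard_label_loss = binary_cross_entropy(1.0, 1.0 - 1e-9)

print(best_y_hat, min_loss, label_entropy, hard_label_loss)
```

The best prediction is the label itself, and the loss there is still about 0.61, while the hard-label loss is nearly 0.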

3 How to use MNIST

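4 Which dimension does tf.nn.softmax_cross_entropy_with_logits_v2 treat as the class dimension?

In one MNIST experiment I put the class dimension first (dim=0) and computed the cross-entropy with tf.nn.softmax_cross_entropy_with_logits_v2. The result: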

Epoch    1: cost=1611.621314344
Epoch    6: cost=17532.561093204
Epoch   11: cost=33408.859425080

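Clearly, the model was not converging. A few hours later I found the problem: by default, tf.nn.softmax_cross_entropy_with_logits_v2 takes the last dimension as the class dimension (dim=-1). From the documentation: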

tf.nn.softmax_cross_entropy_with_logits_v2(
    _sentinel=None,
    labels=None,
    logits=None,
    dim=-1,
    name=None
)

Args:

_sentinel: Used to prevent positional parameters. Internal, do not use.
labels: Each vector along the class dimension should hold a valid probability distribution e.g. for the case in which labels are of shape [batch_size, num_classes], each row of labels[i] must be a valid probability distribution.
logits: Unscaled log probabilities.
dim: The class dimension. Defaulted to -1 which is the last dimension.
name: A name for the operation (optional).

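The documentation tells us that the dim argument specifies the class dimension. So when the class dimension is not the last one, remember to set dim. After adding dim=0, the results were:

Single-layer softmax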
Epoch    1: cost=1.747955866
Epoch    6: cost=0.334705674
Epoch   11: cost=0.291128567
Epoch   16: cost=0.271443650
Epoch   21: cost=0.266406643
Epoch   26: cost=0.259507637
Epoch   31: cost=0.255616793
Epoch   36: cost=0.252413219
Epoch   41: cost=0.254052425
Epoch   46: cost=0.254678538
Opitimization Finished!

Three-layer softmax
Epoch    1: cost=9.467425473
Epoch    6: cost=2.299335675
Epoch   11: cost=1.818574688
Epoch   16: cost=0.992218607
Epoch   21: cost=0.570668126
Epoch   26: cost=0.229845069
Epoch   31: cost=0.150805521
Epoch   36: cost=0.119380749
Epoch   41: cost=0.101064101
Epoch   46: cost=0.082242706
Opitimization Finished!
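To see why the class axis matters, here is a pure-Python sketch on a tiny hand-made batch. It is not the MNIST experiment above: the data, shapes and helper names are assumptions chosen for illustration. The tensors are laid out as [num_classes, batch_size], so the classes run along axis 0.

```python
import math

def softmax_cross_entropy(label_vec, logit_vec):
    """Cross-entropy between a label distribution and softmax(logit_vec)."""
    m = max(logit_vec)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logit_vec))
    return -sum(l * (v - log_sum_exp) for l, v in zip(label_vec, logit_vec))

def transpose(matrix):
    """Swap the two axes of a 2-D nested list."""
    return [list(col) for col in zip(*matrix)]

# Layout [num_classes, batch_size]: 3 classes, 2 examples.
logits = [[5.0, 0.0],
          [0.0, 5.0],
          [0.0, 0.0]]
labels = [[1.0, 0.0],
          [0.0, 1.0],
          [0.0, 0.0]]

# Class axis = 0: each column is one example, so we get one loss per example.
loss_class_axis_0 = [softmax_cross_entropy(l, z)
                     for l, z in zip(transpose(labels), transpose(logits))]

# Class axis = last: each row is (wrongly) treated as one example.
loss_class_axis_last = [softmax_cross_entropy(l, z)
                        for l, z in zip(labels, logits)]

# Along the last axis the label rows are not all valid distributions.
row_sums = [sum(row) for row in labels]

print(loss_class_axis_0, loss_class_axis_last, row_sums)
```

With the wrong axis the function returns three values for a batch of two examples, and the last label row sums to 0 rather than 1, so the number being minimized is no longer the intended loss.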

May your models always converge~

5 What does tf.argmax do?

From the official documentation:

tf.argmax(
    input,
    axis=None,
    name=None,
    dimension=None,
    output_type=tf.int64
)

Returns the index with the largest value across axes of a tensor. (deprecated arguments)

Args:

input: A Tensor. Must be one of the following types: float32, float64, int32, uint8, int16, int8, complex64, int64, qint8, quint8, qint32, bfloat16, uint16, complex128, half, uint32, uint64.
axis: A Tensor. Must be one of the following types: int32, int64. int32 or int64, must be in the range [-rank(input), rank(input)). Describes which axis of the input Tensor to reduce across. For vectors, use axis = 0.
output_type: An optional tf.DType from: tf.int32, tf.int64. Defaults to tf.int64.
name: A name for the operation (optional).

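From the documentation, tf.argmax returns the index of the largest value along one axis of a tensor, and the result has one dimension fewer than the input. The second argument, axis, selects the dimension over which the operation runs, which is also the dimension that the operation removes.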
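The effect of axis can be mimicked on a 2-D nested list in pure Python. The helper argmax_axis below is a made-up illustration, not a TensorFlow function, and it handles only the 2-D case.

```python
def argmax_axis(matrix, axis):
    """Index of the largest value along `axis` of a 2-D nested list."""
    if axis == 0:
        # Reduce over rows: one index per column.
        return [max(range(len(col)), key=lambda i: col[i])
                for col in zip(*matrix)]
    if axis == 1:
        # Reduce over columns: one index per row.
        return [max(range(len(row)), key=lambda i: row[i])
                for row in matrix]
    raise ValueError("axis must be 0 or 1 for a 2-D input")

# Shape [2, 3]: 2 examples, 3 class scores each.
scores = [[0.1, 0.7, 0.2],
          [0.8, 0.1, 0.1]]

per_row = argmax_axis(scores, axis=1)     # axis 1 removed, length 2
per_column = argmax_axis(scores, axis=0)  # axis 0 removed, length 3

print(per_row)     # [1, 0]
print(per_column)  # [1, 0, 0]
```

With predictions stored as [batch_size, num_classes], the axis=1 result is the predicted class for each example.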
