TensorFlow Moving Average Model

2020-04-25  youyuge

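A moving average does not change the trained parameters themselves: gradient descent updates them exactly as it would otherwise. The moving-average class only creates and maintains a set of shadow variables, and the moving-average op, run after each gradient descent step, updates those shadow variables. Because the shadow variables change more smoothly than the original variables, they are used only in the evaluation stage, for example when measuring accuracy. The averaged parameter values are never used during training.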
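
The update behind the shadow variables can be sketched without TensorFlow. The snippet below is a minimal illustration, assuming the exponential-moving-average rule shadow = decay * shadow + (1 - decay) * variable documented for tf.train.ExponentialMovingAverage. The function name ema_update, the decay of 0.9 and the sample weight values are made up for the example.

```python
def ema_update(shadow, variable, decay):
    """Return the new shadow value after one moving-average step."""
    return decay * shadow + (1.0 - decay) * variable


# Hypothetical values of one parameter after successive training steps (noisy).
weights = [1.0, 3.0, 1.0, 3.0, 1.0, 3.0]
decay = 0.9  # illustrative; real decays are usually much closer to 1

shadow = weights[0]  # the shadow starts from the variable's initial value
shadows = []
for w in weights:
    shadow = ema_update(shadow, w, decay)
    shadows.append(shadow)

# The trained values in `weights` are untouched; only `shadows` is smoothed.
raw_range = max(weights) - min(weights)
shadow_range = max(shadows) - min(shadows)
print(raw_range, shadow_range)
```

The shadow sequence swings far less than the raw sequence, which is the sense in which the shadow variables are more stable.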

The typical scenario for ExponentialMovingAverage is to compute moving
averages of variables during training, and restore the variables from the
computed moving averages during evaluations.
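
As a rough picture of that train/evaluate split, the sketch below keeps trained values and their averages in a plain dictionary and loads the averages only for evaluation. It is a simulation, not TensorFlow code: the toy checkpoint, its values, the key suffix and both helper functions are assumptions made for the example. In TensorFlow 1.x the corresponding step is usually done by building a tf.train.Saver from variable_averages.variables_to_restore().

```python
SUFFIX = "/ExponentialMovingAverage"  # assumed naming of shadow entries in this sketch

# A toy "checkpoint": each trained variable is stored next to its moving average.
checkpoint = {
    "weights": 0.80,
    "weights" + SUFFIX: 0.65,
    "bias": -0.10,
    "bias" + SUFFIX: -0.05,
}


def restore_for_training(ckpt):
    """Load the raw trained values (what training would continue from)."""
    return {k: v for k, v in ckpt.items() if not k.endswith(SUFFIX)}


def restore_for_eval(ckpt):
    """Load each variable from its moving average instead of its raw value."""
    return {k[: -len(SUFFIX)]: v for k, v in ckpt.items() if k.endswith(SUFFIX)}


train_params = restore_for_training(checkpoint)
eval_params = restore_for_eval(checkpoint)
print(train_params)  # raw values
print(eval_params)   # averaged values under the same variable names
```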

Usage example

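    import tensorflow as tf  # TensorFlow 1.x API
    # Define the training step counter and the associated moving-average class.
    # MOVING_AVERAGE_DECAY is a hyperparameter set elsewhere, typically close to 1.
    global_step = tf.Variable(0, trainable=False)
    variable_averages = tf.train.ExponentialMovingAverage(MOVING_AVERAGE_DECAY, global_step)
    variables_averages_op = variable_averages.apply(tf.trainable_variables())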
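
Passing global_step as the second argument (num_updates) makes the effective decay smaller early in training, so the averages move faster at the start. The sketch below assumes the rule given in the TensorFlow 1.x documentation, min(decay, (1 + num_updates) / (10 + num_updates)). The helper name effective_decay and the decay value 0.99 are illustrative only.

```python
def effective_decay(decay, num_updates):
    """Decay actually applied when num_updates (e.g. global_step) is supplied."""
    return min(decay, (1.0 + num_updates) / (10.0 + num_updates))


decay = 0.99  # illustrative value; the article does not give MOVING_AVERAGE_DECAY
for step in (0, 10, 100, 1000, 10000):
    print(step, effective_decay(decay, step))
```

Once the step count is large enough, the configured decay takes over and the step-dependent term no longer matters.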

The apply() method adds shadow copies of trained variables and adds ops that
maintain a moving average of the trained variables in their shadow copies.
It is used when building the training model. The ops that maintain moving
averages are typically run after each training step.

    with tf.control_dependencies([opt_op]):
        # Create the shadow variables, and add ops to maintain moving averages
        # of var0 and var1. This also creates an op that will update the moving
        # averages after each training step.  This is what we will use in place
        # of the usual training op.
        training_op = ema.apply([var0, var1])

...train the model by running training_op...

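The op that updates the shadow variables then has to be made part of the training op. The snippet above is the approach from the official documentation. The following also works: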

    # Run backpropagation to update the parameters, and update each parameter's moving average.
    with tf.control_dependencies([train_step, variables_averages_op]):
        train_op = tf.no_op(name='train')

...train the model by running train_op...

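tf.no_op is an op that does nothing. tf.control_dependencies, used as a context manager, ensures that the ops passed as its argument run before any op created inside the with block. This is how TensorFlow controls execution order in the graph. Note that in the second form train_step and variables_averages_op are both dependencies of the no-op, so both are guaranteed to run, but nothing orders the average update after the gradient step as the official form does.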
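
The idea of a no-op that only exists to pull in its dependencies can be pictured with a toy graph in plain Python. This is only a sketch of the concept, not how TensorFlow is implemented: the Op class, its run method and the numeric updates are invented for the example.

```python
class Op:
    """Toy graph node: runs its control dependencies first, then itself."""

    def __init__(self, name, fn=None, control_inputs=()):
        self.name = name
        self.fn = fn  # None models tf.no_op: nothing to compute
        self.control_inputs = list(control_inputs)

    def run(self, log):
        # List order here is a property of this toy only; a real graph
        # does not order sibling dependencies relative to each other.
        for dep in self.control_inputs:
            dep.run(log)
        if self.fn is not None:
            self.fn()
        log.append(self.name)


state = {"w": 1.0, "shadow": 1.0}


def gradient_step():
    state["w"] -= 0.5  # stand-in for one optimizer update


def update_average():
    state["shadow"] = 0.9 * state["shadow"] + 0.1 * state["w"]


train_step = Op("train_step", gradient_step)
averages_op = Op("variables_averages_op", update_average)
# Similar in spirit to: with tf.control_dependencies([...]): tf.no_op(name='train')
train_op = Op("train", control_inputs=[train_step, averages_op])

log = []
train_op.run(log)
print(log, state)
```

Running the single train op is enough to trigger both the parameter update and the shadow update, which is why the training loop only needs to run that one op.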
