Keras FAQ

2019-07-30 zestloveheart

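Introduction

This post collects questions and tips from the Keras documentation FAQ that I have used at work. It is adapted from the Keras FAQ.
It covers:

Multi-GPU training
Getting the output of an intermediate layer
Freezing layers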

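Running on multiple GPUs

There are two ways to run a single model on multiple GPUs: data parallelism and device parallelism.

Data parallelism

Data parallelism replicates the model once on each GPU. Every replica processes a different slice of each batch at the same time, which speeds up training.
Keras has a built-in utility, keras.utils.multi_gpu_model, that can produce a data-parallel version of any model and achieves near-linear (quasi-linear) speedup on multiple GPUs.
See the multi_gpu_model documentation for more details.
Here is an example: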

from keras.utils import multi_gpu_model
parallel_model = multi_gpu_model(model, gpus=8)
parallel_model.compile(loss='categorical_crossentropy', optimizer='rmsprop')
# This `fit` call will be distributed on 8 GPUs.
parallel_model.fit(x, y, epochs=20, batch_size=256) # batch size: 256, each GPU will process 32 samples.
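
The per-GPU figure in the comment above follows from how each batch is divided among the replicas. The sketch below is plain Python rather than Keras code: `split_batch` is a hypothetical helper, and it assumes every replica receives `batch_size // gpus` samples with any remainder going to the last replica. That is my reading of the Keras 2.x behaviour, not a documented guarantee.

```python
def split_batch(batch_size, gpus):
    """Return the number of samples each replica would process.

    Assumption: every replica gets batch_size // gpus samples and the
    last replica also takes the remainder.
    """
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    step = batch_size // gpus
    sizes = [step] * gpus
    # The last replica absorbs whatever does not divide evenly.
    sizes[-1] = batch_size - step * (gpus - 1)
    return sizes


print(split_batch(256, 8))  # [32, 32, 32, 32, 32, 32, 32, 32]
print(split_batch(100, 8))  # [12, 12, 12, 12, 12, 12, 12, 16]
```

Under this assumption, picking a batch size that is a multiple of the GPU count keeps the load balanced across replicas.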

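Device parallelism

Device parallelism runs different branches of the same model on different GPUs. It suits models with several parallel branches; AlexNet, for example, split its convolutional layers across GPUs. Here is an example: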

import tensorflow as tf
import keras

# Model where a shared LSTM is used to encode two different sequences in parallel
input_a = keras.Input(shape=(140, 256))
input_b = keras.Input(shape=(140, 256))
shared_lstm = keras.layers.LSTM(64)
# Process the first sequence on one GPU
with tf.device('/gpu:0'):
    encoded_a = shared_lstm(input_a)
# Process the next sequence on another GPU
with tf.device('/gpu:1'):
    encoded_b = shared_lstm(input_b)
# Concatenate results on CPU
with tf.device('/cpu:0'):
    merged_vector = keras.layers.concatenate([encoded_a, encoded_b], axis=-1)

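How to get the output of an intermediate layer

Building a new Model

Create a new model that shares the original model's input and uses the layer of interest as its output, then call predict on it: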

from keras.models import Model
model = ...  # create the original model
layer_name = 'my_layer'
intermediate_layer_model = Model(inputs=model.input,
                                 outputs=model.get_layer(layer_name).output)
intermediate_output = intermediate_layer_model.predict(data)

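Using a Keras function

Alternatively, build a Keras backend function that returns the output of a given layer for a given input: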

from keras import backend as K
# with a Sequential model
get_3rd_layer_output = K.function([model.layers[0].input], [model.layers[3].output])
layer_output = get_3rd_layer_output([x])[0]

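If the model has Dropout or BatchNormalization

Layers such as Dropout and BatchNormalization behave differently in the training phase and the test phase. If the model contains any of them, pass the learning phase flag to the function as well: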

get_3rd_layer_output = K.function([model.layers[0].input, K.learning_phase()], [model.layers[3].output])
# output in test mode = 0
layer_output = get_3rd_layer_output([x, 0])[0]
# output in train mode = 1
layer_output = get_3rd_layer_output([x, 1])[0]
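
To see why the flag changes the result, here is a minimal plain-Python sketch of inverted dropout, the scheme Keras uses for its Dropout layer. The `dropout` function, its `learning_phase` argument, and the `seed` parameter are hypothetical names chosen for illustration. This is not the Keras implementation.

```python
import random


def dropout(values, rate, learning_phase, seed=None):
    """Inverted dropout on a list of floats.

    learning_phase=1 (train): each value is zeroed with probability `rate`
    and the survivors are scaled by 1 / (1 - rate).
    learning_phase=0 (test): values pass through unchanged.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if learning_phase == 0:
        return list(values)
    rng = random.Random(seed)
    scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else v * scale for v in values]


x = [1.0, 2.0, 3.0, 4.0]
test_out = dropout(x, rate=0.5, learning_phase=0)
train_out = dropout(x, rate=0.5, learning_phase=1, seed=0)
print(test_out)   # identical to x
print(train_out)  # each entry is either 0.0 or twice the input
```

An intermediate output taken in train mode is therefore noisy, so activations meant for inspection or feature extraction are usually taken with the flag set to 0.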

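How to freeze layers

Freezing a layer means its parameters stay fixed during training. This is mostly used when fine-tuning a model.
To freeze a layer, pass trainable=False when creating it:
frozen_layer = Dense(32, trainable=False)
Alternatively, set the trainable attribute after the layer has been created. The change only takes effect once the model is compiled, so call compile() after modifying it: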

from keras.layers import Input, Dense
from keras.models import Model

x = Input(shape=(32,))
layer = Dense(32)
layer.trainable = False
y = layer(x)

frozen_model = Model(x, y)
# in the model below, the weights of `layer` will not be updated during training
frozen_model.compile(optimizer='rmsprop', loss='mse')

layer.trainable = True
trainable_model = Model(x, y)
# with this model the weights of the layer will be updated during training
# (which will also affect the above model since it uses the same layer instance)
trainable_model.compile(optimizer='rmsprop', loss='mse')

frozen_model.fit(data, labels)  # this does NOT update the weights of `layer`
trainable_model.fit(data, labels)  # this updates the weights of `layer`
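
What "not updated" means can be shown without Keras. The toy sketch below chains two scalar layers and applies plain SGD, skipping the update for any layer whose `trainable` flag is False. `ScalarLayer` and `sgd_step` are hypothetical names for illustration. Unlike Keras, which fixes the flag at compile time, this toy reads it on every step.

```python
class ScalarLayer:
    """Toy layer computing w * x, with a Keras-like trainable flag."""

    def __init__(self, weight, trainable=True):
        self.weight = weight
        self.trainable = trainable


def sgd_step(layers, x, target, lr=0.1):
    """One SGD step on (prediction - target) ** 2 for a chain of scalar layers."""
    # Forward pass, remembering the input seen by each layer.
    inputs = []
    out = x
    for layer in layers:
        inputs.append(out)
        out = layer.weight * out
    # Backward pass: grad holds d(loss)/d(output of the current layer).
    grad = 2.0 * (out - target)
    for layer, layer_in in zip(reversed(layers), reversed(inputs)):
        weight_grad = grad * layer_in
        grad = grad * layer.weight  # propagate using the pre-update weight
        if layer.trainable:
            layer.weight -= lr * weight_grad


frozen = ScalarLayer(0.5, trainable=False)
head = ScalarLayer(1.0)
for _ in range(20):
    sgd_step([frozen, head], x=1.0, target=2.0)
print(frozen.weight)  # 0.5, unchanged
print(head.weight)    # has moved from 1.0 toward 4.0
```

Gradients still flow through the frozen layer to whatever precedes it. Only its own weight update is skipped, which is also how a frozen layer behaves in the middle of a fine-tuned network.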