Recursive AutoEncoder

2019-03-11  _Megamind_
AutoEncoder

Input: a variable-length sequence (length at least 2)
Output:
1) At the position of the first feature vector, the output reconstructs the two input sequence points (x_3 & x_4)
2) At every other position, the output reconstructs the input sequence point at that position (x_2, etc.) together with the feature vector from the previous position (y_1, etc.)

At each position, the reconstruction error between Input and Output is compared (the original text says cross-entropy, but the code below actually measures mean squared error).
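Before the full TensorFlow graph, the core merge step can be sketched in plain numpy: the encoder compresses two child vectors into one parent, the decoder reconstructs both children from the parent, and the squared reconstruction error is the training signal. All names and sizes here (`W_enc`, `W_dec`, `emb`, etc.) are hypothetical stand-ins for `self.W_1`/`self.b_1`, `self.W_2`/`self.b_2` and `embSize` in the code below — a minimal sketch, not the author's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
emb = 4  # embedding size (hypothetical value for embSize)

# Hypothetical weights with the shapes implied by W_1 / W_2 below:
# the encoder maps two concatenated children (2*emb) to one parent (emb),
# the decoder maps the parent back to two children (2*emb).
W_enc = rng.standard_normal((2 * emb, emb))
b_enc = np.zeros(emb)
W_dec = rng.standard_normal((emb, 2 * emb))
b_dec = np.zeros(2 * emb)

def relu(x):
    return np.maximum(x, 0.0)

def merge(c1, c2):
    """Encode two child vectors into one L2-normalized parent vector."""
    p = relu(np.concatenate([c1, c2]) @ W_enc + b_enc)
    return p / max(np.linalg.norm(p), 1e-8)  # guard against a zero vector

def reconstruct(p):
    """Decode a parent vector back into its two children."""
    c = relu(p @ W_dec + b_dec)
    return c[:emb], c[emb:]

x1, x2 = rng.standard_normal(emb), rng.standard_normal(emb)
p = merge(x1, x2)
c1, c2 = reconstruct(p)
# Reconstruction error at this position (squared error, as in the code below):
err = np.mean((c1 - x1) ** 2) / 2 + np.mean((c2 - x2) ** 2) / 2
```

The TensorFlow code below applies exactly this step repeatedly, folding one new sequence point into the running parent vector at each position.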

import tensorflow as tf

# self.L (input projection), self.W_1 / self.b_1 (encoder) and
# self.W_2 / self.b_2 (decoder) are trainable variables created elsewhere.
def build_model(self, options):

    input = tf.placeholder(tf.float32, [None, options["max_seq_length"], options["inputSize"]], name="input")
    # Project every sequence position into the embedding space.
    inputLayer = tf.einsum("jkl,lm->jkm", input, self.L, name="inputLayer")
    # One mask entry per merge step: a sequence of length n has n - 1 merges.
    mask = tf.placeholder(tf.float32, [None, options["max_seq_length"] - 1], name="mask")

    # First merge: encode the first two sequence points into one parent vector.
    p = tf.nn.relu(
        tf.nn.bias_add(tf.einsum("jl,lm->jm", tf.concat([inputLayer[:, 0], inputLayer[:, 1]], 1), self.W_1),
                       self.b_1), name="p")
    p = tf.div(p, tf.reshape(tf.norm(p, axis=1), (-1, 1)))  # L2-normalize the parent

    self.p = [p]

    # Decode the parent back into its two children and score the reconstruction.
    c_ = tf.nn.relu(tf.nn.bias_add(tf.einsum("jl,lm->jm", p, self.W_2), self.b_2), name="c_")
    c1 = c_[:, :options["embSize"]]
    c2 = c_[:, options["embSize"]:]

    cost = [tf.add(tf.reduce_mean(tf.square(tf.subtract(c1, inputLayer[:, 0])) / 2, axis=1),
                   tf.reduce_mean(tf.square(tf.subtract(c2, inputLayer[:, 1])) / 2, axis=1))]

    # Remaining merges: fold each new sequence point into the running parent.
    for t in range(2, options["max_seq_length"]):
        c = tf.concat([inputLayer[:, t], p], 1)
        p_ = tf.nn.relu(tf.nn.bias_add(tf.einsum("jk,kl->jl", c, self.W_1), self.b_1), name="p_")
        c_ = tf.nn.relu(tf.nn.bias_add(tf.einsum("jk,kl->jl", p_, self.W_2), self.b_2), name="c_")
        c1 = c_[:, :options["embSize"]]
        c2 = c_[:, options["embSize"]:]
        # Weight the two terms by the number of leaves each child subtree covers.
        cost_ = tf.add((1 / (1 + t) * tf.reduce_mean(tf.square(tf.subtract(c1, inputLayer[:, t])), axis=1)),
                       (t / (1 + t) * tf.reduce_mean(tf.square(tf.subtract(c2, p)), axis=1)), name="cost_")
        p = p_
        cost = tf.concat([cost, [cost_]], 0)
        self.p = tf.concat([self.p, [p]], 0)

    # (steps, batch) -> (batch, steps), then average the masked per-step costs.
    cost = tf.transpose(cost, perm=[1, 0], name="cost")
    all_cost = tf.div(tf.reduce_sum(cost * mask, axis=1), tf.reduce_sum(mask, axis=1), name="all_cost")
    # Reduce the per-example costs to a scalar loss before minimizing.
    optimizer = tf.train.AdadeltaOptimizer().minimize(tf.reduce_mean(all_cost))

    return input, mask, all_cost, optimizer
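One detail the function leaves implicit is how the mask placeholder is fed: a sequence of actual length n produces n - 1 merge steps, so its mask row should contain n - 1 ones followed by zeros for the padded positions. A minimal numpy sketch (`make_mask` is a hypothetical helper; the original code feeds the placeholder directly):

```python
import numpy as np

def make_mask(seq_len, max_seq_length):
    """Mask over merge positions: a sequence of length n yields n - 1 merges,
    so the first n - 1 entries are 1 and the rest (padding) are 0."""
    m = np.zeros(max_seq_length - 1, dtype=np.float32)
    m[: seq_len - 1] = 1.0
    return m

mask_row = make_mask(3, 5)  # → [1., 1., 0., 0.]
```

Dividing the masked cost sum by `tf.reduce_sum(mask, axis=1)` then averages only over the merges that actually happened, so short and long sequences contribute on the same scale.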