How to Construct Deep Recurrent Neural Networks
Paper link: https://arxiv.org/pdf/1312.6026.pdf
The paper extends the RNN into a deep RNN. Three parts of an RNN can be made deeper: 1) the input-to-hidden function, 2) the hidden-to-hidden transition, 3) the hidden-to-output function.
RNNs are a popular choice for modeling variable-length sequences, but the notion of depth in an RNN is inherently ambiguous.
In one sense, if we take the existence of a composition of several nonlinear computational layers in a neural network as the criterion for depth, RNNs are already deep, since any RNN can be expressed as a composition of multiple nonlinear layers when unfolded in time.
2. Recurrent Neural Networks
An RNN simulates a discrete-time dynamical system that has an input x_t, an output y_t, and a hidden state h_t:

h_t = f_h(x_t, h_{t-1}),  y_t = f_o(h_t),

where f_h is the state transition function and f_o is the output function.
The parameters θ of an RNN are estimated from a set of N training sequences by minimizing the following cost function:

J(θ) = (1/N) Σ_{n=1}^{N} Σ_t d(y_t^{(n)}, f_o(h_t^{(n)})),

where d(a, b) is a predefined divergence measure between a and b, such as the Euclidean distance or cross-entropy.
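As a sketch of this cost, here is a minimal NumPy version that uses the squared Euclidean distance for d(a, b); the array shapes and the choice of divergence are my assumptions, not fixed by the paper:

```python
import numpy as np

def cost(y_true, y_pred):
    """Mean over N sequences of the summed per-step divergence d(a, b).

    y_true, y_pred: arrays of shape (N, T, output_dim).
    d here is the squared Euclidean distance; cross-entropy is the
    other common choice mentioned in the notes above.
    """
    per_step = np.sum((y_true - y_pred) ** 2, axis=-1)  # d at each t: (N, T)
    return per_step.sum(axis=1).mean()                  # sum over t, mean over n

# toy check: identical target and prediction give zero cost
y = np.ones((4, 10, 3))
print(cost(y, y))  # 0.0
```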
2.1 Conventional Recurrent Neural Network
A conventional RNN implements the transition function and the output function as

h_t = φ_h(W h_{t-1} + U x_t),  y_t = φ_o(V h_t),

where W, U and V are the transition, input and output matrices respectively, and φ_h and φ_o are element-wise nonlinear functions.
3. Deep Recurrent Neural Network
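A minimal NumPy sketch of a conventional RNN step under these definitions; all dimensions are illustrative, tanh plays φ_h, and φ_o is taken as the identity for simplicity:

```python
import numpy as np

rng = np.random.default_rng(0)
input_dim, hidden_dim, output_dim = 5, 8, 3

# transition (W), input (U) and output (V) matrices
W = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
U = rng.normal(scale=0.1, size=(hidden_dim, input_dim))
V = rng.normal(scale=0.1, size=(output_dim, hidden_dim))

def step(h_prev, x_t):
    """h_t = phi_h(W h_{t-1} + U x_t); y_t = phi_o(V h_t)."""
    h_t = np.tanh(W @ h_prev + U @ x_t)  # element-wise nonlinearity phi_h
    y_t = V @ h_t                        # phi_o = identity for simplicity
    return h_t, y_t

h = np.zeros(hidden_dim)
for x in rng.normal(size=(10, input_dim)):  # run over a length-10 sequence
    h, y = step(h, x)
```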
3.1 Why Deep Recurrent Neural Network?
A hypothesis:
Deep learning is built around a hypothesis that a deep, hierarchical model can be exponentially more efficient at representing some functions than a shallow one.
3.2 Depth of a Recurrent Neural Network
The depth is defined in the case of feedforward neural networks as having multiple nonlinear layers between input and output.
This definition does not apply straightforwardly to an RNN because of its temporal structure. For example, any RNN unfolded in time as in Fig. 1 is already deep in this sense, because a computational path between the input at time k < t and the output at time t crosses several nonlinear layers.
3.2.1 Deep Input-to-Hidden Function
A model can exploit more non-temporal structure from the input by making the input-to-hidden function deep.
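One concrete reading of this: pass x_t through a small feedforward stack before it enters the recurrent transition. A hypothetical sketch (the function name, layer shapes, and tanh choice are all mine):

```python
import numpy as np

def deep_input_to_hidden(x_t, layers):
    """Apply an MLP to x_t before the recurrence.

    `layers` is a list of (weight, bias) pairs; each layer computes
    tanh(W z + b). The result replaces x_t in h_t = f_h(x_t, h_{t-1}).
    """
    z = x_t
    for W, b in layers:
        z = np.tanh(W @ z + b)
    return z

rng = np.random.default_rng(1)
layers = [(rng.normal(size=(16, 5)), np.zeros(16)),   # 5 -> 16
          (rng.normal(size=(8, 16)), np.zeros(8))]    # 16 -> 8
feat = deep_input_to_hidden(rng.normal(size=5), layers)
```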
3.2.2 Deep Hidden-to-Output Function
A deep hidden-to-output function can be useful to disentangle the factors of variations in the hidden state, making it easier to predict the output.
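As an illustration, a deep hidden-to-output function can be sketched as an extra nonlinear layer between h_t and the prediction; the softmax output and all shapes are my assumptions for a classification setting:

```python
import numpy as np

def deep_hidden_to_output(h_t, hidden_W, out_W):
    """Two-layer output function: an intermediate nonlinear layer
    between the hidden state and the prediction."""
    z = np.tanh(hidden_W @ h_t)        # extra nonlinear layer
    logits = out_W @ z
    e = np.exp(logits - logits.max())  # numerically stable softmax
    return e / e.sum()

rng = np.random.default_rng(2)
p = deep_hidden_to_output(rng.normal(size=8),
                          rng.normal(size=(16, 8)),   # 8 -> 16
                          rng.normal(size=(3, 16)))   # 16 -> 3 classes
```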
3.2.3 Deep Hidden-to-Hidden Transition

The hidden-to-hidden transition itself can also be made deep, giving a deep transition RNN (DT-RNN), in which the transition function f_h is replaced by a multilayer network. An alternative is the stacked RNN (sRNN), which stacks several recurrent layers so that each layer's hidden state serves as the input to the layer above. The paper additionally offers a neural-operator perspective that describes these deep architectures as compositions of simple operators.
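A minimal sketch of one time step of a stacked RNN, where each layer's new hidden state feeds the layer above it; the function name, layer count, and dimensions are illustrative only:

```python
import numpy as np

def stacked_rnn_step(hs, x_t, Ws, Us):
    """One time step of an L-layer stacked RNN.

    hs: list of previous hidden states, one per layer.
    Layer 0 reads x_t; layer l > 0 reads layer l-1's new state:
        h^l_t = tanh(W_l h^l_{t-1} + U_l inp)
    """
    new_hs, inp = [], x_t
    for h_prev, W, U in zip(hs, Ws, Us):
        h = np.tanh(W @ h_prev + U @ inp)
        new_hs.append(h)
        inp = h  # this layer's state is the next layer's input
    return new_hs

rng = np.random.default_rng(3)
dims = [5, 8, 8]  # input dim, then two hidden layers
Ws = [rng.normal(scale=0.1, size=(d, d)) for d in dims[1:]]
Us = [rng.normal(scale=0.1, size=(d_out, d_in))
      for d_in, d_out in zip(dims[:-1], dims[1:])]
hs = [np.zeros(d) for d in dims[1:]]
for x in rng.normal(size=(6, dims[0])):  # a length-6 sequence
    hs = stacked_rnn_step(hs, x, Ws, Us)
```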