TensorFlow: using an RNN to learn the style of a text and imitate it

2018-01-13 DayDayUpppppp

1. What it can do

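On Zhihu and on blogs you often see people sharing fun projects, such as training an RNN on poems, essays, a party constitution, or novels, and then having it generate new text of its own; the two examples below are typical. As a beginner, I wanted to build such a model myself and play with it.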

[Image: two examples of generated text]

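I will not go into the basic principles of the RNN architecture or the derivation of its formulas here. What follows is how to implement the model with TensorFlow.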

2. Building the RNN model with TensorFlow

Read the file, split its sentences into words, and collect the words into a list.

## Preprocess the data
import numpy as np

def read_file():
    # Text file containing words for training
    training_file = 'belling_the_cat.txt'
    content = []
    with open(training_file, 'r') as f:
        for line in f:
            # line is one line of the file; linelist is that line split on whitespace
            linelist = line.strip().split()
            for word in linelist:
                content.append(word.strip())
    content = np.array(content)
    content = np.reshape(content, [-1, ])  # shape (204,): one entry per word
    return content
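To see what this tokenization produces without needing the training file, here is a minimal sketch that applies the same whitespace split to an in-memory string. The function name `tokenize` and the sample string are my own illustrative choices, not part of the original code.

```python
def tokenize(text):
    """Split text into whitespace-separated tokens, line by line."""
    tokens = []
    for line in text.splitlines():
        # str.split() with no argument ignores leading/trailing whitespace
        # and collapses runs of spaces, so no extra strip() is needed
        tokens.extend(line.split())
    return tokens


# Illustrative sample text (an assumption, not read from belling_the_cat.txt)
sample = "long ago , the mice had a general council\nto consider what measures they could take"
tokens = tokenize(sample)
print(len(tokens), tokens[:5])
```

Note that punctuation only becomes its own token when it is surrounded by spaces; a string like "ago," would stay a single token under this scheme.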

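Build a dictionary. Its purpose is to convert words into word vectors.
The dictionary implements the mapping word --> vector. It does not use the word2vec algorithm; each word in the text data itself is simply assigned an integer id, and that id serves as the word's vector. A simplified version of the process looks like this: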

# text data
hello ml hello dl
# vocabulary after deduplication
hello ml dl
# build the dictionary
{'hello':0 , 'ml':1 , 'dl':2 }
# reverse dictionary
{0:'hello' ,  1:'ml' , 2: 'dl'}

The code is as follows:

import collections

def mybuild_dataset(words):
    # words --> ['hello','hello','world','python','tensorflow','rnn']
    count = collections.Counter(words)
    # Counter({'hello': 2, 'world': 1, 'python': 1, 'tensorflow': 1, 'rnn': 1})
    dictionary = dict()
    for key in count:
        # ids follow the Counter's iteration order (first-seen order on Python 3.7+)
        dictionary[key] = len(dictionary)
    # dictionary --> {'hello': 0, 'world': 1, 'python': 2, 'tensorflow': 3, 'rnn': 4}
    reverse_dictionary = dict(zip(dictionary.values(), dictionary.keys()))
    # reverse_dictionary --> {0: 'hello', 1: 'world', 2: 'python', 3: 'tensorflow', 4: 'rnn'}
    return dictionary, reverse_dictionary  # for the fable text, len(dictionary) --> 112
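As a quick check of the idea, the following sketch builds both dictionaries for the toy text "hello ml hello dl" and round-trips the words through their ids. The names `build_vocab`, `word_to_id` and `id_to_word` are hypothetical and only used for illustration; the id order shown assumes Python 3.7 or later, where a Counter iterates in first-seen order.

```python
import collections


def build_vocab(words):
    """Map each distinct word to an integer id, and build the inverse map."""
    counts = collections.Counter(words)
    # enumerate() hands out ids 0, 1, 2, ... in the Counter's iteration order
    word_to_id = {word: idx for idx, word in enumerate(counts)}
    id_to_word = {idx: word for word, idx in word_to_id.items()}
    return word_to_id, id_to_word


words = "hello ml hello dl".split()
word_to_id, id_to_word = build_vocab(words)

encoded = [word_to_id[w] for w in words]    # words -> ids
decoded = [id_to_word[i] for i in encoded]  # ids -> words
print(word_to_id)
print(encoded)
print(decoded == words)
```

Whatever order the ids come out in, decoding the encoded list should always give back the original words, which is the property the model relies on later when turning a predicted id back into a word.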

Building and training the RNN model

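I find that a good way to understand a model is to first work out what its training inputs and outputs are. In this model, the training inputs and outputs are handled as follows: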

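# training data
the mice had a xxxxx

# the idea: use the first three words as the training input
[[the],[mice],[had]]  --> converted to word vectors

# use the fourth word as the output
['a']
# then convert the output word to one-hot form
[ 0,0,0,0 ...0,1,0,0  ...  0,0,0 ]   # length equals the size of the dictionary; the single 1 sits at the index the dictionary assigns to 'a'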

The concrete implementation is as follows:

3.1 Training input for the RNN model

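n_input = 3
# offset is a random offset into the text, a trick in the program's design that does not affect understanding this snippet
offset = 0  # stand-in value so the snippet runs; the full code draws it at random

# Input x: convert the first three words to word vectors (their dictionary ids)
# symbols_in_keys is a 2-D list --> [[34], [92], [85]]
symbols_in_keys = [[dictionary[str(training_data[i])]] for i in range(offset, offset + n_input)]

# reshape them to (1, 3, 1)
symbols_in_keys = np.reshape(np.array(symbols_in_keys), [-1, n_input, 1])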

3.2 Training output for the RNN model

# This part builds y_true: the fourth word converted to a one-hot vector
symbols_out_onehot = np.zeros([vocab_size], dtype=float)

# str(training_data[offset+n_input])  ->  e.g. 'mice'
symbols_out_onehot[dictionary[str(training_data[offset + n_input])]] = 1.0
symbols_out_onehot = np.reshape(symbols_out_onehot, [1, -1])
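Putting 3.1 and 3.2 together, here is a self-contained sketch that builds one (input, target) pair and prints the resulting shapes. The helper name `make_example` and the six-word toy text are assumptions made for illustration; the real code works on the fable text with its 112-word dictionary.

```python
import numpy as np


def make_example(training_data, dictionary, offset, n_input=3):
    """Build one (input, target) pair from n_input words and the word after them."""
    vocab_size = len(dictionary)
    # Input: ids of n_input consecutive words, shaped (batch=1, time_steps, features=1)
    ids = [dictionary[str(training_data[i])] for i in range(offset, offset + n_input)]
    x = np.array(ids, dtype=np.float32).reshape(1, n_input, 1)
    # Target: one-hot vector for the next word, shaped (1, vocab_size)
    y = np.zeros((1, vocab_size), dtype=np.float32)
    y[0, dictionary[str(training_data[offset + n_input])]] = 1.0
    return x, y


# Toy data (an assumption; not the fable used in the article)
training_data = np.array("the mice had a general council".split())
# dict.fromkeys keeps first-seen order, so ids are 0..5 in reading order
dictionary = {word: idx for idx, word in enumerate(dict.fromkeys(training_data.tolist()))}

x, y = make_example(training_data, dictionary, offset=0)
print(x.shape, y.shape)
print(x.ravel(), y.argmax())
```

With `offset=0` the input covers "the mice had" and the target marks "a". Any offset is valid as long as `offset + n_input` is still a valid index into the text.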

3.3 Training the RNN model

import tensorflow as tf  # TensorFlow 1.x; tf.contrib is not available in 2.x

def RNN(x, weights, biases):
    batch_size = 1
    x = tf.reshape(x, [batch_size, n_input, 1])          # (1,3,1), i.e. batch = 1
    # rnn
    cell = tf.contrib.rnn.BasicLSTMCell(n_hidden)
    init_state = cell.zero_state(batch_size, dtype=tf.float32)
    # final_state is an LSTM state tuple (c, h), each of shape batch * n_hidden   --> 1 * 512
    # outputs     has shape  batch * n_input(time_step) * n_hidden  --> 1 * 3  * 512
    outputs, final_state = tf.nn.dynamic_rnn(cell, x, initial_state=init_state, time_major=False)

    #print ("before unstack , output shape : ",outputs.shape)   # output shape :  (1,3,512) (batch,time_step,cell_n_hidden)
    # transpose to (time_step, batch, n_hidden) = (3,1,512), then unstack along the time axis
    outputs = tf.unstack(tf.transpose(outputs, [1,0,2]))
    # outputs is now a list of 3 tensors, one per time step
    #print ("output shape[-1] 2: ",outputs[-1].shape)           # outputs[-1] shape (1,512)
    results = tf.matmul(outputs[-1], weights['out']) + biases['out']
    # results has shape (1,112): one score per word, 112 being the number of words in the dictionary
    return results   # (1, 112), compared against the one-hot target during training
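The (1, 112) output is a row of scores rather than a literal one-hot vector, so it has to be turned into a prediction and a loss. The snippet above does not show that step; the NumPy sketch below assumes the usual choice of softmax plus cross-entropy against the one-hot target, and uses a made-up four-word vocabulary and made-up scores purely for illustration.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def cross_entropy(logits, onehot):
    """Cross-entropy between softmax(logits) and a one-hot target, averaged over the batch."""
    probs = softmax(logits)
    return float(-(onehot * np.log(probs)).sum(axis=-1).mean())


# Hypothetical values: a 4-word vocabulary instead of the article's 112
reverse_dictionary = {0: 'the', 1: 'mice', 2: 'had', 3: 'a'}
logits = np.array([[0.5, -1.0, 0.1, 2.0]])   # shape (1, vocab_size), like `results`
target = np.array([[0.0, 0.0, 0.0, 1.0]])    # one-hot for 'a'

probs = softmax(logits)
# The predicted word is the one with the highest probability
predicted_word = reverse_dictionary[int(probs.argmax(axis=-1)[0])]
loss = cross_entropy(logits, target)
print(predicted_word, round(loss, 3))
```

The loss shrinks toward zero as the score for the correct word pulls ahead of the others, which is what training pushes the weights to do.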

Full code
https://github.com/zhaozhengcoder/Machine-Learning/tree/master/tensorflow_tutorials/ (in the RNN学习 directory)

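The training data is taken from Aesop's Fables, but the dataset is very small and the number of training iterations is low. You can swap in other data and then increase the number of iterations.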
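Once a model like this is trained, imitation comes down to predicting one word at a time and feeding each prediction back in as input. The sketch below shows only that loop; `generate`, `toy_predict_next` and the lookup table are hypothetical stand-ins for a trained model, not code from the repository.

```python
def generate(seed_words, predict_next, n_words, n_input=3):
    """Extend seed_words by repeatedly predicting the word after the last n_input words."""
    if len(seed_words) < n_input:
        raise ValueError("need at least %d seed words" % n_input)
    words = list(seed_words)
    for _ in range(n_words):
        window = words[-n_input:]           # the most recent n_input words
        words.append(predict_next(window))  # feed the prediction back in
    return words


# Stand-in for a trained model: a lookup table from a 3-word window to the next word.
# A real predict_next would encode the window as ids, run the RNN, and take the argmax.
toy_table = {
    ('the', 'mice', 'had'): 'a',
    ('mice', 'had', 'a'): 'general',
    ('had', 'a', 'general'): 'council',
}


def toy_predict_next(window):
    return toy_table.get(tuple(window), '<unk>')


print(' '.join(generate(['the', 'mice', 'had'], toy_predict_next, n_words=3)))
```

Because each step only sees the last three words, a small dataset tends to make the output repeat the training text, which is one more reason to try a larger corpus.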
