基于tensorflow的MNIST手写数字识别（三）--神经

2016-06-17 本文已影响8316人会打代码的扫地王大爷

基于tensorflow的MNIST手写字识别（一）--白话卷积神经网络模型

基于tensorflow的MNIST手写数字识别（二）--入门篇

基于tensorflow的MNIST手写数字识别（三）--神经网络篇

想想还是要说点什么

抱歉啊，第三篇姗姗来迟，确实是因为我懒，而不是忙什么的，所以这次再加点料，以表示我的歉意。废话不多说，我就直接开始讲了。

加入神经网络的意义

1. 前面也讲到了，使用普通的训练方法，也可以进行识别，但是识别的精度不够高，因此我们需要对其进行提升，其实MNIST官方提供了很多的组合方法以及测试精度，并做成了表格供我们选用，谷歌官方为了保证教学的简单性，所以用了最简单的卷积神经网络来提升这个的识别精度，原理是通过强化它的特征（比如轮廓等），其实我也刚学，所以能看懂就说明它确实比较简单。

2. 我的代码都是在0.7版本的tensorflow上实现的，建议看一下前两篇文章先。

流程和步骤

其实流程跟前面的差不多,只是在softmax前进行了卷积神经网络的操作，所也就不仔细提出了，这里只说卷积神经网络的部分。

如第一篇文章所说，我们的卷积神经网络的，过程是卷积->池化->全连接.

# 卷积函数

# convolution

defconv2d(x, W):

return tf.nn.conv2d(x, W, strides=[1,1,1,1], padding='SAME')

#这里tensorflow自己带了conv2d函数做卷积，然而我们自定义了个函数，用于指定步长为1，边缘处理为直接复制过来

# pooling

defmax_pool_2x2(x):

return tf.nn.max_pool(x, ksize=[1,2,2,1], strides=[1,2,2,1], padding='SAME')

tf.nn.conv2d(input, filter, strides, padding, use_cudnn_on_gpu=None, name=None)

Computes a 2-D convolution given 4-D input and filter tensors.

Given an input tensor of shape [batch, in_height, in_width, in_channels] and a filter / kernel tensor of shape [filter_height, filter_width, in_channels, out_channels], this op performs the following:

Flattens the filter to a 2-D matrix with shape [filter_height * filter_width * in_channels, output_channels].

Extracts image patches from the the input tensor to form a virtual tensor of shape [batch, out_height, out_width, filter_height * filter_width * in_channels].

For each patch, right-multiplies the filter matrix and the image patch vector.

In detail,

output[b, i, j, k] =

sum_{di, dj, q} input[b, strides[1] * i + di, strides[2] * j + dj, q] * filter[di, dj, q, k]

Must have strides[0] = strides[3] = 1. For the most common case of the same horizontal and vertices strides, strides = [1, stride, stride, 1].

Args:

input: A Tensor. Must be one of the following types: float32, float64.

filter: A Tensor. Must have the same type as input.

strides: A list of ints. 1-D of length 4. The stride of the sliding window for each dimension of input.

padding: A string from: “SAME”, “VALID”. The type of padding algorithm to use.

use_cudnn_on_gpu: An optional bool. Defaults to True.

name: A name for the operation (optional).

Returns:

A Tensor. Has the same type as input.

tf.nn.max_pool(value, ksize, strides, padding, name=None)

Performs the max pooling on the input.

Args:

value: A 4-D Tensor with shape [batch, height, width, channels] and type float32, float64, qint8, quint8, qint32.

ksize: A list of ints that has length >= 4. The size of the window for each dimension of the input tensor.

strides: A list of ints that has length >= 4. The stride of the sliding window for each dimension of the input tensor.

padding: A string, either ‘VALID’ or ‘SAME’. The padding algorithm.

name: Optional name for the operation.

Returns:

A Tensor with the same type as value. The max pooled output tensor.

初始化权重和偏置值矩阵，值是空的，需要后期训练。

def weight_variable(shape):

initial = tf.truncated_normal(shape,stddev=0.1)

return tf.Variable(initial)

def bias_variable(shape):

initial = tf.constant(0.1, shape = shape)

# print(tf.Variable(initial).eval())

return tf.Variable(initial)

#这是做了两次卷积和池化

h_conv1 = tf.nn.relu(conv2d(x_image, w_conv1) + b_conv1)

h_pool1 = max_pool_2x2(h_conv1)

h_conv2 = tf.nn.relu(conv2d(h_pool1, w_conv2) + b_conv2)

h_pool2 = max_pool_2x2(h_conv2)