Chapter 4: Neural Network Learning


Neural Network Learning

Loss Function

Mean Squared Error

E = \frac{1}{2}\sum_k(y_k-t_k)^2
Here y_k is the output of the neural network, t_k is the supervision (target) data, and k is the number of dimensions of the data.

import numpy as np

# Mean squared error implementation
def mean_squared_error(y,t):
    return 0.5*np.sum((y-t)**2)

t = [0,0,1,0,0,0,0,0,0,0]  # one-hot label: class 2 is the correct answer
y = [0.1,0.05,0.6,0.0,0.05,0.1,0.0,0.1,0.0,0.0]  # example network output (softmax probabilities)
mean_squared_error(np.array(y),np.array(t))
0.09750000000000003

Cross-Entropy Error

E = -\sum_k t_k \log y_k

# Cross-entropy error implementation
# y: 1*n network output, t: 1*n one-hot label
def cross_entropy_error(y,t):
    delta = 1e-7  # small constant so that log(0) never occurs
    return -np.sum(t*np.log(y+delta))
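
As a quick check, calling it on the same one-hot label t and output y used in the mean-squared-error example above gives roughly -log(0.6):

t = np.array([0,0,1,0,0,0,0,0,0,0])
y = np.array([0.1,0.05,0.6,0.0,0.05,0.1,0.0,0.1,0.0,0.0])
cross_entropy_error(y,t)
# approximately 0.51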

Mini-batch Learning

# Mini-batch version of the cross-entropy error (t given as one-hot labels)
def cross_entropy_error(y,t):
    # if a single sample is passed, reshape it into a batch of one
    if y.ndim==1:
        t = t.reshape(1,t.size)
        y = y.reshape(1,y.size)

    batch_size = y.shape[0]
    return -np.sum(t*np.log(y+1e-7))/batch_size
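
In practice a mini-batch is drawn at random from the training set. A minimal sketch, assuming MNIST data has already been loaded into x_train and t_train (variable names borrowed from the book's MNIST loader):

# randomly pick batch_size indices and take the corresponding samples and labels
train_size = x_train.shape[0]
batch_size = 10
batch_mask = np.random.choice(train_size, batch_size)
x_batch = x_train[batch_mask]
t_batch = t_train[batch_mask]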

Why Use a Loss Function?

If recognition accuracy were used as the indicator instead, it would stay constant for most small changes of the parameters and jump discontinuously where it does change, so its derivative is 0 almost everywhere and the parameters could not be updated. The loss function varies continuously with the parameters, which is why it is chosen as the indicator for learning.

Numerical Differentiation

Derivative

\frac{df(x)}{dx}=\lim_{h\to 0}\frac{f(x+h)-f(x)}{h}
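
In code this limit is approximated with a small h. A central difference, (f(x+h) - f(x-h)) / 2h, is preferred over the one-sided form in the definition because it reduces the approximation error; a minimal sketch:

# numerical derivative using the central difference
def numerical_diff(f, x):
    h = 1e-4  # 0.0001; a much smaller h would suffer from rounding error
    return (f(x+h) - f(x-h)) / (2*h)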

Partial Derivatives

\frac{\partial f}{\partial x_0},\frac{\partial f}{\partial x_1}
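
A partial derivative can be computed with numerical_diff by fixing all the other variables. For example, for f(x_0, x_1) = x_0^2 + x_1^2, the partial derivative with respect to x_0 at (3, 4):

# fix x1 = 4 and differentiate the resulting one-variable function at x0 = 3
def function_tmp1(x0):
    return x0*x0 + 4.0**2
numerical_diff(function_tmp1, 3.0)
# approximately 6.0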

Gradient

(\frac{\partial f}{\partial x_0},\frac{\partial f}{\partial x_1}): the vector assembled from the partial derivatives of all the variables is called the gradient.

# Gradient computation (works for multi-dimensional arrays x)
def numerical_gradient(f, x):
    h = 1e-4 # 0.0001
    grad = np.zeros_like(x)
    
    it = np.nditer(x, flags=['multi_index'], op_flags=['readwrite'])
    while not it.finished:
        idx = it.multi_index
        tmp_val = x[idx]
        x[idx] = float(tmp_val) + h
        fxh1 = f(x) # f(x+h)
        
        x[idx] = tmp_val - h 
        fxh2 = f(x) # f(x-h)
        grad[idx] = (fxh1 - fxh2) / (2*h)
        
        x[idx] = tmp_val # restore the original value
        it.iternext()   
        
    return grad
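
For example, for f(x_0, x_1) = x_0^2 + x_1^2 the gradient at (3, 4) is (6, 8):

def function_2(x):
    return np.sum(x**2)  # f(x0, x1) = x0^2 + x1^2

numerical_gradient(function_2, np.array([3.0, 4.0]))
# array([6., 8.])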

Gradient Method

x_0 = x_0 - \eta \frac{\partial f}{\partial x_0} \\ x_1 = x_1 - \eta \frac{\partial f}{\partial x_1}
\eta is called the learning rate; it determines how much is learned in a single step, that is, to what extent the parameters are updated.

# Gradient descent: repeatedly step along the negative gradient
# f: function to minimize, init_x: starting point, lr: learning rate, step_num: number of iterations
def gradient_descent(f,init_x,lr=0.01,step_num=100):
    x = init_x
    for i in range(step_num):
        grad = numerical_gradient(f,x)
        x -= lr*grad
    
    return x
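
For example, minimizing function_2 from the initial point (-3.0, 4.0) with a learning rate of 0.1 ends up very close to the true minimum at (0, 0):

init_x = np.array([-3.0, 4.0])
gradient_descent(function_2, init_x=init_x, lr=0.1, step_num=100)
# approximately array([0., 0.])  (each component on the order of 1e-10)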

Gradient of a Neural Network

The gradient of the loss function with respect to the weight parameters. For a 2×3 weight matrix W:
W= \begin{pmatrix} \omega_{11}&\omega_{12}&\omega_{13}\\ \omega_{21}&\omega_{22}&\omega_{23} \end{pmatrix} \\ \frac{\partial L}{\partial W}=\begin{pmatrix} \frac{\partial L}{\partial \omega_{11}}&\frac{\partial L}{\partial \omega_{12}}&\frac{\partial L}{\partial \omega_{13}}\\ \frac{\partial L}{\partial \omega_{21}}&\frac{\partial L}{\partial \omega_{22}}&\frac{\partial L}{\partial \omega_{23}} \end{pmatrix}

from sourcecode.common.functions import softmax,cross_entropy_error
from sourcecode.common.gradient import numerical_gradient
class simpleNet:
    def __init__(self):
        self.W = np.random.randn(2,3)  # initialize weights with a Gaussian distribution
    
    def predict(self,x):
        return np.dot(x,self.W)
    
    def loss(self,x,t):
        z = self.predict(x)
        y = softmax(z)
        loss = cross_entropy_error(y,t)
        
        return loss
net=simpleNet()
print(net.W)
x = np.array([0.6,0.9])
p = net.predict(x)
print(p)
print(np.argmax(p))
t = np.array([0,0,1])  # correct label (one-hot)
print(net.loss(x,t))
[[ 0.96028135 -1.10055385 -1.26426151]
 [ 0.4756395   1.3477234   0.45475418]]
[ 1.00424436  0.55261875 -0.34927815]
0
1.992699002936635
# Compute the gradient of the loss with respect to the weights.
# The argument W is a dummy: numerical_gradient perturbs net.W in place,
# and net.loss reads the perturbed weights through net.
def f(W):
    return net.loss(x,t)
dW = numerical_gradient(f,net.W)
print(dW)
[[ 0.31663563  0.20156786 -0.51820349]
 [ 0.47495345  0.30235179 -0.77730524]]
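
With dW computed, one gradient-descent step on the weights is simply an update along the negative gradient; the learning rate 0.1 below is an arbitrary choice for illustration. (f can also be written more compactly as f = lambda w: net.loss(x,t).)

lr = 0.1  # hypothetical learning rate, chosen only for illustration
net.W -= lr * dW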

Implementing the Learning Algorithm

  1. Premise
    The neural network has weights and biases; "learning" means adjusting them so that the network fits the training data.
  2. Mini-batch
    Randomly select a portion of the training data.
  3. Compute the gradient
    Compute the gradient of the loss with respect to each weight parameter.
  4. Update the parameters
    Update the weight parameters by a small amount along the gradient direction (toward lower loss).
  5. Repeat steps 2, 3 and 4 (a sketch of this loop follows the list).
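
A minimal sketch of this loop, assuming the book's TwoLayerNet class (with a params dictionary and a numerical_gradient(x, t) method) and MNIST data already loaded into x_train and t_train:

# mini-batch SGD training loop (TwoLayerNet, x_train, t_train assumed from the book's code)
network = TwoLayerNet(input_size=784, hidden_size=50, output_size=10)

iters_num = 10000
train_size = x_train.shape[0]
batch_size = 100
learning_rate = 0.1

for i in range(iters_num):
    # 2. mini-batch: randomly select batch_size samples
    batch_mask = np.random.choice(train_size, batch_size)
    x_batch = x_train[batch_mask]
    t_batch = t_train[batch_mask]

    # 3. compute the gradient of the loss for each parameter
    grad = network.numerical_gradient(x_batch, t_batch)

    # 4. update each parameter by a small step along the negative gradient
    for key in ('W1', 'b1', 'W2', 'b2'):
        network.params[key] -= learning_rate * grad[key]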