UFLDL New Tutorial and Programming Exercises (1): Linear Regression

2019-08-06, by 赖子啊

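UFLDL is one of the earlier introductory deep learning tutorials, written by Andrew Ng's team. Its rhythm of theory followed by exercises works very well: every time, I want to finish the theory quickly and get to the coding. The starter code already lays out the whole framework with detailed comments, so we only need to write a small amount of core code, which makes it quick to get started.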

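The first section is Linear Regression.
Linear regression, as the name suggests, makes predictions with a linear model. We use the training data $\left\{\left(x^{(1)}, y^{(1)}\right), \ldots,\left(x^{(m)}, y^{(m)}\right)\right\}$ to fit a linear model, that is, a linear function:
$h_{\theta}(x)=\sum_{j} \theta_{j} x_{j}=\theta^{\top} x$
so that for every training example we have $y^{(i)} \approx h_{\theta}\left(x^{(i)}\right)$.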
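What we need to do now is:

1. Find an objective function to optimize, also called the loss function or cost function. It measures how far the predictions deviate from the true values; training against known target values like this is the hallmark of supervised learning.
2. Once we have the objective function, find an optimization method that drives the loss down. The most common choice here is gradient descent.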

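Following these two steps, our loss function is the sum of squared errors below, which resembles a squared L2 norm of the prediction errors: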
$J(\theta)=\frac{1}{2} \sum_{i}\left(h_{\theta}\left(x^{(i)}\right)-y^{(i)}\right)^{2}=\frac{1}{2} \sum_{i}\left(\theta^{\top} x^{(i)}-y^{(i)}\right)^{2}$
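As a quick sanity check of this formula, here is a minimal Octave/MATLAB sketch that evaluates $J(\theta)$ on a tiny made-up data set. The values of X, y, and theta are illustrative assumptions only; examples are stored as columns of X, which matches the convention in the exercise code.

```matlab
% Toy data (assumed for illustration): 2 features, 3 examples stored as columns.
X = [1 1 1;      % first feature is a constant 1 (intercept term)
     1 2 3];     % second feature
y = [2 4 6];     % targets, one per example
theta = [0; 1];  % candidate parameters

h = theta' * X;              % predictions h_theta(x^(i)) for every example
J = 0.5 * sum((h - y).^2);   % half the sum of squared errors

fprintf('J = %g\n', J);      % prints J = 7
```

The predictions are [1 2 3], the errors are [-1 -2 -3], and half of 1 + 4 + 9 gives 7.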
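Gradient descent computes the gradient $\nabla_{\theta} J(\theta)$ and plugs it into the update rule $\theta := \theta-\alpha \nabla_{\theta} J(\theta)$, where $\alpha>0$ is the step size, so that the loss keeps decreasing as long as the step size is small enough. The key step here is computing the gradient, and the tutorial already gives the formula: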
$\nabla_{\theta} J(\theta)=\left[\begin{array}{c}{\frac{\partial J(\theta)}{\partial \theta_{1}}} \\ {\frac{\partial J(\theta)}{\partial \theta_{2}}} \\ {\vdots} \\ {\frac{\partial J(\theta)}{\partial \theta_{n}}}\end{array}\right]$
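where the partial derivative with respect to each $\theta_{j}$ follows from applying the chain rule to every squared term in $J(\theta)$: $\frac{\partial J(\theta)}{\partial \theta_{j}}=\sum_{i} x_{j}^{(i)}\left(h_{\theta}\left(x^{(i)}\right)-y^{(i)}\right)$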
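To see the gradient formula at work, the following is a minimal Octave/MATLAB sketch of plain gradient descent using the update $\theta := \theta-\alpha \nabla_{\theta} J(\theta)$. The toy data, the step size alpha, and the iteration count are all assumptions chosen for illustration, and the gradient is written in matrix form only to keep the sketch short. It is not the exercise's own training code.

```matlab
% Toy data (assumed): targets generated exactly by y = 1 + 2*x.
X = [1 1 1 1;          % intercept feature
     0 1 2 3];         % single input feature
y = [1 3 5 7];
theta = zeros(2, 1);   % start from all zeros
alpha = 0.05;          % step size (assumed; too large a value diverges)

for iter = 1:2000
    residual = theta' * X - y;    % h_theta(x^(i)) - y^(i), a 1-by-m row
    g = X * residual';            % g(j) = sum_i X(j,i) * residual(i)
    theta = theta - alpha * g;    % gradient descent update
end

J = 0.5 * sum((theta' * X - y).^2);
fprintf('theta = [%.4f, %.4f], J = %.3g\n', theta(1), theta(2), J);
```

On this data the parameters should approach [1; 2], the coefficients used to generate y, and the loss should approach zero.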

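For the exercise, all we need to do in linear_regression.m is assign the objective value to $f$ and the gradient to $g$. In the main script ex1_linreg.m, comment out the corresponding vectorized version (a vectorized implementation comes later; for now everything is done with loops), and the script is ready to run. Here is my linear_regression.m: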

function [f,g] = linear_regression(theta, X,y)
  %
  % Arguments:
  %   theta - A vector containing the parameter values to optimize.14 rows,1 column 
  %   X - The examples stored in a matrix.
  %       X(i,j) is the i'th coordinate of the j'th example.
  %   y - The target value for each example.  y(j) is the target for example j.
  %
  
  m=size(X,2);
  n=size(X,1);

  f=0;
  g=zeros(size(theta));

  %
  % TODO:  Compute the linear regression objective by looping over the examples in X.
  %        Store the objective function value in 'f'.
  %
  % TODO:  Compute the gradient of the objective with respect to theta by looping over
  %        the examples in X and adding up the gradient for each example.  Store the
  %        computed gradient in 'g'.
  
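%%% YOUR CODE HERE %%%
% Objective: accumulate half the squared error over all m examples.
for i=1:m
    temp = 0;                              % prediction theta' * x^(i)
    for j=1:n
        temp = temp + theta(j) * X(j,i);
    end
    f = f + 0.5 * (temp - y(i))^2;
end

% Gradient: g(j) = sum_i X(j,i) * (h(x^(i)) - y(i)).
% The prediction is recomputed for every j, which is simple but redundant.
for j=1:n
    for i=1:m
        temp = 0;
        for k=1:n
            temp = temp + theta(k) * X(k,i);
        end
        g(j) = g(j) + X(j,i) * (temp - y(i));
    end
end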
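One way to gain confidence in a hand-written gradient is to compare it with a finite-difference estimate. The sketch below is a minimal Octave/MATLAB example under assumed toy values; the anonymous functions J and gradJ are stand-ins written for this illustration and are not part of the exercise code.

```matlab
% Toy problem (assumed values) with the same layout as the exercise:
% X is n-by-m (examples as columns), y is 1-by-m, theta is n-by-1.
X = [1 1 1 1; 0.5 1.5 -1 2; 2 0 1 -0.5];
y = [1 2 0 3];
theta = [0.1; -0.2; 0.3];

% Objective and analytic gradient as anonymous functions (illustrative stand-ins).
J     = @(t) 0.5 * sum((t' * X - y).^2);
gradJ = @(t) X * (t' * X - y)';

% Central differences: perturb one coordinate of theta at a time.
epsilon = 1e-4;
g_num = zeros(size(theta));
for j = 1:numel(theta)
    e = zeros(size(theta));
    e(j) = epsilon;
    g_num(j) = (J(theta + e) - J(theta - e)) / (2 * epsilon);
end

max_diff = max(abs(g_num - gradJ(theta)));
fprintf('largest difference: %g\n', max_diff);
```

To check your own implementation instead, call [f, g] = linear_regression(theta, X, y) and compare the returned g with g_num; a large difference usually points to a bug in the gradient loop.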

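Here are my training results:

[Figure: linear regression (non-vectorized)]

[Figure: linear regression (non-vectorized), plot]
This matches the tutorial: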
(Yours may look slightly different depending on the random choice of training and testing sets.) Typical values for the RMS training and testing error are between 4.5 and 5.
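For reference, an RMS error of the kind quoted above is the square root of the mean squared difference between predictions and targets. Here is a minimal Octave/MATLAB sketch with assumed toy values; in the exercise, theta would be the fitted parameters and X, y the training or testing set.

```matlab
% Assumed toy values, for illustration only.
X = [1 1 1 1;            % intercept feature
     0 1 2 3];           % single input feature
y = [1.1 2.9 5.2 6.8];   % targets
theta = [1; 2];          % parameters assumed to be already fitted

predicted = theta' * X;                       % model predictions: [1 3 5 7]
rms_error = sqrt(mean((predicted - y).^2));   % root of the mean squared error
fprintf('RMS error: %.4f\n', rms_error);      % prints RMS error: 0.1581
```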

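If I have misunderstood anything, please point it out, and if you have better ideas, feel free to discuss them in the comments below!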
