001-Linear Regression

2021-01-15  不懂球的2大业

1. Basic principles

1.1 Concepts:

Least squares method

1.2 Parameter learning methods:

1.2.1 Least squares:

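Because both the label y and the model output in linear regression are continuous real values, the squared loss function is a natural fit. Following the empirical risk minimization principle, the empirical risk on the training set D is defined as: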

\begin{equation}\begin{split} R(w) &= \sum_{i=1}^{N} L(y^{(i)},f(x^{(i)};w)) \\ &= \frac {1}{2} \sum_{i=1}^{N}(y^{(i)}-w^{T}x^{(i)})^{2} \\ &= \frac{1}{2}||y-x^{T}w ||^{2} \end{split}\end{equation}
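where y = [y^{(1)},...,y^{(N)}]^{T} \in R^{N} is the column vector of the true labels of all samples, and x \in R^{(D+1) \times N} is the matrix whose columns are the input features x^{(1)},...,x^{(N)} of all samples, each extended with a constant 1 for the bias: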
\begin{pmatrix} x_{1}^{(1)}&x_{1}^{(2)}&\cdots & x_{1}^{(N)}\\ \vdots&\vdots&\ddots&\vdots\\ x_{D}^{(1)}&x_{D}^{(2)}&\cdots & x_{D}^{(N)}\\ 1&1&\cdots &1\\ \end{pmatrix}
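As a minimal sketch of these definitions, the NumPy snippet below builds the augmented matrix x with a final row of ones and checks that the per-sample sum and the vectorized norm give the same empirical risk. The sizes N and D, the random data, and all variable names are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N, D = 5, 2                                  # illustrative sizes (assumed)
features = rng.normal(size=(D, N))           # one column per sample
y = rng.normal(size=N)

# Append a row of ones so that the last entry of w acts as the bias.
x = np.vstack([features, np.ones((1, N))])   # shape (D+1, N)
w = rng.normal(size=D + 1)

# Empirical risk as a sum over the samples ...
risk_sum = 0.5 * sum((y[i] - w @ x[:, i]) ** 2 for i in range(N))
# ... and in the vectorized form 1/2 * ||y - x^T w||^2.
risk_vec = 0.5 * np.linalg.norm(y - x.T @ w) ** 2

print(x.shape)                               # (3, 5)
print(np.isclose(risk_sum, risk_vec))        # True
```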
The risk function R(w) is convex in w, and its partial derivative with respect to w (a result of shape (D+1) \times 1) is:
\begin{equation}\begin{split} \frac{\partial R(w)}{\partial w} &= \frac {1}{2} \frac {\partial || y - x^{T}w||^{2}}{\partial w} \\ &= -x(y-x^{T}w) \end{split}\end{equation}
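One way to gain confidence in this gradient is to compare it with central finite differences of R(w). The sketch below does that on random data; the helper names risk and risk_grad, the step eps, and the data are hypothetical and not part of the original text.

```python
import numpy as np

def risk(w, x, y):
    """R(w) = 1/2 * ||y - x^T w||^2, with x of shape (D+1, N)."""
    return 0.5 * np.sum((y - x.T @ w) ** 2)

def risk_grad(w, x, y):
    """Analytic gradient -x (y - x^T w), of shape (D+1,)."""
    return -x @ (y - x.T @ w)

rng = np.random.default_rng(1)
x = np.vstack([rng.normal(size=(3, 8)), np.ones((1, 8))])  # D = 3, N = 8
y = rng.normal(size=8)
w = rng.normal(size=4)

# Central finite differences, one coordinate direction at a time.
eps = 1e-6
numeric = np.array([
    (risk(w + eps * e, x, y) - risk(w - eps * e, x, y)) / (2 * eps)
    for e in np.eye(4)
])
max_diff = np.max(np.abs(numeric - risk_grad(w, x, y)))
print(max_diff < 1e-5)
```

Because R(w) is quadratic, the central difference agrees with the analytic gradient up to floating-point rounding.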
Setting the derivative to zero, i.e. \frac {\partial}{\partial w} R(w) = 0, gives the optimal parameters:
\begin{equation}\begin{split} w^{\ast} = (xx^T)^{-1}xy \end{split}\end{equation}
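The closed-form solution can be sketched in a few lines of NumPy. The example assumes noise-free synthetic data generated from a made-up weight vector w_true, so that the recovered w_star can be compared against it; it solves the linear system (x x^T) w = x y instead of forming the inverse explicitly, which is a common numerical choice rather than something the derivation requires.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 50
w_true = np.array([2.0, -1.0, 0.5])          # assumed weights; last entry is the bias
x = np.vstack([rng.normal(size=(2, N)), np.ones((1, N))])   # shape (D+1, N)
y = x.T @ w_true                             # noise-free labels

# w* = (x x^T)^{-1} x y, obtained by solving (x x^T) w = x y.
w_star = np.linalg.solve(x @ x.T, x @ y)
print(np.allclose(w_star, w_true))           # True
```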

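1.2.2 Gradient descent:

The least squares solution requires xx^{T} \in R^{(D+1) \times (D+1)} to be invertible, i.e. xx^{T} must have full rank. When xx^{T} is not invertible, the parameters can be estimated with gradient descent instead. Initialize w = 0, then iterate with the following update, where \alpha is the learning rate: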
\begin{equation}\begin{split} w \leftarrow w+\alpha x(y-x^{T}w) \end{split}\end{equation}
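The update rule can be sketched directly in the notation of this section. In the example below the same feature is stacked twice on purpose, so x x^T is singular and the closed-form inverse does not exist, yet the iteration still drives the residual to zero. The function name gradient_descent, the data, and the values of alpha and n_iters are assumptions picked for this toy problem.

```python
import numpy as np

def gradient_descent(x, y, alpha, n_iters):
    """Iterate w <- w + alpha * x (y - x^T w), starting from w = 0."""
    w = np.zeros(x.shape[0])
    for _ in range(n_iters):
        w = w + alpha * x @ (y - x.T @ w)
    return w

rng = np.random.default_rng(3)
N = 40
feature = rng.normal(size=N)
# Duplicate the feature so that x x^T has rank 2 instead of 3.
x = np.vstack([feature, feature, np.ones(N)])
y = 3.0 * feature + 1.0

w = gradient_descent(x, y, alpha=0.005, n_iters=5000)
residual = np.max(np.abs(y - x.T @ w))
print(np.linalg.matrix_rank(x @ x.T))   # 2
print(residual < 1e-6)                  # True
```

Since the two feature rows are identical and w starts at zero, the two corresponding weights stay equal and share the slope between them.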

2. Implementation

2.1 Least squares

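import numpy as np


class LinearRegression:
    """Linear regression on 1-D inputs, solved in closed form with the pseudo-inverse."""

    def __init__(self):
        self.basis_func = None
        self.phi0 = None
        self.phi1 = None
        self.phi = None
        self.w = None

    def identity_basis(self, x):
        # Turn a 1-D array of shape (N,) into a column of shape (N, 1).
        ret = np.expand_dims(x, axis=1)
        return ret

    def fit(self, x_train, y_train):
        self.basis_func = self.identity_basis
        # Column of ones for the bias term, shape (N, 1).
        self.phi0 = np.expand_dims(np.ones_like(x_train), axis=1)
        self.phi1 = self.basis_func(x_train)
        # Design matrix of shape (N, 2); it plays the role of x^T in section 1.2.1.
        self.phi = np.concatenate([self.phi0, self.phi1], axis=1)
        # Least squares solution w = pinv(phi) y.
        self.w = np.dot(np.linalg.pinv(self.phi), y_train)

    def predict(self, x):
        phi0 = np.expand_dims(np.ones_like(x), axis=1)
        phi1 = self.basis_func(x)
        phi = np.concatenate([phi0, phi1], axis=1)
        y = np.dot(phi, self.w)
        return y

    def evaluate(self, y_predict, y_true):
        # Root mean squared error between predictions and true labels.
        rmse = np.sqrt(np.mean(np.abs(y_predict - y_true) ** 2))
        return rmse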
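The fit method relies on np.linalg.pinv rather than on the formula w^{\ast} = (xx^T)^{-1}xy directly. As a hedged sanity check, the sketch below builds the same kind of design matrix (ones column first, then the 1-D input) outside the class and compares the pseudo-inverse solution with the normal equations and with np.linalg.lstsq. The synthetic data and the variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
x_train = rng.uniform(-1.0, 1.0, size=30)    # 1-D inputs
y_train = 2.0 * x_train + 0.5 + 0.1 * rng.normal(size=30)

# Design matrix with a leading column of ones: shape (N, 2).
phi = np.column_stack([np.ones_like(x_train), x_train])

w_pinv = np.linalg.pinv(phi) @ y_train
w_normal = np.linalg.solve(phi.T @ phi, phi.T @ y_train)
w_lstsq, *_ = np.linalg.lstsq(phi, y_train, rcond=None)

print(np.allclose(w_pinv, w_normal))   # True
print(np.allclose(w_pinv, w_lstsq))    # True
```

When phi has full column rank the three results coincide; the pseudo-inverse additionally returns the minimum-norm solution when it does not.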

2.2 Gradient descent

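import numpy as np


class LinearRegression:
    """Linear regression on 1-D inputs, fitted with batch gradient descent."""

    def __init__(self):
        self.basis_func = None
        self.phi0 = None
        self.phi1 = None
        self.phi = None
        self.w = None

    def identity_basis(self, x):
        # Turn a 1-D array of shape (N,) into a column of shape (N, 1).
        ret = np.expand_dims(x, axis=1)
        return ret

    def derivation(self, theta, phi, y):
        # Gradient of the mean squared error (1/N) * ||phi theta - y||^2.
        # This is the gradient from section 1.2.1 rescaled by 2/N, with phi in place of x^T.
        return phi.T.dot(phi.dot(theta) - y) * 2.0 / len(phi)

    def gradient(self, phi, y, initial_theta, eta=0.0001, n_iters=10000):
        # Plain gradient descent with a fixed step size eta.
        w = initial_theta
        for i in range(n_iters):
            grad = self.derivation(w, phi, y)
            w = w - eta * grad
        return w

    def fit(self, x_train, y_train):
        self.basis_func = self.identity_basis
        # Column of ones for the bias term, shape (N, 1).
        self.phi0 = np.expand_dims(np.ones_like(x_train), axis=1)
        self.phi1 = self.basis_func(x_train)
        self.phi = np.concatenate([self.phi0, self.phi1], axis=1)
        # Start from w = 0, as in section 1.2.2.
        initial_theta = np.zeros(self.phi.shape[1])
        self.w = self.gradient(self.phi, y_train, initial_theta)

    def predict(self, x):
        phi0 = np.expand_dims(np.ones_like(x), axis=1)
        phi1 = self.basis_func(x)
        phi = np.concatenate([phi0, phi1], axis=1)
        y = np.dot(phi, self.w)
        return y

    def evaluate(self, y_predict, y_true):
        # Root mean squared error between predictions and true labels.
        rmse = np.sqrt(np.mean(np.abs(y_predict - y_true) ** 2))
        return rmse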
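Whether a fixed step size such as eta=0.0001 converges within 10000 iterations depends on the scale of the data. As a rough sketch, and not part of the original implementation, the snippet below picks the step from the largest eigenvalue of the Hessian (2/N) phi^T phi of the mean squared error, for which gradient descent on this quadratic is stable whenever eta is below 2 divided by that eigenvalue. The data, the iteration count, and the variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
x_train = rng.uniform(0.0, 10.0, size=100)
y_train = 1.5 * x_train - 4.0
phi = np.column_stack([np.ones_like(x_train), x_train])

# Hessian of the loss whose gradient is (2/N) phi^T (phi w - y).
hessian = 2.0 / len(phi) * phi.T @ phi
lam_max = np.linalg.eigvalsh(hessian).max()
eta = 1.0 / lam_max          # any eta below 2 / lam_max keeps the iteration stable

w = np.zeros(phi.shape[1])
for _ in range(20000):
    grad = 2.0 / len(phi) * phi.T @ (phi @ w - y_train)
    w = w - eta * grad

# Reference solution from the pseudo-inverse, as in section 2.1.
w_ref = np.linalg.pinv(phi) @ y_train
print(np.allclose(w, w_ref, atol=1e-6))   # True
```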

References:
1. Xipeng Qiu (邱锡鹏), Neural Networks and Deep Learning (神经网络与深度学习), China Machine Press, https://nndl.github.io/, 2020.
2. https://www.cnblogs.com/cxq1126/p/13293262.html
