Logistic Regression: Derivation

2019-08-29  prolic

Logistic regression uses the sigmoid function as its hypothesis (prediction) function:
h_\theta(x) = g(\theta^T x), \quad g(z) = \frac{1}{1+e^{-z}}
h_\theta(x) = \frac{1}{1+e^{-\theta^Tx}}
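To make the definition concrete, here is a minimal, dependency-free sketch of g(z) and h_\theta(x). The function names `sigmoid` and `hypothesis` are illustrative choices, not from the original, and the sketch assumes x[0] is a constant 1 so that theta[0] acts as the bias term. Branching on the sign of z is a common trick to keep `math.exp` from overflowing for large |z|.

```python
import math
from typing import Sequence


def sigmoid(z: float) -> float:
    """g(z) = 1 / (1 + e^{-z}), branching on the sign of z so exp() never overflows."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)  # z < 0, so exp(z) lies in [0, 1)
    return ez / (1.0 + ez)


def hypothesis(theta: Sequence[float], x: Sequence[float]) -> float:
    """h_theta(x) = g(theta^T x); x[0] is assumed to be the constant 1 (bias term)."""
    return sigmoid(sum(t * x_j for t, x_j in zip(theta, x)))
```

With theta set to all zeros, `hypothesis` returns 0.5 for any input, which matches g(0) = 1/2.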
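For a binary label y in {0, 1}, h_\theta(x) is read as the probability that y = 1. The two cases y = 1 and y = 0 combine into a single expression:
P(y^{(i)}|x^{(i)};\theta) = (h_\theta(x^{(i)}))^{y^{(i)}} \cdot (1-h_\theta(x^{(i)}))^{1-y^{(i)}}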
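Assuming the m training samples are independent, the likelihood function is:
\begin{align} L(\theta) &= P(\overrightarrow{Y} | X; \theta) \\ &= \prod_{i=1}^m P(y^{(i)}|x^{(i)};\theta) \\ &= \prod_{i=1}^m (h_\theta(x^{(i)}))^{y^{(i)}}(1-h_\theta(x^{(i)}))^{1-y^{(i)}} \end{align}
Taking the logarithm turns the product into a sum, which is easier to work with. The log-likelihood is:
\begin{align} l(\theta) &= \log L(\theta) \\ &= \sum_{i=1}^m \log P(y^{(i)}|x^{(i)};\theta) \\ &= \sum_{i=1}^m \left( y^{(i)}\log h_\theta(x^{(i)}) + (1-y^{(i)})\log(1-h_\theta(x^{(i)})) \right) \end{align}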
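One practical reason to prefer the log form: a product of many probabilities underflows in floating point, while the sum of their logs stays well within range. The sketch below illustrates this on made-up toy data; `likelihood`, `log_likelihood`, and the data are assumptions for illustration only, and `log_likelihood` as written assumes every h is strictly between 0 and 1.

```python
import math
from typing import Sequence


def sigmoid(z: float) -> float:
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def hypothesis(theta: Sequence[float], x: Sequence[float]) -> float:
    return sigmoid(sum(t * x_j for t, x_j in zip(theta, x)))


def likelihood(theta, X, y) -> float:
    """L(theta): product over samples of h^y * (1 - h)^(1 - y)."""
    prod = 1.0
    for x_i, y_i in zip(X, y):
        h = hypothesis(theta, x_i)
        prod *= h if y_i == 1 else 1.0 - h
    return prod


def log_likelihood(theta, X, y) -> float:
    """l(theta): sum over samples of y*log(h) + (1 - y)*log(1 - h)."""
    total = 0.0
    for x_i, y_i in zip(X, y):
        h = hypothesis(theta, x_i)
        total += y_i * math.log(h) + (1 - y_i) * math.log(1.0 - h)
    return total


# Toy data (made up): 2000 samples; with theta = 0 every h is exactly 0.5.
X = [[1.0, float(i % 7)] for i in range(2000)]
y = [i % 2 for i in range(2000)]
theta = [0.0, 0.0]

print(likelihood(theta, X, y))      # 0.5**2000 underflows to 0.0
print(log_likelihood(theta, X, y))  # about -1386.29, i.e. 2000 * log(0.5)
```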
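The loss function is the negative log-likelihood averaged over the m samples, so maximizing l(\theta) is equivalent to minimizing J(\theta):
J(\theta) = -\frac{1}{m} l(\theta) = -\frac{1}{m}\sum_{i=1}^m \left( y^{(i)}\log h_\theta(x^{(i)}) + (1-y^{(i)})\log(1-h_\theta(x^{(i)})) \right)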
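As a rough illustration, J(\theta) can be computed directly from the formula. The name `cost` and the clipping constant `EPS` are assumptions added here, not part of the derivation; clipping is one common way to keep `log` finite when a prediction saturates at exactly 0 or 1.

```python
import math
from typing import Sequence

EPS = 1e-12  # hypothetical clipping constant, not part of the derivation


def sigmoid(z: float) -> float:
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def cost(theta: Sequence[float], X, y) -> float:
    """J(theta) = -(1/m) * sum(y*log(h) + (1 - y)*log(1 - h))."""
    m = len(X)
    total = 0.0
    for x_i, y_i in zip(X, y):
        h = sigmoid(sum(t * x_j for t, x_j in zip(theta, x_i)))
        h = min(max(h, EPS), 1.0 - EPS)  # keep both log() calls finite
        total += y_i * math.log(h) + (1 - y_i) * math.log(1.0 - h)
    return -total / m
```

At theta = 0 every prediction is 0.5, so the cost equals log 2 regardless of the data, which makes a handy sanity check.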
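Take the partial derivative of J with respect to \theta_j. The derivation uses the sigmoid identity g'(z) = g(z)(1-g(z)) and the fact that \frac{\partial}{\partial\theta_j}\theta^Tx^{(i)} = x_j^{(i)}: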
\begin{align} \frac{\partial}{\partial\theta_j}J(\theta) &= -\frac{1}{m}\sum_{i=1}^m\left( y^{(i)}\frac{1}{h_\theta(x^{(i)})} \frac{\partial}{\partial\theta_j}h_\theta(x^{(i)})-(1-y^{(i)})\frac{1}{1-h_\theta(x^{(i)})} \frac{\partial}{\partial\theta_j} h_\theta(x^{(i)}) \right) \\ &= -\frac{1}{m}\sum_{i=1}^m\left( y^{(i)}\frac{1}{g(\theta^Tx^{(i)})} - (1-y^{(i)})\frac{1}{1-g(\theta^Tx^{(i)})} \right) \frac{\partial}{\partial\theta_j}g(\theta^Tx^{(i)}) \\ &= -\frac{1}{m}\sum_{i=1}^m\left( y^{(i)}\frac{1}{g(\theta^Tx^{(i)})} - (1-y^{(i)})\frac{1}{1-g(\theta^Tx^{(i)})} \right) g(\theta^Tx^{(i)}) \left(1-g(\theta^Tx^{(i)})\right) \frac{\partial}{\partial\theta_j} \theta^Tx^{(i)} \\ &= -\frac{1}{m}\sum_{i=1}^m\left( y^{(i)} \left(1-g(\theta^Tx^{(i)})\right) - (1-y^{(i)})g(\theta^Tx^{(i)}) \right)x_{j}^{(i)} \\ &= -\frac{1}{m}\sum_{i=1}^m\left( y^{(i)} - g(\theta^Tx^{(i)}) \right)x_{j}^{(i)} \\ &= -\frac{1}{m}\sum_{i=1}^m\left( y^{(i)} - h_\theta(x^{(i)}) \right)x_{j}^{(i)} \\ &= \frac{1}{m}\sum_{i=1}^m\left( h_\theta(x^{(i)}) - y^{(i)} \right)x_{j}^{(i)} \end{align}
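A quick way to gain confidence in the final expression is to compare it against a central finite-difference approximation of J. The sketch below does that on a small made-up dataset; `gradient`, `numerical_gradient`, and the step size `eps` are illustrative assumptions.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def hypothesis(theta: Sequence[float], x: Sequence[float]) -> float:
    return sigmoid(sum(t * x_j for t, x_j in zip(theta, x)))


def cost(theta, X, y) -> float:
    """J(theta), assuming every h is strictly between 0 and 1."""
    total = 0.0
    for x_i, y_i in zip(X, y):
        h = hypothesis(theta, x_i)
        total += y_i * math.log(h) + (1 - y_i) * math.log(1.0 - h)
    return -total / len(X)


def gradient(theta, X, y) -> List[float]:
    """Analytic result: dJ/dtheta_j = (1/m) * sum((h - y) * x_j)."""
    grad = [0.0] * len(theta)
    for x_i, y_i in zip(X, y):
        err = hypothesis(theta, x_i) - y_i
        for j, x_ij in enumerate(x_i):
            grad[j] += err * x_ij
    return [g / len(X) for g in grad]


def numerical_gradient(theta, X, y, eps: float = 1e-6) -> List[float]:
    """Central difference: (J(theta + eps*e_j) - J(theta - eps*e_j)) / (2*eps)."""
    grad = []
    for j in range(len(theta)):
        plus, minus = list(theta), list(theta)
        plus[j] += eps
        minus[j] -= eps
        grad.append((cost(plus, X, y) - cost(minus, X, y)) / (2 * eps))
    return grad


# Toy data (made up); the first column is the constant bias feature.
X = [[1.0, 0.5], [1.0, 1.5], [1.0, -1.0], [1.0, 2.0]]
y = [0, 1, 0, 1]
theta = [0.1, -0.2]
print(gradient(theta, X, y))
print(numerical_gradient(theta, X, y))
```

The two printed vectors should agree to several decimal places; a large gap would point to a sign or indexing mistake in the derivation or the code.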
The gradient descent update, with learning rate \alpha, is:
\begin{align} \theta_j \mathrel{\mathop:} &= \theta_j - \alpha\frac{\partial}{\partial\theta_j}J(\theta), (j=0 \dots n) \\ \theta_j \mathrel{\mathop:} &= \theta_j - \alpha\frac{1}{m}\sum_{i=1}^m\left( h_\theta(x^{(i)}) - y^{(i)} \right) x_{j}^{(i)}, (j=0 \dots n) \\ \theta_j \mathrel{\mathop:} &= \theta_j - \alpha\sum_{i=1}^m\left( h_\theta(x^{(i)}) - y^{(i)} \right) x_{j}^{(i)}, (j=0 \dots n) \end{align}
In the last line the constant factor \frac{1}{m} has been absorbed into the learning rate \alpha, so the two forms differ only in the scale of \alpha.
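The update rule translates into a short batch gradient descent loop. This is a sketch under stated assumptions rather than a tuned implementation: the function name `gradient_descent`, the defaults `alpha=0.5` and `iterations=1000`, and the toy data are all made up for illustration. It uses the averaged form of the gradient (with the 1/m factor) and updates all \theta_j simultaneously.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def hypothesis(theta: Sequence[float], x: Sequence[float]) -> float:
    return sigmoid(sum(t * x_j for t, x_j in zip(theta, x)))


def gradient(theta, X, y) -> List[float]:
    """dJ/dtheta_j = (1/m) * sum((h - y) * x_j)."""
    grad = [0.0] * len(theta)
    for x_i, y_i in zip(X, y):
        err = hypothesis(theta, x_i) - y_i
        for j, x_ij in enumerate(x_i):
            grad[j] += err * x_ij
    return [g / len(X) for g in grad]


def gradient_descent(X, y, alpha: float = 0.5, iterations: int = 1000) -> List[float]:
    """Batch gradient descent starting from theta = 0."""
    theta = [0.0] * len(X[0])
    for _ in range(iterations):
        grad = gradient(theta, X, y)
        # Build a new list so every theta_j is updated from the same old theta.
        theta = [t - alpha * g for t, g in zip(theta, grad)]
    return theta


# Toy data (made up): label is 1 when the feature is positive.
X = [[1.0, -2.0], [1.0, -1.0], [1.0, -0.5], [1.0, 0.5], [1.0, 1.0], [1.0, 2.0]]
y = [0, 0, 0, 1, 1, 1]

theta = gradient_descent(X, y)
predictions = [1 if hypothesis(theta, x_i) >= 0.5 else 0 for x_i in X]
print(theta, predictions)
```

To use the summed form from the last line of the update instead, drop the division by the sample count in `gradient` and scale `alpha` down by the same factor.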
