Omics Academy

Note: Introduction to Machine Learning

2020-11-11  OmicsAcademy

Recommended book: Mathematics for Machine Learning (free online): https://mml-book.github.io/


Simple linear regression

Simple linear regression fits a straight line f(x)=\beta_{0}+\beta_{1} x to the data by minimizing the mean squared error.

Loss function:
\mathcal{L}=\frac{1}{n} \sum_{i=1}^{n}\left(y_{i}-f\left(x_{i}\right)\right)^{2}=\frac{1}{n} \sum_{i=1}^{n}\left(y_{i}-\beta_{0}-\beta_{1} x_{i}\right)^{2}
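As a minimal sketch of this loss in Python with NumPy (the data values here are made up for illustration, not taken from the lecture):

import numpy as np

# Hypothetical toy data, only to illustrate the loss computation.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 8.1])

def mse_loss(beta0, beta1, x, y):
    """Mean squared error of the line f(x) = beta0 + beta1 * x."""
    residuals = y - (beta0 + beta1 * x)
    return np.mean(residuals ** 2)

print(mse_loss(0.0, 2.0, x, y))  # loss for one candidate (beta0, beta1)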

Baby linear regression

Consider slope only model:
f(x)=\beta x

The loss:
\mathcal{L}=\frac{1}{n} \sum_{i=1}^{n}\left(y_{i}-f\left(x_{i}\right)\right)^{2}=\frac{1}{n} \sum_{i=1}^{n}\left(y_{i}-\beta x_{i}\right)^{2}


How do we find \beta?

Baby gradient descent


Start from an initial guess; based on the derivative, we can tell which direction to move.

Update rule:
\beta \leftarrow \beta-\eta \frac{d \mathcal{L}(\beta)}{d \beta}

Here \eta is the learning rate.

It is tricky to find the global minimum of a non-convex function.

If the learning rate is too large, the updates can overshoot the minimum and diverge.

Baby gradient descent cont.

We need to compute the derivative of the loss:
\mathcal{L}(\beta)=\frac{1}{n} \sum_{i}\left(y_{i}-\beta x_{i}\right)^{2}

We get:

\mathcal{L}^{\prime}(\beta)=\frac{1}{n} \sum_{i} 2\left(y_{i}-\beta x_{i}\right)\left(-x_{i}\right)=-\frac{2}{n} \sum_{i} x_{i}\left(y_{i}-\beta x_{i}\right)
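Putting the update rule and this derivative together, a minimal sketch of baby gradient descent for the slope-only model (assuming the same made-up toy data as above and a fixed number of iterations):

import numpy as np

# Toy data for the slope-only model f(x) = beta * x (values are made up).
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 8.1])

beta = 0.0   # initial guess
eta = 0.01   # learning rate
for _ in range(1000):
    grad = -2.0 / len(x) * np.sum(x * (y - beta * x))  # dL/dbeta from above
    beta = beta - eta * grad                           # update rule
print(beta)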

Baby analytical solution

At the minimum, the derivative vanishes, \mathcal{L}^{\prime}(\hat{\beta})=0, so:

\sum_{i} x_{i} y_{i}-\hat{\beta} \sum_{i} x_{i}^{2}=0

We obtain:

\hat{\beta} =\frac{\sum_{i} x_{i} y_{i}}{\sum_{i} x_{i}^2}
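With the same toy data, this closed-form estimate can be computed directly; gradient descent should converge to the same value:

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])  # same made-up toy data as above
y = np.array([2.1, 3.9, 6.2, 8.1])

beta_hat = np.sum(x * y) / np.sum(x ** 2)  # closed-form slope
print(beta_hat)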

Back to simple linear regression

\mathcal{L}\left(\beta_{0}, \beta_{1}\right)=\frac{1}{n} \sum_{i=1}^{n}\left(y_{i}-\beta_{0}-\beta_{1} x_{i}\right)^{2}

Introducing partial derivatives

To take the partial derivative with respect to \beta_{0}, you treat \beta_{1} as fixed (and vice versa).

Update rules for each parameter:

\begin{array}{l} \beta_{0} \leftarrow \beta_{0}-\eta \frac{\partial \mathcal{L}}{\partial \beta_{0}} \\ \beta_{1} \leftarrow \beta_{1}-\eta \frac{\partial \mathcal{L}}{\partial \beta_{1}} \end{array}

In vector form:

\vec{\beta} \leftarrow \vec{\beta}-\eta \nabla \mathcal{L}

Compute the gradient

\mathcal{L}=\frac{1}{n} \sum_{i=1}^{n}\left(y_{i}-\beta_{0}-\beta_{1} x_{i}\right)^{2}

We need partial derivatives:

\begin{aligned} \frac{\partial \mathcal{L}}{\partial \beta_{0}} &=-\frac{2}{n} \sum\left(y_{i}-\beta_{0}-\beta_{1} x_{i}\right) \\ \frac{\partial \mathcal{L}}{\partial \beta_{1}} &=-\frac{2}{n} \sum\left(y_{i}-\beta_{0}-\beta_{1} x_{i}\right) x_{i} \end{aligned}
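Combining the vector update rule with these partial derivatives, a minimal sketch of gradient descent for both parameters (again using made-up toy data) might look like:

import numpy as np

# Toy data (made up); beta = [beta0, beta1].
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 8.1])

beta = np.zeros(2)
eta = 0.05
for _ in range(5000):
    residuals = y - beta[0] - beta[1] * x
    grad = np.array([
        -2.0 / len(x) * np.sum(residuals),      # dL/dbeta0
        -2.0 / len(x) * np.sum(residuals * x),  # dL/dbeta1
    ])
    beta = beta - eta * grad  # vector update: beta <- beta - eta * grad L
print(beta)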

Reference

https://www.youtube.com/watch?v=lWGdFeMsjzg&feature=youtu.be
