Andrew Ng ML(1)——basic knowledge

2018-12-19  tmax

Introduction


Univariate (one-variable) linear regression (supervised learning)

Hypothesis: h_{\Theta}(x)=\Theta_0 +\Theta_1x
Parameters: \Theta_0, \Theta_1
cost function: J(\Theta_0,\Theta_1)=\frac{1}{2m}\sum_{i=1}^{m}(h_{\Theta}(x^{(i)})-y^{(i)})^2 (this is a squared-error function, also the most commonly used cost for regression problems)
goal: \displaystyle \min_{\Theta_0,\Theta_1} \ J(\Theta_0,\Theta_1)
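A minimal Python/NumPy sketch of this cost function (the toy data and variable names are illustrative, not from the course materials):

```python
import numpy as np

def compute_cost(theta0, theta1, x, y):
    """Squared-error cost J(theta0, theta1) = 1/(2m) * sum((h(x_i) - y_i)^2)."""
    m = len(y)                       # number of training examples
    h = theta0 + theta1 * x          # hypothesis evaluated on every example
    return np.sum((h - y) ** 2) / (2 * m)

# toy data lying exactly on y = 1 + 2x, so the cost at (1, 2) is zero
x = np.array([1.0, 2.0, 3.0])
y = np.array([3.0, 5.0, 7.0])
print(compute_cost(1.0, 2.0, x, y))  # 0.0
print(compute_cost(0.0, 0.0, x, y))  # larger than 0
```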

Simplify the hypothesis to h_{\Theta}(x)=\Theta_1x
\Downarrow

[Figure: each value of \Theta_1 corresponds to a different hypothesis]

Now take the full hypothesis h_{\Theta}(x)=\Theta_0+\Theta_1x
\Downarrow

[Figure: the cost function (function J) when the hypothesis has two parameters; right: a contour plot of the cost function]
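Such a contour plot can be reproduced by evaluating J over a grid of (\Theta_0, \Theta_1) values. A short sketch, with assumed toy data and arbitrary grid ranges:

```python
import numpy as np

# toy training data (assumed, for illustration only)
x = np.array([1.0, 2.0, 3.0])
y = np.array([3.0, 5.0, 7.0])
m = len(y)

theta0_vals = np.linspace(-10.0, 10.0, 100)
theta1_vals = np.linspace(-1.0, 4.0, 100)

# J[i, j] holds the cost at (theta0_vals[i], theta1_vals[j])
J = np.zeros((len(theta0_vals), len(theta1_vals)))
for i, t0 in enumerate(theta0_vals):
    for j, t1 in enumerate(theta1_vals):
        h = t0 + t1 * x
        J[i, j] = np.sum((h - y) ** 2) / (2 * m)

# J can now be handed to a plotting library's contour routine to draw the contour plot
```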

"Batch"Gradient descent("Batch"梯度下降) with one variable

Batch: every step of gradient descent uses the entire training set (J(\Theta_0,\Theta_1) contains the sum of the squared errors over all examples).
Have some function J(\Theta_0,\Theta_1,\dots,\Theta_n)
want \min J(\Theta_0,\Theta_1,\dots,\Theta_n)
Outline:
1. start with some \Theta_0,\Theta_1,\dots,\Theta_n (commonly they are all zeros)
2. keep changing \Theta_0,\Theta_1,\dots,\Theta_n to reduce J(\Theta_0,\Theta_1,\dots,\Theta_n) until we hopefully end up at a minimum

Gradient descent algorithm (P.S. := denotes assignment, = denotes comparison; the two parameters must be updated simultaneously)
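The update rule being described is the standard one from the course: repeat until convergence, for j = 0 and j = 1 simultaneously,

\Theta_j := \Theta_j - \alpha \frac{\partial}{\partial \Theta_j} J(\Theta_0,\Theta_1)

where \alpha is the learning rate.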
[Figure: with the simplified hypothesis, the meaning of the derivative term in the gradient descent formula]
[Figure: the effect of the value of \alpha on gradient descent (if \Theta has already reached a local minimum, the derivative term is 0, so the solution stays at that local minimum)]

Return to the full hypothesis h_{\Theta}(x)=\Theta_0+\Theta_1x
\Downarrow

[Figure: the cost function and the gradient descent algorithm when the hypothesis has two parameters; computing the derivative terms and substituting them back into the gradient descent algorithm above]
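Working those derivative terms out for the squared-error cost (the standard derivation, written out here for reference):

\frac{\partial}{\partial\Theta_0}J(\Theta_0,\Theta_1)=\frac{1}{m}\sum_{i=1}^{m}\left(h_{\Theta}(x^{(i)})-y^{(i)}\right)
\frac{\partial}{\partial\Theta_1}J(\Theta_0,\Theta_1)=\frac{1}{m}\sum_{i=1}^{m}\left(h_{\Theta}(x^{(i)})-y^{(i)}\right)x^{(i)}

Substituting them back gives the batch gradient descent updates for linear regression:

\Theta_0 := \Theta_0 - \alpha\frac{1}{m}\sum_{i=1}^{m}\left(h_{\Theta}(x^{(i)})-y^{(i)}\right)
\Theta_1 := \Theta_1 - \alpha\frac{1}{m}\sum_{i=1}^{m}\left(h_{\Theta}(x^{(i)})-y^{(i)}\right)x^{(i)}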
Finally, substituting the parameters \Theta_0, \Theta_1 obtained from gradient descent into h_{\Theta}(x) gives the best-fit linear function.
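As a concrete sketch, here is a minimal Python/NumPy implementation of batch gradient descent for the one-variable case (the learning rate, iteration count, and toy data are arbitrary choices for illustration, not values from the course):

```python
import numpy as np

def gradient_descent(x, y, alpha=0.05, num_iters=2000):
    """Batch gradient descent for h(x) = theta0 + theta1 * x."""
    m = len(y)
    theta0, theta1 = 0.0, 0.0            # start from zeros, as in the outline above
    for _ in range(num_iters):
        h = theta0 + theta1 * x          # predictions on the whole batch
        grad0 = np.sum(h - y) / m        # dJ/dtheta0
        grad1 = np.sum((h - y) * x) / m  # dJ/dtheta1
        theta0 -= alpha * grad0          # both gradients are computed first,
        theta1 -= alpha * grad1          # so the update is simultaneous
    return theta0, theta1

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.1, 4.9, 7.2, 8.8])       # roughly y = 1 + 2x
print(gradient_descent(x, y))            # approaches about (1.15, 1.94)
```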


Matrices and vectors (review)

Calculate all of the predicted prices at the same time (a single hypothesis)
\Downarrow
House sizes:
2104
1416
1534
852

hypothesis:
h_\Theta(x)=-40+0.25x

\begin{bmatrix} 1&2104\\ 1&1416\\ 1&1534\\ 1&852 \end{bmatrix}\times \begin{bmatrix} -40\\ 0.25 \end{bmatrix}= \begin{bmatrix} -40\times 1+2104\times 0.25\\ \vdots\\ \vdots\\ -40\times 1+852\times 0.25 \end{bmatrix}
(prediction = DataMatrix * parameters)
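The same computation in NumPy (a short sketch; the house sizes and parameters are the ones from the example above):

```python
import numpy as np

# design matrix: a column of ones (intercept term) next to the house sizes
X = np.array([[1, 2104],
              [1, 1416],
              [1, 1534],
              [1,  852]], dtype=float)

theta = np.array([-40.0, 0.25])   # parameters of h(x) = -40 + 0.25x

predictions = X @ theta           # prediction = DataMatrix * parameters
print(predictions)                # [486.  314.  343.5 173.]
```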

Multiple hypotheses
\Downarrow

[Figure: predictions for multiple hypotheses computed at once]
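When there are several competing hypotheses, their parameter vectors can be stacked as the columns of a matrix, so a single matrix-matrix product gives every prediction for every hypothesis at once. A sketch with three illustrative hypotheses (the parameter values are assumptions for the example, not taken from the original figure):

```python
import numpy as np

X = np.array([[1, 2104],
              [1, 1416],
              [1, 1534],
              [1,  852]], dtype=float)

# one column of parameters per hypothesis, e.g.
#   h1(x) = -40 + 0.25x,  h2(x) = 200 + 0.1x,  h3(x) = -150 + 0.4x
Theta = np.array([[-40.00, 200.0, -150.0],
                  [  0.25,   0.1,    0.4]])

predictions = X @ Theta   # entry (i, j) = prediction of hypothesis j for house i
print(predictions.shape)  # (4, 3): 4 houses x 3 hypotheses
```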