Summary of Loss Functions

2019-04-04 LuDon

Loss functions

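A loss function generally consists of two parts: an error term and a regularization term.
$J(w) = \sum_i L(f_i(w)) + \lambda R(w)$
Common choices for the error term include:

0-1 loss
Hinge loss (SVM)
Log loss (logistic regression, cross-entropy)
Square loss (linear regression)
Exponential loss (boosting)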
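
To make the two-part structure concrete, here is a minimal sketch of $J(w)$. The choices below are assumptions made only for illustration: the error term is the squared error, $R(w)$ is the squared L2 norm, and the model is linear. The name `objective` is hypothetical.

```python
def objective(w, xs, ys, lam):
    """J(w) = sum_i L(f_i(w)) + lam * R(w).

    Assumed for illustration: L is the squared error, R(w) is the
    squared L2 norm, and the prediction is the dot product of w and x.
    """
    def predict(x):
        return sum(wj * xj for wj, xj in zip(w, x))

    error_term = sum((predict(x) - y) ** 2 for x, y in zip(xs, ys))
    reg_term = lam * sum(wj ** 2 for wj in w)
    return error_term + reg_term


xs = [[1.0, 0.0], [0.0, 1.0]]
ys = [1.0, 2.0]
# w fits the data exactly, so only the regularization penalty remains.
print(objective([1.0, 2.0], xs, ys, lam=0.5))  # 2.5
```

A larger `lam` penalizes large weights more strongly, trading some fit on the error term for a simpler model.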

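1. 0-1 loss

Counts the number of misclassifications. Here the argument is the signed margin (expected label times raw output), so a negative value means the sample is misclassified.
$L(f)=\begin{cases} 0, \quad f \geq 0 \\ 1, \quad f < 0 \end{cases}$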
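
A minimal sketch of counting misclassifications with the 0-1 loss. It assumes labels in {-1, +1} and that the argument of the loss is the margin (label times raw score); the function names are hypothetical.

```python
def zero_one_loss(f):
    """Return 0 when f >= 0 and 1 otherwise."""
    return 0 if f >= 0 else 1


def count_errors(scores, labels):
    # Assumption: labels are -1 or +1, and the margin is label * score.
    return sum(zero_one_loss(t * s) for s, t in zip(scores, labels))


# The second and third samples have a negative margin.
print(count_errors([0.8, -0.3, 1.2, -2.0], [1, 1, -1, -1]))  # 2
```

Because this loss is piecewise constant, it gives no useful gradient, which is one reason the smoother losses in the following sections are used for training.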

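2. Hinge loss

Commonly used for maximum-margin classification.
$L(f) = \max(0, 1 - t \cdot f)$
$f$ is the raw output of the classifier, not the predicted class label. $t$ is the expected class label ($\pm 1$).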

def hinge(pred_label, actual_label):
    # pred_label is the raw classifier output f (not a thresholded label);
    # actual_label is the expected label t, either -1 or +1.
    return max(0, 1 - pred_label * actual_label)
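
As a small usage sketch, the hinge loss can be averaged over a dataset. The helper names `hinge_loss` and `mean_hinge` are hypothetical, and labels are assumed to be -1 or +1.

```python
def hinge_loss(score, target):
    """max(0, 1 - t * f) for one raw score f and one label t."""
    return max(0.0, 1.0 - target * score)


def mean_hinge(scores, targets):
    return sum(hinge_loss(s, t) for s, t in zip(scores, targets)) / len(scores)


print(hinge_loss(2.0, 1))   # 0.0: correct, margin at least 1
print(hinge_loss(0.5, 1))   # 0.5: correct, but inside the margin
print(hinge_loss(0.5, -1))  # 1.5: wrong side of the boundary
```

Note that a correctly classified sample is still penalized when its margin is below 1, which is what pushes the classifier toward a large margin.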

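3. Log loss

Let $P(y=1|x,\theta) = h_{\theta}(x)$ and $P(y=0|x,\theta) = 1 - h_{\theta}(x)$, so that $P(y|x,\theta) = h_{\theta}(x)^y(1 - h_{\theta}(x))^{1-y}$. The likelihood function over $m$ samples is:
$L(\theta) = P(y|X,\theta) = \prod^m_{i=1}P(y^{(i)}|x^{(i)}, \theta)$
$= \prod^m_{i=1}h_{\theta}(x^{(i)})^{y^{(i)}}(1 - h_{\theta}(x^{(i)}))^{1-y^{(i)}}$
Maximizing the log-likelihood $l(\theta) = \log L(\theta)$:
$\max_{\theta} l(\theta) = \max_{\theta} \log L(\theta)$
$= \max_{\theta} \sum^m_{i=1} \left[ y^{(i)}\log h_{\theta}(x^{(i)}) + (1-y^{(i)})\log(1 - h_{\theta}(x^{(i)})) \right]$
which is equivalent to
$\min_{\theta} - \sum^m_{i=1} \left[ y^{(i)}\log h_{\theta}(x^{(i)}) + (1-y^{(i)})\log(1 - h_{\theta}(x^{(i)})) \right]$
This is the same as minimizing the cross-entropy.
Entropy measures the uncertainty of information: the larger the entropy, the greater the uncertainty. Entropy is defined as:
$H(X) = - \sum_x p(x)\log p(x)$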
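
The entropy definition can be checked numerically with a short sketch. It assumes the natural logarithm and the usual convention that outcomes with zero probability contribute nothing; the name `entropy` is hypothetical.

```python
import math


def entropy(probs):
    """H(X) = -sum p(x) log p(x), skipping outcomes with p(x) = 0."""
    return -sum(p * math.log(p) for p in probs if p > 0)


fair_coin = entropy([0.5, 0.5])    # most uncertain two-outcome case
biased_coin = entropy([0.9, 0.1])  # more predictable, lower entropy
print(fair_coin > biased_coin)     # True
```

The fair coin has entropy $\log 2$, the maximum for two outcomes, while a certain outcome has entropy zero.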

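import math

def cross_entropy(pred_label, actual_label):
    # pred_label[i] is the predicted probability h(x_i), strictly between 0 and 1;
    # actual_label[i] is the true label, 0 or 1.
    res = 0
    for i in range(len(pred_label)):
        res += (actual_label[i] * math.log(pred_label[i])
                + (1 - actual_label[i]) * math.log(1 - pred_label[i]))
    return -res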
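
One practical caveat: the logarithm is undefined when a predicted probability is exactly 0 or 1. The sketch below shows one common workaround, clipping probabilities into a small interval. The function name `safe_cross_entropy` and the default `eps` are assumptions chosen for illustration.

```python
import math


def safe_cross_entropy(pred_prob, actual_label, eps=1e-12):
    """Binary cross-entropy with probabilities clipped to [eps, 1 - eps]."""
    total = 0.0
    for p, y in zip(pred_prob, actual_label):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total


# A confident wrong prediction yields a large but finite loss.
print(math.isfinite(safe_cross_entropy([0.0], [1])))  # True
```

Clipping slightly changes the loss for extreme predictions, so `eps` should stay very small.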

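4. Square loss

Based on the squared difference between the predicted value and the actual value:
$L_2(m) = (f_w(x) - y)^2$
Mean squared error (MSE, the L2 loss) is the mean of the squared differences between predicted and actual values. It considers only the magnitude of the error, not its direction.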

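import numpy as np

def rmse(pred_label, actual_label):
    # Inputs are expected to be NumPy arrays of the same shape.
    e = pred_label - actual_label  # signed error
    e_s = e ** 2                   # squared error
    mse = e_s.mean()               # mean squared error (L2)
    return np.sqrt(mse)            # root mean squared error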

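$L_1(m) = |f_w(x) - y|$
Mean absolute error (MAE, the L1 loss) is the mean of the absolute differences between predicted and actual values. It is robust to outliers.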

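import numpy as np

def mae(pred_label, actual_label):
    # Inputs are expected to be NumPy arrays of the same shape.
    e = pred_label - actual_label  # signed error
    e_a = np.absolute(e)           # absolute error
    return e_a.mean()              # mean absolute error (L1)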
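
To illustrate the robustness claim, the following sketch compares how a single outlier affects the two losses. It uses plain Python lists instead of NumPy so it runs without dependencies; the function names and the toy data are made up for illustration.

```python
def mean_squared_error(pred, actual):
    return sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(pred)


def mean_absolute_error(pred, actual):
    return sum(abs(p - a) for p, a in zip(pred, actual)) / len(pred)


actual = [1.0, 2.0, 3.0, 4.0]
clean = [1.5, 2.5, 3.5, 4.5]     # every prediction is off by 0.5
outlier = [1.5, 2.5, 3.5, 14.0]  # the last prediction is off by 10

print(mean_squared_error(clean, actual))     # 0.25
print(mean_squared_error(outlier, actual))   # 25.1875
print(mean_absolute_error(clean, actual))    # 0.5
print(mean_absolute_error(outlier, actual))  # 2.875
```

On this toy data, the single outlier inflates the squared loss by a factor of about 100, but the absolute loss by less than 6.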

5. Exponential loss

The exponential loss is typically used in boosting.
$J(w) = \lambda R(w) + \sum_i \exp(-y_i f_w(x_i))$
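
A minimal sketch of the error term of the exponential loss, leaving out the regularization term $\lambda R(w)$. It assumes labels in {-1, +1} and takes raw model scores as input; the name `exponential_loss` is hypothetical.

```python
import math


def exponential_loss(scores, labels):
    """sum_i exp(-y_i * f(x_i)), with labels assumed to be -1 or +1."""
    return sum(math.exp(-y * f) for f, y in zip(scores, labels))


# The same score is penalized far more when its sign disagrees with the label.
right = exponential_loss([2.0], [1])
wrong = exponential_loss([2.0], [-1])
print(right < wrong)  # True
```

Because the penalty grows exponentially with a negative margin, misclassified samples dominate the loss, which is consistent with boosting's focus on hard examples.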