Sequential Minimal Optimization (SMO) Algorithm

2019-03-10  shenghaishxt

This article is also published on my personal blog: https://www.zhangshenghai.com/posts/3981/

The sequential minimal optimization (SMO) algorithm was proposed by Platt in 1998. The learning problem of the support vector machine can be formalized as a convex quadratic programming problem. Such a problem has a global optimal solution, and many optimization algorithms can be applied to it; when the training set is very large, however, these algorithms tend to become so inefficient that they are unusable. Implementing support vector machine learning efficiently is therefore an important problem in its own right.

The SMO algorithm is a heuristic algorithm. Its basic idea is as follows: if the solution for every variable satisfies the KKT conditions of this optimization problem, then the solution of the problem has been obtained, since the KKT conditions are the optimality conditions of this convex problem. Otherwise, choose two variables, fix all the others, and construct a quadratic programming subproblem in these two variables only. Solving this subproblem moves the solution closer to the solution of the original problem, and the subproblem can be solved analytically, so each iteration is fast; repeating this decomposition until all variables satisfy the KKT conditions solves the original problem.

The SMO algorithm consists of two parts: an analytic method for solving the two-variable quadratic programming subproblem, and a heuristic method for choosing the two variables.

The sequential minimal optimization (SMO) algorithm solves the following convex quadratic programming dual problem:
$$
\begin{align*}
\min_{\alpha} \quad & \dfrac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_{i} \alpha_{j} y_{i} y_{j} K \left( x_{i}, x_{j} \right) - \sum_{i=1}^{N} \alpha_{i} \\
\text{s.t.} \quad & \sum_{i=1}^{N} \alpha_{i} y_{i} = 0 \\
& 0 \leq \alpha_{i} \leq C , \quad i=1,2, \cdots, N
\end{align*}
$$

Each subproblem has two variables: one is the variable that violates the KKT conditions most severely, and the other is determined automatically by the equality constraint. Only one of the two variables is actually free: once $\alpha_2$ is fixed, $\alpha_1$ is determined by the constraint as well, so each subproblem updates both variables simultaneously.
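To make the objective above concrete, here is a minimal Python sketch (not from the original post) that evaluates the dual objective for a given multiplier vector; the precomputed kernel matrix `K`, the labels `y`, and the toy data are illustrative assumptions:

```python
import numpy as np

def dual_objective(alpha, y, K):
    """Dual objective to be minimized:
    W(alpha) = 1/2 * sum_ij alpha_i alpha_j y_i y_j K(x_i, x_j) - sum_i alpha_i
    alpha : (N,) Lagrange multipliers
    y     : (N,) labels in {+1, -1}
    K     : (N, N) kernel (Gram) matrix with K[i, j] = K(x_i, x_j)
    """
    ay = alpha * y                       # elementwise alpha_i * y_i
    return 0.5 * ay @ K @ ay - alpha.sum()

# Tiny usage example with an RBF kernel on toy data (values are illustrative only)
X = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])
y = np.array([1.0, 1.0, -1.0])
K = np.exp(-np.square(X[:, None] - X[None, :]).sum(-1))   # gamma = 1
alpha = np.array([0.1, 0.0, 0.1])                          # satisfies sum(alpha*y) = 0, 0 <= alpha <= C
print(dual_objective(alpha, y, K))
```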

The SMO algorithm

Input: a training data set $T = \left\{ \left( x_{1}, y_{1} \right), \left( x_{2}, y_{2} \right), \cdots, \left( x_{N}, y_{N} \right) \right\}$, where $x_{i} \in \mathcal{X} = R^{n}$, $y_{i} \in \mathcal{Y} = \left\{ +1, -1 \right\}$, $i = 1, 2, \cdots, N$, and a tolerance $\varepsilon$;

Output: an approximate solution $\hat{\alpha}$.

  1. Take the initial value $\alpha^{(0)} = 0$ and set $k = 0$;

  2. Select the optimization variables $\alpha_{1}^{(k)}, \alpha_{2}^{(k)}$ and solve the subproblem (a code sketch of its analytic solution is given after this list):
    $$
    \begin{align*}
    \min_{\alpha_{1}, \alpha_{2}} \quad & W \left( \alpha_{1}, \alpha_{2} \right) = \dfrac{1}{2} K_{11} \alpha_{1}^{2} + \dfrac{1}{2} K_{22} \alpha_{2}^{2} + y_{1} y_{2} K_{12} \alpha_{1} \alpha_{2} \\
    & \quad - \left( \alpha_{1} + \alpha_{2} \right) + y_{1} \alpha_{1} \sum_{i=3}^{N} y_{i} \alpha_{i} K_{i1} + y_{2} \alpha_{2} \sum_{i=3}^{N} y_{i} \alpha_{i} K_{i2} \\
    \text{s.t.} \quad & \alpha_{1} y_{1} + \alpha_{2} y_{2} = -\sum_{i=3}^{N} \alpha_{i} y_{i} = \varsigma \\
    & 0 \leq \alpha_{i} \leq C , \quad i=1,2
    \end{align*}
    $$

  3. Obtain the optimal solution $\alpha_{1}^{(k+1)}, \alpha_{2}^{(k+1)}$ and update $\alpha$ to $\alpha^{(k+1)}$;

  4. If, within the tolerance $\varepsilon$, the following stopping conditions are satisfied:

$$\sum_{i=1}^{N} \alpha_{i} y_{i} = 0$$

$$0 \leq \alpha_{i} \leq C, \quad i = 1, 2, \cdots, N$$

$$
y_{i} \cdot g \left( x_{i} \right) \begin{cases} \geq 1, & \left\{ x_{i} \mid \alpha_{i} = 0 \right\} \\ = 1, & \left\{ x_{i} \mid 0 < \alpha_{i} < C \right\} \\ \leq 1, & \left\{ x_{i} \mid \alpha_{i} = C \right\} \end{cases}
$$

where
$$g(x_i) = \sum_{j=1}^N \alpha_j y_j K(x_j, x_i) + b,$$
then go to step (5); otherwise set $k = k + 1$ and go to step (2);

  5. Take $\hat{\alpha} = \alpha^{(k + 1)}$.
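The two-variable subproblem in step 2 has a closed-form solution, and the stopping condition in step 4 can be checked directly from $g(x_i)$. Below is a minimal Python sketch of one such update and of the stopping check; it is not code from the original post. The kernel matrix `K`, penalty parameter `C`, threshold `b`, and the chosen index pair `(i1, i2)` are assumed to be supplied by the caller, and the heuristic for choosing the pair is omitted.

```python
import numpy as np

def smo_step(i1, i2, alpha, y, K, b, C, eps=1e-12):
    """Analytically solve the two-variable subproblem for indices i1, i2
    and return updated (alpha, b). Standard SMO update:
        alpha2_new = alpha2 + y2 * (E1 - E2) / eta, clipped to [L, H]
        alpha1_new = alpha1 + y1 * y2 * (alpha2_old - alpha2_new)
    """
    # g(x_i) = sum_j alpha_j y_j K(x_j, x_i) + b, prediction errors E_i = g(x_i) - y_i
    g = (alpha * y) @ K + b
    E1, E2 = g[i1] - y[i1], g[i2] - y[i2]

    # Feasible segment [L, H] for alpha2, implied by 0 <= alpha <= C and
    # alpha1*y1 + alpha2*y2 = constant (the equality constraint of the subproblem)
    if y[i1] != y[i2]:
        L, H = max(0.0, alpha[i2] - alpha[i1]), min(C, C + alpha[i2] - alpha[i1])
    else:
        L, H = max(0.0, alpha[i2] + alpha[i1] - C), min(C, alpha[i2] + alpha[i1])

    eta = K[i1, i1] + K[i2, i2] - 2.0 * K[i1, i2]    # curvature along the constraint line
    if eta < eps or L >= H:
        return alpha, b                               # this pair makes no usable progress

    a2 = float(np.clip(alpha[i2] + y[i2] * (E1 - E2) / eta, L, H))
    a1 = alpha[i1] + y[i1] * y[i2] * (alpha[i2] - a2)

    # Recompute the threshold b from whichever new multiplier lies strictly inside (0, C)
    b1 = b - E1 - y[i1] * (a1 - alpha[i1]) * K[i1, i1] - y[i2] * (a2 - alpha[i2]) * K[i2, i1]
    b2 = b - E2 - y[i1] * (a1 - alpha[i1]) * K[i1, i2] - y[i2] * (a2 - alpha[i2]) * K[i2, i2]
    b_new = b1 if 0 < a1 < C else (b2 if 0 < a2 < C else 0.5 * (b1 + b2))

    alpha = alpha.copy()
    alpha[i1], alpha[i2] = a1, a2
    return alpha, b_new


def max_kkt_violation(alpha, y, K, b, C, tol=1e-3):
    """Largest violation of the stopping conditions in step 4 (0 means converged).
    The equality constraint sum_i alpha_i y_i = 0 is preserved by smo_step itself."""
    yg = y * ((alpha * y) @ K + b)                   # y_i * g(x_i)
    lower = alpha <= tol                             # alpha_i ~ 0  ->  y_i g(x_i) >= 1
    upper = alpha >= C - tol                         # alpha_i ~ C  ->  y_i g(x_i) <= 1
    viol = np.where(lower, np.maximum(0.0, 1.0 - yg),
           np.where(upper, np.maximum(0.0, yg - 1.0),
                    np.abs(yg - 1.0)))               # 0 < alpha_i < C -> y_i g(x_i) = 1
    return float(viol.max())
```

The outer loop then alternates variable selection with `smo_step` until `max_kkt_violation` falls below the tolerance $\varepsilon$.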
