LIBSVM Python Interface Usage Guide (1)

2017-10-19 Banach_J

1. Installation
2. Functions

1 Installation

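On Linux, download the source from GitHub: https://github.com/cjlin1/libsvm
Open a terminal in the top-level directory and run sudo make (plain make is enough, since building does not need root privileges). If the Python interface cannot find the LIBSVM shared library afterwards, running make in the python subdirectory builds it. To install gnuplot as well, run sudo apt-get install gnuplot.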

2 Functions

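Enter the following at the Python prompt:

    >>> from svmutil import *

This imports the following functions:

svm_train() : train an SVM model
svm_predict() : predict test data
svm_read_problem() : read data from a LIBSVM-format file
svm_load_model() : load a LIBSVM model
svm_save_model() : save a model to a file
evaluations() : evaluate prediction results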

2.1 svm_train

svm_train() can be called in three ways:

    >>> model = svm_train(y, x [, 'training_options'])
    >>> model = svm_train(prob [, 'training_options'])
    >>> model = svm_train(prob, param)

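Parameters

y: a list/tuple of training labels (the type must be int/double), e.g. [1, -1]
x: a list/tuple of training instances; the feature vector of each instance is a list/tuple or a dictionary
training_options: a string of options in the same form as the LIBSVM command line, listed below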

-s svm_type : set type of SVM (default 0)
    0 -- C-SVC      (multi-class classification)
    1 -- nu-SVC     (multi-class classification)
    2 -- one-class SVM  
    3 -- epsilon-SVR    (regression)
    4 -- nu-SVR     (regression)
-t kernel_type : set type of kernel function (default 2)
    0 -- linear: u'*v
    1 -- polynomial: (gamma*u'*v + coef0)^degree
    2 -- radial basis function: exp(-gamma*|u-v|^2)
    3 -- sigmoid: tanh(gamma*u'*v + coef0)
    4 -- precomputed kernel (kernel values in training_set_file)
-d degree : set degree in kernel function (default 3)
-g gamma : set gamma in kernel function (default 1/num_features)
-r coef0 : set coef0 in kernel function (default 0)
-c cost : set the parameter C of C-SVC, epsilon-SVR, and nu-SVR (default 1)
-n nu : set the parameter nu of nu-SVC, one-class SVM, and nu-SVR (default 0.5)
-p epsilon : set the epsilon in loss function of epsilon-SVR (default 0.1)
-m cachesize : set cache memory size in MB (default 100)
-e epsilon : set tolerance of termination criterion (default 0.001)
-h shrinking : whether to use the shrinking heuristics, 0 or 1 (default 1)
-b probability_estimates : whether to train a SVC or SVR model for probability estimates, 0 or 1 (default 0)
-wi weight : set the parameter C of class i to weight*C, for C-SVC (default 1)
-v n: n-fold cross validation mode
-q : quiet mode (no outputs)

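prob: an svm_problem instance created with svm_problem(y, x)
param: an svm_parameter instance created with svm_parameter('training_options')
model: the returned svm_model instance (with -v, the cross-validation result is returned instead of a model)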
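
The kernel formulas listed under -t can be written out directly. The following is a minimal pure-Python sketch for dense vectors, not LIBSVM's own implementation. The function names are made up for illustration, and the default argument values simply mirror the option defaults listed above.

```python
import math


def dot(u, v):
    """Inner product u'*v of two equal-length dense vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(u, v):
    # -t 0: u'*v
    return dot(u, v)


def polynomial_kernel(u, v, gamma, coef0=0.0, degree=3):
    # -t 1: (gamma*u'*v + coef0)^degree
    return (gamma * dot(u, v) + coef0) ** degree


def rbf_kernel(u, v, gamma):
    # -t 2: exp(-gamma*|u-v|^2)
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)


def sigmoid_kernel(u, v, gamma, coef0=0.0):
    # -t 3: tanh(gamma*u'*v + coef0)
    return math.tanh(gamma * dot(u, v) + coef0)


u, v = [1.0, 2.0], [3.0, 4.0]
gamma = 1.0 / len(u)  # the default for -g is 1/num_features
print(linear_kernel(u, v))             # u'*v = 11.0
print(polynomial_kernel(u, v, gamma))  # (0.5*11)^3
print(rbf_kernel(u, v, gamma))         # exp(-0.5*8)
print(sigmoid_kernel(u, v, gamma))     # tanh(5.5)
```

The precomputed kernel (-t 4) is omitted because it takes kernel values as input rather than computing them from u and v.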

Example

    >>> prob = svm_problem(y, x)
    >>> param = svm_parameter('-s 3 -c 5 -h 0')
    >>> m = svm_train(y, x, '-c 5')
    >>> m = svm_train(prob, '-t 2 -c 5')
    >>> m = svm_train(prob, param)
    >>> CV_ACC = svm_train(y, x, '-v 3')

2.2 svm_predict

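To predict test data with a trained model, use:

    >>> p_labels, p_acc, p_vals = svm_predict(y, x, model [,'predicting_options'])

Parameters

y: a list/tuple of true labels. When the true labels are unknown, use [0]*len(x)
x: a list/tuple of test instances
model: an svm_model instance.
predicting_options: a string of prediction options. -b probability_estimates: whether to predict probability estimates, 0 or 1 (default 0); for one-class SVM only 0 is supported
p_labels: a list of predicted labels
p_acc: a tuple including accuracy (for classification), mean squared error, and squared correlation coefficient (for regression).
p_vals: a list of decision values or probability estimates (if '-b 1' is specified). If k is the number of classes in the training data, then for decision values each element contains the results of predicting $$k(k-1)/2$$ binary-class SVMs. For classification, $$k=1$$ is a special case: the decision value [+1] is returned for each test instance instead of an empty list.

For probabilities, each element contains k values giving the probability that the test instance belongs to each class.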
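
To make the layout of the decision values concrete, here is a small sketch that enumerates the k(k-1)/2 class pairs of a k-class problem. It is plain Python and does not call LIBSVM. pairwise_classifiers is a hypothetical helper name, and the pair ordering shown (first label against each later label, then the second label against each later label, and so on) is an assumption; the actual label order comes from the trained model.

```python
from itertools import combinations


def pairwise_classifiers(labels):
    """Return the (label_i, label_j) pairs of a one-vs-one scheme."""
    return list(combinations(labels, 2))


labels = [1, 2, 3, 4]                 # k = 4 classes (made-up labels)
pairs = pairwise_classifiers(labels)
k = len(labels)
print(len(pairs), k * (k - 1) // 2)   # both counts agree
print(pairs)
```

With four classes, each element of p_vals would therefore hold six decision values, one per pair.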

Example

    >>> p_labels, p_acc, p_vals = svm_predict(y, x, m)
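
Putting svm_train and svm_predict together, a minimal end-to-end run might look like the sketch below. The toy data are invented for illustration. The import is guarded because the module path depends on how LIBSVM was installed: svmutil matches the source-tree layout used in this post, while libsvm.svmutil is the layout of the pip package (an assumption about your setup). If neither import succeeds, the script only prints a message.

```python
try:
    from svmutil import svm_train, svm_predict  # source-tree layout, as in this post
except ImportError:
    try:
        # pip package layout (assumption about the installation)
        from libsvm.svmutil import svm_train, svm_predict
    except ImportError:
        svm_train = svm_predict = None

# Toy data (made up): numeric labels, each instance a dict {feature index: value}.
y = [1, 1, -1, -1]
x = [
    {1: 1.0, 2: 1.0},
    {1: 0.9, 2: 1.2},
    {1: -1.0, 2: -1.0},
    {1: -1.1, 2: -0.8},
]

if svm_train is None:
    print("LIBSVM Python interface not found; skipping the demo.")
else:
    m = svm_train(y, x, '-t 0 -c 5 -q')   # linear kernel, C = 5, quiet mode
    p_labels, p_acc, p_vals = svm_predict(y, x, m, '-q')
    print(p_labels)   # predicted labels, one per instance
    print(p_acc)      # (accuracy, mean squared error, squared correlation coefficient)
```

Predicting on the training data, as done here, only checks that the calls work; it says nothing about how the model generalizes.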

2.3 svm_read_problem/svm_load_model/svm_save_model

    >>> y, x = svm_read_problem('data.txt')
    >>> m = svm_load_model('model_file')
    >>> svm_save_model('model_file', m)
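
svm_read_problem expects the LIBSVM text format, in which each line holds a label followed by sparse index:value pairs. The parser below is a simplified sketch of that format, not the library's reader: parse_libsvm_lines is a hypothetical name, the sample lines are invented, and edge cases are ignored.

```python
def parse_libsvm_lines(lines):
    """Parse 'label index:value ...' lines into (y, x), with x as a list of dicts."""
    y, x = [], []
    for line in lines:
        parts = line.split()
        if not parts:  # skip blank lines
            continue
        y.append(float(parts[0]))
        features = {}
        for item in parts[1:]:
            index, value = item.split(':')
            features[int(index)] = float(value)
        x.append(features)
    return y, x


sample = [
    "+1 1:0.5 3:1.2",
    "-1 2:0.7 3:-0.4",
]
y, x = parse_libsvm_lines(sample)
print(y)  # [1.0, -1.0]
print(x)  # [{1: 0.5, 3: 1.2}, {2: 0.7, 3: -0.4}]
```

Features that do not appear on a line are treated as zero, which is why each instance is stored as a dictionary rather than a full-length list.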

2.4 evaluations

Evaluate prediction results:

    >>> (ACC, MSE, SCC) = evaluations(ty, pv)

Parameters

ty: a list of true values
pv: a list of predicted values
ACC: accuracy
MSE: mean squared error
SCC: squared correlation coefficient
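
For reference, the three metrics can be computed by hand as follows. This is a plain-Python sketch of the standard definitions (accuracy in percent, mean squared error, squared Pearson correlation), written for illustration rather than taken from the library. evaluate_predictions is a hypothetical name and the sample values are invented.

```python
def evaluate_predictions(ty, pv):
    """Return (ACC, MSE, SCC) for true values ty and predicted values pv."""
    if len(ty) != len(pv):
        raise ValueError("ty and pv must have the same length")
    n = len(ty)
    correct = sum(1 for t, p in zip(ty, pv) if t == p)
    acc = 100.0 * correct / n                      # accuracy in percent
    mse = sum((p - t) ** 2 for t, p in zip(ty, pv)) / n
    sum_t, sum_p = sum(ty), sum(pv)
    sum_tt = sum(t * t for t in ty)
    sum_pp = sum(p * p for p in pv)
    sum_tp = sum(t * p for t, p in zip(ty, pv))
    denom = (n * sum_pp - sum_p ** 2) * (n * sum_tt - sum_t ** 2)
    # SCC is undefined when either sequence is constant.
    scc = (n * sum_tp - sum_t * sum_p) ** 2 / denom if denom != 0 else float('nan')
    return acc, mse, scc


ty = [1, -1, 1, -1]    # true labels (made up)
pv = [1, -1, -1, -1]   # predicted labels (made up)
acc, mse, scc = evaluate_predictions(ty, pv)
print(acc, mse, scc)
```

ACTUAL accuracy is meaningful for classification, while MSE and SCC are the quantities of interest for regression, which is why all three are returned together.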