LIBSVM Python Interface Usage Guide (1)
1 Installation
<span id="1">1</span>
On Linux, download the source from GitHub: https://github.com/cjlin1/libsvm.
Open a terminal in the top-level directory of the source tree and run sudo make. Then install gnuplot by running sudo apt-get install gnuplot.
2 Using the functions
<h2 id="2">
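In the Python interpreter, enter:
>>> from svmutil import *
This command imports the following functions:
- svm_train() : train an SVM model
- svm_predict() : predict test data
- svm_read_problem() : read data from a file in LIBSVM format
- svm_load_model() : load a LIBSVM model
- svm_save_model() : save a model to a file
- evaluations() : evaluate prediction results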
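The module path depends on how LIBSVM was installed. The following is a minimal sketch, assuming that a pip-installed copy exposes the module as libsvm.svmutil, while a source checkout with its python directory on the search path exposes it as svmutil (as in this guide). The helper name load_svmutil is hypothetical and not part of LIBSVM.

```python
import importlib


def load_svmutil():
    """Return LIBSVM's svmutil module, or None if it cannot be imported."""
    # "libsvm.svmutil" is assumed to be the layout of a pip-installed copy;
    # "svmutil" is the layout used in this guide (source tree on sys.path).
    for name in ("libsvm.svmutil", "svmutil"):
        try:
            return importlib.import_module(name)
        except Exception:  # module missing, or shared library not found
            continue
    return None


svmutil = load_svmutil()
print("LIBSVM Python interface found:", svmutil is not None)
```

If the function returns None, check that the build step in section 1 completed and that the directory containing svmutil.py is on the Python path.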
2.1 svm_train
<h2 id="3">
svm_train() can be called in three ways:
>>> model = svm_train(y, x [, 'training_options'])
>>> model = svm_train(prob [, 'training_options'])
>>> model = svm_train(prob, param)
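Parameters
- y: a list/tuple of training labels (type must be int/double), e.g. [1, -1]
- x: a list/tuple of training instances (each instance is a list/tuple or dictionary of feature values)
- training_options: a string of training options, in the same format as the svm-train command line: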
    -s svm_type : set type of SVM (default 0)
        0 -- C-SVC (multi-class classification)
        1 -- nu-SVC (multi-class classification)
        2 -- one-class SVM
        3 -- epsilon-SVR (regression)
        4 -- nu-SVR (regression)
    -t kernel_type : set type of kernel function (default 2)
        0 -- linear: u'*v
        1 -- polynomial: (gamma*u'*v + coef0)^degree
        2 -- radial basis function: exp(-gamma*|u-v|^2)
        3 -- sigmoid: tanh(gamma*u'*v + coef0)
        4 -- precomputed kernel (kernel values in training_set_file)
    -d degree : set degree in kernel function (default 3)
    -g gamma : set gamma in kernel function (default 1/num_features)
    -r coef0 : set coef0 in kernel function (default 0)
    -c cost : set the parameter C of C-SVC, epsilon-SVR, and nu-SVR (default 1)
    -n nu : set the parameter nu of nu-SVC, one-class SVM, and nu-SVR (default 0.5)
    -p epsilon : set the epsilon in loss function of epsilon-SVR (default 0.1)
    -m cachesize : set cache memory size in MB (default 100)
    -e epsilon : set tolerance of termination criterion (default 0.001)
    -h shrinking : whether to use the shrinking heuristics, 0 or 1 (default 1)
    -b probability_estimates : whether to train a SVC or SVR model for probability estimates, 0 or 1 (default 0)
    -wi weight : set the parameter C of class i to weight*C, for C-SVC (default 1)
    -v n : n-fold cross validation mode
    -q : quiet mode (no outputs)
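- prob: an svm_problem instance created with svm_problem(y, x)
- param: an svm_parameter instance created with svm_parameter('training_options')
- model: the returned svm_model instance. If '-v' is specified, cross validation is run and the return value is a scalar instead: cross-validation accuracy for classification, mean squared error for regression.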
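To make the kernel formulas under the -t option concrete, here is a small pure-Python sketch of kernel types 0 to 3. It only illustrates the formulas listed above and is not LIBSVM's own implementation; the function names are chosen for this example.

```python
import math


def dot(u, v):
    """Inner product u'*v of two dense vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear(u, v):
    # -t 0: u'*v
    return dot(u, v)


def polynomial(u, v, gamma, coef0=0.0, degree=3):
    # -t 1: (gamma*u'*v + coef0)^degree
    return (gamma * dot(u, v) + coef0) ** degree


def rbf(u, v, gamma):
    # -t 2: exp(-gamma*|u-v|^2)
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)


def sigmoid(u, v, gamma, coef0=0.0):
    # -t 3: tanh(gamma*u'*v + coef0)
    return math.tanh(gamma * dot(u, v) + coef0)


u, v = [1.0, 2.0], [3.0, 4.0]
gamma = 1.0 / len(u)  # the documented default for -g: 1/num_features
print(linear(u, v), polynomial(u, v, gamma), rbf(u, v, gamma), sigmoid(u, v, gamma))
```

The defaults used here (degree 3, coef0 0, gamma 1/num_features) mirror the defaults of -d, -r and -g.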
Examples
>>> prob = svm_problem(y, x)
>>> param = svm_parameter('-s 3 -c 5 -h 0')
>>> m = svm_train(y, x, '-c 5')
>>> m = svm_train(prob, '-t 2 -c 5')
>>> m = svm_train(prob, param)
>>> CV_ACC = svm_train(y, x, '-v 3')
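The examples above assume that y and x already exist. As a rough sketch of what they might look like: y is a list of numeric labels, and each instance in x can be either a dense list of feature values or a dictionary mapping feature indices to values. The converter below is a hypothetical helper (not part of LIBSVM) and assumes that dense features are numbered from 1 and that zero entries can be omitted in the sparse form.

```python
def dense_to_sparse(row):
    """Convert a dense feature list to an {index: value} dict.

    Assumption: feature indices start at 1 and zero-valued features are dropped.
    """
    return {i: v for i, v in enumerate(row, start=1) if v != 0}


y = [1, -1]
x_dense = [[1, 0, 1], [-1, 0, -1]]
x_sparse = [dense_to_sparse(row) for row in x_dense]
print(x_sparse)  # [{1: 1, 3: 1}, {1: -1, 3: -1}]
```

Either x_dense or x_sparse could then be passed as x to svm_train or svm_problem.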
2.2 svm_predict
<h2 id="4">
To predict test data with a trained model, use:
>>> p_labels, p_acc, p_vals = svm_predict(y, x, model [,'predicting_options'])
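Parameters
- y: a list/tuple of true labels. When the true labels are unknown, use [0]*len(x).
- x: a list/tuple of test instances
- model: an svm_model instance
- predicting_options: a string of prediction options. -b probability_estimates: whether to predict probability estimates, 0 or 1 (default 0); for one-class SVM only 0 is supported
- p_labels: a list of predicted labels
- p_acc: a tuple including accuracy (for classification), mean squared error, and squared correlation coefficient (for regression)
- p_vals: a list of decision values or probability estimates (if '-b 1' is specified). If k is the number of classes in the training data, then for decision values each element contains the results of the $$k(k-1)/2$$ binary-class SVMs. For classification, $$k=1$$ is a special case: the decision value [+1] is returned for each test instance instead of an empty list. For probabilities, each element contains k values, giving the probability that the test instance belongs to each class.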
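The count of k(k-1)/2 decision values comes from training one binary classifier per pair of classes. The sketch below enumerates the pairs and combines their decision values by majority vote. It is an illustration only: the pair ordering, the sign convention (a positive value votes for the first class of the pair) and the tie-breaking rule are assumptions of this example, and in a real model the class order follows the model's own label order.

```python
from itertools import combinations


def pairwise_classifiers(labels):
    """List the class pairs, one per binary SVM: k classes give k(k-1)/2 pairs."""
    return list(combinations(labels, 2))


def vote(labels, decision_values):
    """Pick a label by majority vote over the pairwise decision values.

    Assumption: a positive value votes for the first class of the pair,
    otherwise the second class gets the vote; ties go to the earlier label.
    """
    votes = {label: 0 for label in labels}
    for (a, b), value in zip(pairwise_classifiers(labels), decision_values):
        votes[a if value > 0 else b] += 1
    return max(labels, key=lambda label: votes[label])


labels = [1, 2, 3]
pairs = pairwise_classifiers(labels)
print(len(pairs), pairs)  # 3 [(1, 2), (1, 3), (2, 3)]
print(vote(labels, [0.8, -0.3, -1.2]))  # 3
```

With three classes, each element of p_vals would hold three decision values, one per pair.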
Example
>>> p_labels, p_acc, p_vals = svm_predict(y, x, m)
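For a complete picture, the following sketch trains a linear C-SVC on four toy points and predicts on the same points. The data is made up for illustration, and the import fallback assumes the module is reachable either as libsvm.svmutil or as svmutil; if neither can be imported, the training step is skipped.

```python
import importlib

# Try both module layouts; keep None if LIBSVM is not available.
svmutil = None
for name in ("libsvm.svmutil", "svmutil"):
    try:
        svmutil = importlib.import_module(name)
        break
    except Exception:
        pass

# Toy data (invented for this example): two points per class.
y = [1, -1, 1, -1]
x = [{1: 1.0, 2: 1.0}, {1: -1.0, 2: -1.0}, {1: 0.9, 2: 1.1}, {1: -1.1, 2: -0.9}]

if svmutil is not None:
    m = svmutil.svm_train(y, x, '-t 0 -c 5 -q')  # linear kernel, C = 5, quiet
    p_labels, p_acc, p_vals = svmutil.svm_predict(y, x, m, '-q')
    print("predicted labels:", p_labels)
    print("(ACC, MSE, SCC):", p_acc)
    print("decision values:", p_vals)
else:
    print("LIBSVM Python interface not found; skipping the run.")
```

Predicting on the training data only checks that the calls work; a real evaluation would use held-out data or the -v option.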
2.3 svm_read_problem/svm_load_model/svm_save_model
<h2 id="5">
>>> y, x = svm_read_problem('data.txt')
>>> m = svm_load_model('model_file')
>>> svm_save_model('model_file', m)
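svm_read_problem expects a file in LIBSVM format, where each line holds a label followed by index:value pairs. The parser below is a minimal sketch of that format for illustration, not the library's own reader; the function name parse_libsvm_lines and the sample lines are made up for this example.

```python
def parse_libsvm_lines(lines):
    """Parse lines of the form '<label> <index>:<value> ...' into (y, x)."""
    y, x = [], []
    for line in lines:
        parts = line.split()
        if not parts:  # skip blank lines
            continue
        y.append(float(parts[0]))
        features = {}
        for item in parts[1:]:
            index, value = item.split(":")
            features[int(index)] = float(value)
        x.append(features)
    return y, x


sample = ["+1 1:0.5 3:1.2", "-1 2:0.7"]
y, x = parse_libsvm_lines(sample)
print(y)  # [1.0, -1.0]
print(x)  # [{1: 0.5, 3: 1.2}, {2: 0.7}]
```

Features that are absent from a line are treated as zero, which is why the format suits sparse data.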
2.4 evaluations
<h2 id="6">
Evaluate prediction results against the true labels:
>>> (ACC, MSE, SCC) = evaluations(ty, pv)
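Parameters
- ty: a list of true labels
- pv: a list of predicted values
- ACC: accuracy (percentage of predictions that match the true labels)
- MSE: mean squared error
- SCC: squared correlation coefficient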
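The three metrics can be written out in a few lines of plain Python. This is a sketch of the standard definitions, assuming ACC is reported as a percentage and SCC is the squared Pearson correlation between predictions and true values; the function name evaluate is hypothetical and stands in for the library's evaluations().

```python
def evaluate(ty, pv):
    """Return (ACC, MSE, SCC) for true values ty and predicted values pv."""
    if len(ty) != len(pv):
        raise ValueError("len(ty) must equal len(pv)")
    n = len(ty)
    correct = sum(1 for y, v in zip(ty, pv) if y == v)
    sq_error = sum((v - y) ** 2 for y, v in zip(ty, pv))
    sum_v, sum_y = sum(pv), sum(ty)
    sum_vv = sum(v * v for v in pv)
    sum_yy = sum(y * y for y in ty)
    sum_vy = sum(v * y for y, v in zip(ty, pv))
    acc = 100.0 * correct / n
    mse = sq_error / n
    try:
        scc = ((n * sum_vy - sum_v * sum_y) ** 2) / (
            (n * sum_vv - sum_v ** 2) * (n * sum_yy - sum_y ** 2)
        )
    except ZeroDivisionError:  # constant predictions or constant labels
        scc = float("nan")
    return acc, mse, scc


ACC, MSE, SCC = evaluate([1, -1, 1, -1], [1, -1, -1, -1])
print(ACC, MSE, SCC)
```

ACC is meaningful for classification, while MSE and SCC are the ones to read for regression.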