A Peek at the F1-score Metric in sklearn.metrics

2020-08-17  井底蛙蛙呱呱呱

Note: This article uses binary classification as the running example.

The F1-score is a single metric that combines a classifier's recall and precision. Its formula is:

F1 = 2 * precision * recall / (precision + recall)

where
recall = TPR = TP / (TP + FN);
precision = PPV = TP / (TP + FP)
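
To make these definitions concrete, here is a minimal sketch that computes recall, precision, and F1 by hand from a confusion matrix and checks the result against sklearn. The toy labels y_true_demo and y_hat_demo are made up for illustration, not taken from the article's data:

from sklearn.metrics import confusion_matrix, f1_score

# Toy labels for illustration only
y_true_demo = [0, 1, 1, 0, 1, 0]
y_hat_demo = [0, 1, 0, 0, 1, 1]

# For binary labels, ravel() yields tn, fp, fn, tp in that order
tn, fp, fn, tp = confusion_matrix(y_true_demo, y_hat_demo).ravel()
recall = tp / (tp + fn)       # TPR = 2/3
precision = tp / (tp + fp)    # PPV = 2/3
f1 = 2 * precision * recall / (precision + recall)
print(f1, f1_score(y_true_demo, y_hat_demo))  # both ~0.6667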

The trickiest parameter of sklearn.metrics.f1_score is average, which has several options: None, 'binary' (default), 'micro', 'macro', 'samples', 'weighted'. Briefly:

- None: return the F1-score of each class separately, with no averaging.
- 'binary': report the F1-score of the class given by pos_label only (1 by default); only valid for binary targets.
- 'micro': compute the metric globally, counting total true positives, false negatives, and false positives over all classes.
- 'macro': compute the F1-score of each class and take their unweighted mean, so every class counts equally regardless of size.
- 'weighted': like 'macro', but weight each class's F1-score by its support (number of true samples of that class).
- 'samples': compute the metric per sample and average; only meaningful for multilabel classification.

A quick example

Below are the F1-score values of a binary classification model under the different average options:

from sklearn.metrics import f1_score

# y_true: ground-truth labels; y_hat: the model's predicted labels
print('None:', f1_score(y_true, y_hat, average=None))
print('binary:', f1_score(y_true, y_hat, average='binary'))
print('micro:', f1_score(y_true, y_hat, average='micro'))
print('macro:', f1_score(y_true, y_hat, average='macro'))
print('weighted:', f1_score(y_true, y_hat, average='weighted'))
# Output
None: [ 0.674  0.744]
binary: 0.744013475371028
micro: 0.713331633953607
macro: 0.7091534012629005
weighted: 0.7110354720495692

# Verify the macro F1-score: the unweighted mean of the two per-class scores
macro = (0.674 + 0.744)/2  # = 0.709

# Class sample counts: label-0 = 37112, label-1 = 41348. Verify the weighted F1-score:
0.674*37112/(37112+41348) + 0.744*41348/(37112+41348)  # 0.710889, slightly off due to rounding of the per-class scores
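
The 'micro' value can be checked the same way: for single-label problems, micro-averaging counts every correct prediction as a true positive over all classes, so micro-F1 reduces to plain accuracy. A minimal check, assuming the same y_true and y_hat as above:

from sklearn.metrics import accuracy_score

# For single-label classification, micro-averaged F1 equals accuracy,
# so this should print ~0.7133, matching the 'micro' output above
print(accuracy_score(y_true, y_hat))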

Finding the best binary (positive-label) F1-score threshold for a binary classifier

For binary classification, sklearn.metrics.precision_recall_curve gives the precision and recall at every probability threshold. Note, however, that it computes the recall and precision of the positive label only. If the positive label's recall and precision are all you care about, you can use the thresholds returned by this API, together with their corresponding recall and precision values, to find the best decision threshold.

import numpy as np
from sklearn.metrics import precision_recall_curve

# precs and recs are one element longer than thrs; drop the final
# (precision=1, recall=0) point, which has no corresponding threshold,
# and add a small epsilon to avoid division by zero
precs, recs, thrs = precision_recall_curve(y_true, y_prob)
f1s = 2 * precs[:-1] * recs[:-1] / (precs[:-1] + recs[:-1] + 1e-12)
best_thresh = thrs[np.argmax(f1s)]

Note that this considers only the positive label's F1-score. If the F1-scores of both the positive and negative classes matter, the threshold found this way is not necessarily optimal for their combined (e.g., macro-averaged) F1-score.
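
If the combined score is what matters, one option is a brute-force scan: evaluate the macro-averaged F1 at a grid of candidate thresholds and keep the best. A minimal sketch, reusing the y_true and y_prob names assumed above:

import numpy as np
from sklearn.metrics import f1_score

# Evaluate macro F1 at a grid of candidate thresholds and keep the best
thresholds = np.linspace(0.01, 0.99, 99)
macro_f1s = [f1_score(y_true, (np.asarray(y_prob) >= t).astype(int), average='macro')
             for t in thresholds]
best_macro_thresh = thresholds[int(np.argmax(macro_f1s))]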

Appendix: using thresholds to find the best accuracy

import numpy as np
from sklearn.metrics import accuracy_score, roc_curve

# Use the thresholds at which the ROC curve changes as candidate cutoffs
fpr, tpr, thresholds = roc_curve(y_true, probs)
accuracy_scores = []
for thresh in thresholds:
    accuracy_scores.append(
        accuracy_score(y_true, [1 if m > thresh else 0 for m in probs]))

accuracies = np.array(accuracy_scores)
max_accuracy = accuracies.max()
max_accuracy_threshold = thresholds[accuracies.argmax()]
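
One caveat: by convention, the first element of the thresholds array returned by roc_curve is set above the maximum score (np.inf in recent scikit-learn releases), so the first candidate corresponds to predicting every sample as negative.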