Image Assessment Algorithm: NIMA

2019-02-14 SpikeKing

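In vision-related products, images uploaded by users often need a rough quality score that can feed recommendation or user profiling. That raises the question of how to assess whether an image is good or bad.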

NIMA

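This post covers the paper NIMA: Neural Image Assessment, published in IEEE Transactions on Image Processing (TIP) in 2018, which uses a neural network to assess images.

Dataset

Dataset: AVA (Aesthetic Visual Analysis), which ships with annotations.

The labels look like this: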

11 953417 0 0 0 5 32 50 23 10 3 1 22 0 1396
12 953777 0 3 2 3 13 40 35 21 8 3 20 53 1396
13 953756 0 2 3 9 35 50 20 5 2 2 0 0 1396
14 954195 0 1 7 26 56 23 6 1 0 2 0 0 1396
15 953903 0 1 4 5 33 50 17 9 3 2 21 28 1396
16 954222 0 1 2 4 18 41 29 17 10 4 9 24 1396
17 953889 0 1 0 9 20 40 29 14 11 6 0 0 1396
18 953844 0 0 1 6 40 51 20 6 4 0 15 19 1396
19 954104 0 0 2 7 35 47 19 8 5 0 0 0 1396
20 954229 0 0 2 10 50 42 12 4 3 1 0 0 1396

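Where:

Column 1 is the image index;
Column 2 is the image ID;
Columns 3-12 are the rating distribution, i.e. the number of votes for each score from 1 to 10. Each image has about 210 votes on average: "The number of votes per image ranges from 78 to 549, with an average of 210 votes";
Columns 13-14 are two category tags (see the tags file). An image may have fewer than two tags, and missing ones are padded with 0;
Column 15 is the challenge ID. The roughly 250,000 images come from 1,447 challenges: "We created AVA by collecting approximately 255,000 images covering a wide variety of subjects on 1,447 challenges";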
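To make the column layout concrete, here is a minimal sketch that parses one label line into its fields and normalizes the vote counts into a distribution. The function name `parse_ava_line` and the dictionary keys are illustrative choices, not part of the AVA tooling; the sketch assumes whitespace-separated integer fields in the order listed above.

```python
def parse_ava_line(line):
    """Parse one AVA label line into named fields (illustrative helper)."""
    fields = [int(v) for v in line.split()]
    if len(fields) != 15:
        raise ValueError("expected 15 columns, got %d" % len(fields))

    votes = fields[2:12]                      # columns 3-12: votes for scores 1..10
    total = sum(votes)
    distribution = [v / total for v in votes]  # normalize to a probability distribution
    # Mean score = sum over buckets of (score * probability)
    mean_score = sum(s * p for s, p in enumerate(distribution, start=1))

    return {
        "index": fields[0],                             # column 1
        "image_id": fields[1],                          # column 2
        "votes": votes,
        "distribution": distribution,
        "mean_score": mean_score,
        "tags": [t for t in fields[12:14] if t != 0],   # columns 13-14, 0 means "no tag"
        "challenge_id": fields[14],                     # column 15
    }


row = parse_ava_line("11 953417 0 0 0 5 32 50 23 10 3 1 22 0 1396")
print(row["image_id"], row["tags"], round(row["mean_score"], 3))
```

The normalized distribution is what a model like NIMA would be trained to predict, so this is the natural form for a training target.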

The label distribution is as follows:

Distribution

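Training the Model

GitHub: neural-image-assessment

For training, see train_mobilenet.py

Modifications to MobileNet:

The input image is 224x224x3;
Remove MobileNet's fully connected (top) layers;
Use average pooling for the global pooling step;
Add a Dropout layer (rate 0.75), then add a 10-value output that regresses the score distribution;
Optimizer: Adam;
Loss function: earth_mover_loss;
Initialize from pretrained MobileNet weights;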
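The list above can be sketched in tf.keras roughly as follows. This is not the repository's train_mobilenet.py; it is a minimal reconstruction under assumptions: the softmax activation on the 10-way output, the learning rate, and the names `build_nima_mobilenet` and `emd_loss` are all my own choices, and the loss is a tf-ops rewrite of the EMD loss discussed in this post.

```python
import tensorflow as tf
from tensorflow.keras.applications import MobileNet
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.models import Model


def emd_loss(y_true, y_pred):
    # Root mean squared difference between the two CDFs, averaged over the batch
    cdf_diff = tf.cumsum(y_true, axis=-1) - tf.cumsum(y_pred, axis=-1)
    samplewise = tf.sqrt(tf.reduce_mean(tf.square(cdf_diff), axis=-1))
    return tf.reduce_mean(samplewise)


def build_nima_mobilenet(weights="imagenet", learning_rate=1e-3):
    # include_top=False drops the fully connected classifier;
    # pooling="avg" applies global average pooling to the last feature map.
    base = MobileNet(input_shape=(224, 224, 3), include_top=False,
                     pooling="avg", weights=weights)
    x = Dropout(0.75)(base.output)
    # Assumption: softmax so the 10 outputs form a distribution over scores 1..10
    scores = Dense(10, activation="softmax")(x)

    model = Model(inputs=base.input, outputs=scores)
    # learning_rate is a placeholder value, not taken from the original post
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
                  loss=emd_loss)
    return model
```

Passing `weights=None` builds the same architecture with random initialization, which is handy for a quick shape check without downloading the pretrained weights.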

GAP

EMD (earth mover's distance) loss, walked through step by step:

import numpy as np
import tensorflow as tf
from keras import backend as K

# Note: this script targets TensorFlow 1.x (tf.Session) with standalone Keras.


def main():
    # Toy inputs: two samples with three buckets each (not normalized distributions)
    arr1 = np.array([[0., 1., 2.], [0., 1., 2.]])
    arr2 = np.array([[1., 3., 5.], [2., 4., 6.]])
    print(arr1, arr2)

    sess = tf.Session()

    # Step 1: cumulative sums along the bucket axis give the two CDFs
    cdf_ytrue = K.cumsum(arr1, axis=-1)
    cdf_ypred = K.cumsum(arr2, axis=-1)

    v_cdf_ytrue = sess.run(cdf_ytrue)
    v_cdf_ypred = sess.run(cdf_ypred)
    print(v_cdf_ytrue, v_cdf_ypred)

    # Step 2: per-sample EMD (r = 2), the root mean square of the CDF differences
    samplewise_emd = K.sqrt(K.mean(K.square(K.abs(cdf_ytrue - cdf_ypred)), axis=-1))

    v_samplewise_emd = sess.run(samplewise_emd)
    print(v_samplewise_emd)

    # Step 3: average over the batch to get a scalar loss
    loss = K.mean(samplewise_emd)

    v_loss = sess.run(loss)
    print(v_loss)


def earth_mover_loss(y_true, y_pred):
    """Same computation as in main(), packaged as a Keras loss function."""
    cdf_ytrue = K.cumsum(y_true, axis=-1)
    cdf_ypred = K.cumsum(y_pred, axis=-1)
    samplewise_emd = K.sqrt(K.mean(K.square(K.abs(cdf_ytrue - cdf_ypred)), axis=-1))
    return K.mean(samplewise_emd)


if __name__ == '__main__':
    main()
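Since the script above depends on a TensorFlow 1.x session, a NumPy-only version can be useful for checking the numbers by hand. The following is a sketch of the same three steps (cumulative sum, root mean square of the CDF difference, batch mean); the name `emd_numpy` is hypothetical and this is not code from the repository.

```python
import numpy as np


def emd_numpy(y_true, y_pred):
    """NumPy equivalent of earth_mover_loss, for checking values by hand."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    # CDFs along the bucket axis
    cdf_true = np.cumsum(y_true, axis=-1)
    cdf_pred = np.cumsum(y_pred, axis=-1)
    # Per-sample root mean squared CDF difference
    samplewise = np.sqrt(np.mean((cdf_true - cdf_pred) ** 2, axis=-1))
    # Batch average
    return float(np.mean(samplewise))


arr1 = [[0., 1., 2.], [0., 1., 2.]]
arr2 = [[1., 3., 5.], [2., 4., 6.]]
print(emd_numpy(arr1, arr2))
```

For the first toy sample the CDFs are [0, 1, 3] and [1, 4, 9], so the squared differences are 1, 9 and 36, and the per-sample value is the square root of their mean.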

EMD formula:

EMD formula
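Written out, the quantity computed per sample by earth_mover_loss is the following. This is a reconstruction from that code rather than a copy of the original figure: N is the number of score buckets (10 for AVA), p and \hat{p} are the true and predicted distributions, and the code corresponds to r = 2.

```latex
\mathrm{EMD}(p, \hat{p}) =
  \left( \frac{1}{N} \sum_{k=1}^{N}
    \left| \mathrm{CDF}_{p}(k) - \mathrm{CDF}_{\hat{p}}(k) \right|^{r}
  \right)^{1/r},
\qquad
\mathrm{CDF}_{p}(k) = \sum_{i=1}^{k} p_{i}
```

The final loss is the mean of this value over the samples in a batch.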

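Inference

Predicting on an image:

Preprocess the image;
Compute the mean and variance of the predicted score distribution: the mean is the score, and the variance measures ambiguity, i.e. how much raters would disagree;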
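The second step can be sketched as below, assuming the model outputs 10 probabilities for the scores 1 through 10. The function name `score_from_distribution` is hypothetical, and the renormalization is a defensive assumption in case the outputs do not sum exactly to 1.

```python
import numpy as np


def score_from_distribution(probs):
    """Return (mean, variance) of a distribution over the scores 1..10."""
    probs = np.asarray(probs, dtype=np.float64)
    probs = probs / probs.sum()          # renormalize defensively
    scores = np.arange(1, 11)            # score values 1, 2, ..., 10
    mean = float(np.sum(scores * probs))
    variance = float(np.sum(probs * (scores - mean) ** 2))
    return mean, variance


# A prediction concentrated around 6 has a clear score and low ambiguity
mean, variance = score_from_distribution([0, 0, 0, 0.05, 0.25, 0.4, 0.2, 0.1, 0, 0])
print(mean, variance)
```

A flat distribution gives a middling mean with a large variance, which is the signal that the image is ambiguous rather than simply average.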

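Improvements

The dataset differs substantially from real data: AVA leans toward photographic works, while real data consists of casual shots taken by users. Label a dataset of real images using a four-level S/A/B/C classification.
Add junk data with a score of 0, such as policy-violating images and ID documents;
Add auxiliary signals such as sharpness and facial attractiveness;
AVA scores are concentrated around 4-5, so the output scores do not discriminate well between images.
Improvement: blend in other datasets and Meitu (美图) data to widen the spread of the model's output.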

