caffe: Training VGGNet on Your Own Dataset

2018-03-16  运动小爽

This post records the complete workflow for training VGGNet-16 on a custom image classification dataset under Ubuntu 16.04.

Step 1: Prepare your own image classification dataset

For the preparation steps, see the earlier article "Caffe dataset format conversion: image format to LMDB/LEVELDB".

The result is two folders: vgg_train_lmdb and vgg_val_lmdb.

Create a new folder named vggnet under caffe/models/, and move the vgg_train_lmdb and vgg_val_lmdb folders into caffe/models/vggnet/.


The dataset is now ready.
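
As a rough illustration of what comes before the LMDB conversion, the sketch below builds the "relative/path label" lines that caffe's convert_imageset tool reads from a list file. The one-subfolder-per-class layout, the function name build_list_lines, and the sample class names are assumptions made for illustration; they are not taken from the referenced article.

```python
import os
import tempfile


def build_list_lines(root_dir):
    """Return 'relative/path label' lines, one per image, for a root
    directory that holds one subfolder per class (assumed layout).
    Labels are integers starting at 0, in sorted class-name order."""
    classes = sorted(
        d for d in os.listdir(root_dir)
        if os.path.isdir(os.path.join(root_dir, d))
    )
    lines = []
    for label, cls in enumerate(classes):
        for fname in sorted(os.listdir(os.path.join(root_dir, cls))):
            # Skip anything that does not look like an image file.
            if fname.lower().endswith((".jpg", ".jpeg", ".png")):
                lines.append("{}/{} {}".format(cls, fname, label))
    return lines


if __name__ == "__main__":
    # Tiny throwaway dataset so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as root:
        for cls in ("cat", "dog"):
            os.makedirs(os.path.join(root, cls))
            open(os.path.join(root, cls, "0001.jpg"), "w").close()
        for line in build_list_lines(root):
            print(line)
```

The lines would be written to one list file for training and one for validation, and each list would then be passed to convert_imageset together with the image root folder.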

Step 2: Prepare the VGGNet-16 model file vggnet_train_val.prototxt

The VGG16 deploy file can be found in the Model Zoo on the caffe website: VGG_ILSVRC_16_layers_deploy.prototxt

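Probably because that model file was created fairly early, many of its naming conventions feel unfamiliar. For example, every layer type in it is written in all caps, whereas newer versions of caffe capitalize only the first letter of each word (for example "Convolution" and "InnerProduct"). Its style also differs in other ways from the model definitions shipped under /caffe/models/. So, following the VGGNet paper and adapting to my own dataset, I rewrote the network as vggnet_train_val.prototxt.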
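
The thirteen convolution layers in this file differ only in their name, input, and number of outputs, so the repeated text can be generated instead of typed. The following is a minimal sketch of that idea, not how the file in this post was actually produced; CONV_RELU_TEMPLATE and conv_relu_block are hypothetical names.

```python
# Prototxt text for one 3x3 Convolution layer followed by an in-place ReLU.
# Literal braces are doubled because the template goes through str.format.
CONV_RELU_TEMPLATE = """layer {{
  name: "{conv}"
  type: "Convolution"
  bottom: "{bottom}"
  top: "{conv}"
  param {{
    lr_mult: 1
  }}
  param {{
    lr_mult: 1
  }}
  convolution_param {{
    num_output: {num_output}
    kernel_size: 3
    pad: 1
    stride: 1
    weight_filler {{
      type: "xavier"
    }}
    bias_filler {{
      type: "constant"
    }}
  }}
}}
layer {{
  name: "{relu}"
  type: "ReLU"
  bottom: "{conv}"
  top: "{conv}"
}}
"""


def conv_relu_block(conv_name, bottom, num_output):
    """Render one Convolution layer plus its in-place ReLU as prototxt text."""
    return CONV_RELU_TEMPLATE.format(
        conv=conv_name,
        relu=conv_name.replace("conv", "relu", 1),
        bottom=bottom,
        num_output=num_output,
    )


if __name__ == "__main__":
    print(conv_relu_block("conv1_1", "data", 64), end="")
    print(conv_relu_block("conv1_2", "conv1_1", 64), end="")
```

The data, pooling, fully connected, accuracy, and loss layers would still be written by hand or given templates of their own.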

vggnet_train_val.prototxt:

name: "VGGNet"
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    crop_size: 224
    mean_value: 104
    mean_value: 117
    mean_value: 123
    mirror: true
  }
  data_param {
    source: "models/vggnet/vgg_train_lmdb" # note the path to the training set
    batch_size: 32  # pick a batch size that fits your GPU memory; 64 gave me "out of memory", so I lowered it to 32
    backend: LMDB
  }
}
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    crop_size: 224
    mean_value: 104
    mean_value: 117
    mean_value: 123
    mirror: false
  }
  data_param {
    source: "models/vggnet/vgg_val_lmdb" # note the path to the validation set
    batch_size: 32
    backend: LMDB
  }
}
layer {
  name: "conv1_1"
  type: "Convolution"
  bottom: "data"
  top: "conv1_1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 1
  }
  convolution_param {
    num_output: 64
    kernel_size: 3
    pad: 1
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu1_1"
  type: "ReLU"
  bottom: "conv1_1"
  top: "conv1_1"
}

layer {
  name: "conv1_2"
  type: "Convolution"
  bottom: "conv1_1"
  top: "conv1_2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 1
  }
  convolution_param {
    num_output: 64
    kernel_size: 3
    pad: 1
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu1_2"
  type: "ReLU"
  bottom: "conv1_2"
  top: "conv1_2"
}

layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1_2"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}

layer {
  name: "conv2_1"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2_1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 1
  }
  convolution_param {
    num_output: 128
    kernel_size: 3
    pad: 1
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu2_1"
  type: "ReLU"
  bottom: "conv2_1"
  top: "conv2_1"
}
layer {
  name: "conv2_2"
  type: "Convolution"
  bottom: "conv2_1"
  top: "conv2_2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 1
  }
  convolution_param {
    num_output: 128
    kernel_size: 3
    pad: 1
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu2_2"
  type: "ReLU"
  bottom: "conv2_2"
  top: "conv2_2"
}

layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2_2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}

layer {
  name: "conv3_1"
  type: "Convolution"
  bottom: "pool2"
  top: "conv3_1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 1
  }
  convolution_param {
    num_output: 256
    kernel_size: 3
    pad: 1
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu3_1"
  type: "ReLU"
  bottom: "conv3_1"
  top: "conv3_1"
}

layer {
  name: "conv3_2"
  type: "Convolution"
  bottom: "conv3_1"
  top: "conv3_2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 1
  }
  convolution_param {
    num_output: 256
    kernel_size: 3
    pad: 1
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu3_2"
  type: "ReLU"
  bottom: "conv3_2"
  top: "conv3_2"
}

layer {
  name: "conv3_3"
  type: "Convolution"
  bottom: "conv3_2"
  top: "conv3_3"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 1
  }
  convolution_param {
    num_output: 256
    kernel_size: 3
    pad: 1
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu3_3"
  type: "ReLU"
  bottom: "conv3_3"
  top: "conv3_3"
}

layer {
  name: "pool3"
  type: "Pooling"
  bottom: "conv3_3"
  top: "pool3"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}

layer {
  name: "conv4_1"
  type: "Convolution"
  bottom: "pool3"
  top: "conv4_1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 1
  }
  convolution_param {
    num_output: 512
    kernel_size: 3
    pad: 1
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu4_1"
  type: "ReLU"
  bottom: "conv4_1"
  top: "conv4_1"
}

layer {
  name: "conv4_2"
  type: "Convolution"
  bottom: "conv4_1"
  top: "conv4_2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 1
  }
  convolution_param {
    num_output: 512
    kernel_size: 3
    pad: 1
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu4_2"
  type: "ReLU"
  bottom: "conv4_2"
  top: "conv4_2"
}

layer {
  name: "conv4_3"
  type: "Convolution"
  bottom: "conv4_2"
  top: "conv4_3"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 1
  }
  convolution_param {
    num_output: 512
    kernel_size: 3
    pad: 1
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu4_3"
  type: "ReLU"
  bottom: "conv4_3"
  top: "conv4_3"
}

layer {
  name: "pool4"
  type: "Pooling"
  bottom: "conv4_3"
  top: "pool4"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}

layer {
  name: "conv5_1"
  type: "Convolution"
  bottom: "pool4"
  top: "conv5_1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 1
  }
  convolution_param {
    num_output: 512
    kernel_size: 3
    pad: 1
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu5_1"
  type: "ReLU"
  bottom: "conv5_1"
  top: "conv5_1"
}

layer {
  name: "conv5_2"
  type: "Convolution"
  bottom: "conv5_1"
  top: "conv5_2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 1
  }
  convolution_param {
    num_output: 512
    kernel_size: 3
    pad: 1
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu5_2"
  type: "ReLU"
  bottom: "conv5_2"
  top: "conv5_2"
}

layer {
  name: "conv5_3"
  type: "Convolution"
  bottom: "conv5_2"
  top: "conv5_3"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 1
  }
  convolution_param {
    num_output: 512
    kernel_size: 3
    pad: 1
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu5_3"
  type: "ReLU"
  bottom: "conv5_3"
  top: "conv5_3"
}

layer {
  name: "pool5"
  type: "Pooling"
  bottom: "conv5_3"
  top: "pool5"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}

layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 1
  }
  inner_product_param {
    num_output: 4096
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu6"
  type: "ReLU"
  bottom: "fc6"
  top: "fc6"
}
layer {
  name: "drop6"
  type: "Dropout"
  bottom: "fc6"
  top: "fc6"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc7"
  type: "InnerProduct"
  bottom: "fc6"
  top: "fc7"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 1
  }
  inner_product_param {
    num_output: 4096
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu7"
  type: "ReLU"
  bottom: "fc7"
  top: "fc7"
}
layer {
  name: "drop7"
  type: "Dropout"
  bottom: "fc7"
  top: "fc7"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc8"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 10              # set fc8 to the number of classes in your own dataset
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "fc8"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc8"
  bottom: "label"
  top: "loss"
}
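
To sanity-check the network definition above, it helps to trace the spatial size of the feature maps from the 224x224 crop down to pool5. The sketch below assumes caffe's usual output-size rules (convolution rounds down, pooling rounds up); the helper names are made up for illustration.

```python
# (output channels, number of 3x3 conv layers) for the five blocks above.
VGG16_BLOCKS = [(64, 2), (128, 2), (256, 3), (512, 3), (512, 3)]


def conv_out(size, kernel, pad, stride):
    """Spatial output size of a convolution (rounded down)."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, stride):
    """Spatial output size of an unpadded pooling layer (rounded up)."""
    return -(-(size - kernel) // stride) + 1


def trace_shapes(input_size=224):
    """Return (name, channels, height, width) after each pooling layer."""
    shapes = []
    size = input_size
    for i, (channels, n_convs) in enumerate(VGG16_BLOCKS, start=1):
        for _ in range(n_convs):
            # 3x3 kernel, pad 1, stride 1 keeps the spatial size unchanged.
            size = conv_out(size, kernel=3, pad=1, stride=1)
        # 2x2 max pooling with stride 2 halves the spatial size.
        size = pool_out(size, kernel=2, stride=2)
        shapes.append(("pool{}".format(i), channels, size, size))
    return shapes


if __name__ == "__main__":
    for name, c, h, w in trace_shapes():
        print("{}: {} x {} x {}".format(name, c, h, w))
    _, c, h, w = trace_shapes()[-1]
    print("fc6 inputs:", c * h * w)
```

With a 224x224 crop, pool5 comes out as 512 x 7 x 7, so fc6 sees 25088 inputs per image.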

Step 3: Prepare the VGGNet-16 solver file vggnet_solver.prototxt

This step is fairly simple; just follow the solver files of other models:

vggnet_solver.prototxt:

net: "models/vggnet/vggnet_train_val.prototxt"
test_iter: 10
test_interval: 500  # run a validation pass every 500 training iterations to check accuracy
base_lr: 0.01
lr_policy: "step"
gamma: 0.1
stepsize: 1000
display: 20
max_iter: 2000  # this is only a practice run, so 2000 iterations is enough
momentum: 0.9
weight_decay: 0.0005
snapshot: 1000  # save a snapshot every 1000 training iterations
snapshot_prefix: "models/vggnet/vggnet_train"
solver_mode: GPU
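
To make the solver settings concrete: with lr_policy "step", caffe uses base_lr * gamma ^ floor(iter / stepsize) as the learning rate, and each validation pass runs test_iter batches of the TEST data layer. The sketch below plugs in the values from this solver file and the batch_size of 32 from the model file; the function names are illustrative only.

```python
def step_lr(iteration, base_lr=0.01, gamma=0.1, stepsize=1000):
    """Learning rate under the "step" policy at a given iteration."""
    return base_lr * gamma ** (iteration // stepsize)


def images_per_test_pass(test_iter=10, batch_size=32):
    """Number of validation images scored in one test pass."""
    return test_iter * batch_size


if __name__ == "__main__":
    for it in (0, 999, 1000, 1999):
        print("iteration {:4d}: lr = {:g}".format(it, step_lr(it)))
    print("images per test pass:", images_per_test_pass())
```

So the run trains at 0.01 for the first 1000 iterations and at 0.001 for the second 1000, and every reported accuracy is computed on 320 validation images.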

Step 4: Prepare the training shell script for VGGNet-16

Write the script file train_vggnet.sh.

train_vggnet.sh:

#!/usr/bin/env sh
set -e

./build/tools/caffe train \
    --solver=models/vggnet/vggnet_solver.prototxt "$@"

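All the preparation is done! Of course, you need a working caffe GPU environment first. The /caffe/models/vggnet/ directory now contains the vgg_train_lmdb and vgg_val_lmdb folders plus vggnet_train_val.prototxt, vggnet_solver.prototxt, and train_vggnet.sh.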

Step 5: Run the training

Open bash in the caffe/ directory and run the following command:

bash ./models/vggnet/train_vggnet.sh

Below is the validation-set accuracy printed during training:

I0316 22:12:22.387151 12448 solver.cpp:351] Iteration 0, Testing net (#0)
I0316 22:12:23.484271 12448 solver.cpp:418]     Test net output #0: accuracy = 0.10625
...
I0316 22:15:08.977102 12448 solver.cpp:351] Iteration 500, Testing net (#0)
I0316 22:15:10.174433 12448 solver.cpp:418]     Test net output #0: accuracy = 0.340625
...
I0316 22:17:58.450613 12448 solver.cpp:351] Iteration 1000, Testing net (#0)
I0316 22:17:59.427794 12448 solver.cpp:418]     Test net output #0: accuracy = 0.359375
...
I0316 22:20:45.280406 12448 solver.cpp:351] Iteration 1500, Testing net (#0)
I0316 22:20:46.432967 12448 solver.cpp:418]     Test net output #0: accuracy = 0.459375
...
I0316 22:23:34.968350 12448 solver.cpp:351] Iteration 2000, Testing net (#0)
I0316 22:23:35.955927 12448 solver.cpp:418]     Test net output #0: accuracy = 0.51875
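
If the full training output is saved to a file, the accuracy values can be pulled out with a few lines of Python instead of being read by eye. This is only a sketch that assumes the log format shown above; the function name and SAMPLE_LOG are illustrative, and the 320 images per pass come from test_iter (10) times the TEST batch_size (32).

```python
import re

# A few lines copied from the log above, used as sample input.
SAMPLE_LOG = """\
I0316 22:12:22.387151 12448 solver.cpp:351] Iteration 0, Testing net (#0)
I0316 22:12:23.484271 12448 solver.cpp:418]     Test net output #0: accuracy = 0.10625
...
I0316 22:23:34.968350 12448 solver.cpp:351] Iteration 2000, Testing net (#0)
I0316 22:23:35.955927 12448 solver.cpp:418]     Test net output #0: accuracy = 0.51875
"""

ITER_RE = re.compile(r"Iteration (\d+), Testing net")
ACC_RE = re.compile(r"accuracy = ([0-9.]+)")


def parse_test_accuracy(log_text):
    """Return a list of (iteration, accuracy) pairs found in the log text."""
    results = []
    current_iter = None
    for line in log_text.splitlines():
        match = ITER_RE.search(line)
        if match:
            current_iter = int(match.group(1))
            continue
        match = ACC_RE.search(line)
        if match and current_iter is not None:
            results.append((current_iter, float(match.group(1))))
            current_iter = None
    return results


if __name__ == "__main__":
    images_per_pass = 10 * 32  # test_iter * TEST batch_size
    for iteration, acc in parse_test_accuracy(SAMPLE_LOG):
        correct = round(acc * images_per_pass)
        print("iter {:4d}: accuracy {:.6f} ({} of {} images)".format(
            iteration, acc, correct, images_per_pass))
```

Each accuracy in the log is a whole number of correct images out of 320; for example 0.10625 is 34 of 320 and 0.51875 is 166 of 320.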

Finally: evaluating the results and summary

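Judging from the printed accuracy, the model started training without trouble, and its top-1 accuracy rose steadily as training went on, from 0.10625 at iteration 0 to 0.51875 at iteration 2000. This shows that the preparation workflow for training the classification model is sound.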

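In fact, the training set is very small, only about 1/100 the size of the ImageNet2012 classification dataset, while VGGNet-16 has roughly 140 million parameters, so it is easy to guess that the model will overfit severely. That is fine, though: the point of this exercise is to learn the complete workflow of training an image classification model on your own image dataset.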
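
The parameter count can be checked by hand from the layer definitions. The sketch below counts weights and biases for the thirteen 3x3 convolution layers and three fully connected layers, assuming a 3-channel 224x224 input so that pool5 is 7x7; the names are illustrative.

```python
# num_output of the 13 convolution layers, in order.
CONV_CHANNELS = [64, 64, 128, 128, 256, 256, 256,
                 512, 512, 512, 512, 512, 512]


def vgg16_param_count(num_classes, in_channels=3, kernel=3,
                      fc_dim=4096, pool5_size=7):
    """Total number of weights and biases in VGGNet-16."""
    total = 0
    prev = in_channels
    for out in CONV_CHANNELS:
        # kernel*kernel weights per input/output channel pair, plus biases.
        total += prev * out * kernel * kernel + out
        prev = out
    fc_in = prev * pool5_size * pool5_size  # flattened pool5 output
    for out in (fc_dim, fc_dim, num_classes):
        total += fc_in * out + out
        fc_in = out
    return total


if __name__ == "__main__":
    print("1000 classes:", vgg16_param_count(1000))
    print("  10 classes:", vgg16_param_count(10))
```

The 140 million figure matches the 1000-class ImageNet network (138,357,544 parameters); with fc8 reduced to 10 outputs as in this post, the total is 134,301,514, most of it in fc6.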

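To summarize, training the classification model involves these steps: prepare the dataset in LMDB format, write the model file vggnet_train_val.prototxt, write the solver file vggnet_solver.prototxt, write the training script train_vggnet.sh, and run the training, which comes down to this command:

./build/tools/caffe train \
    --solver=models/vggnet/vggnet_solver.prototxt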