
[yolo] - How to understand the yolo (Darknet) cfg file

2018-12-13  phoenixmy

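The yolo cfg file is rich in content and can be used to configure many network parameters. I have not yet found a particularly detailed introduction, so I have collected the scattered descriptions available online below: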

Explanation from the darknet (AlexeyAB fork) author

  1. saturation, exposure and hue values - ranges for random colour changes applied to images during training (parameters for data augmentation), expressed in HSV terms: https://en.wikipedia.org/wiki/HSL_and_HSV
    The larger the value, the more invariant the neural network becomes to changes in lighting and object colour.
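
The colour augmentation described in item 1 can be sketched roughly as follows. This is a minimal illustration, not darknet's own code: the `hsv_pixel` type and the `distort_hsv` function are hypothetical names, all channels are assumed to lie in [0, 1], and the assumption is that hue acts as an additive shift drawn from [-hue, hue] while saturation and exposure act as multiplicative factors drawn from [1/value, value].

```c
/* Hypothetical HSV pixel; every channel is assumed to lie in [0, 1]. */
typedef struct {
    float h, s, v;
} hsv_pixel;

static float clamp01(float x) {
    if (x < 0.0f) return 0.0f;
    if (x > 1.0f) return 1.0f;
    return x;
}

/*
 * Apply one colour distortion to a single HSV pixel.
 *   dhue: additive shift, assumed to be sampled from [-hue, hue]
 *   dsat: multiplicative factor, assumed sampled from [1/saturation, saturation]
 *   dexp: multiplicative factor, assumed sampled from [1/exposure, exposure]
 * The caller does the random sampling, so this function is deterministic.
 */
hsv_pixel distort_hsv(hsv_pixel p, float dhue, float dsat, float dexp) {
    hsv_pixel out;

    out.h = p.h + dhue;
    if (out.h > 1.0f) out.h -= 1.0f;  /* hue is circular, so wrap around */
    if (out.h < 0.0f) out.h += 1.0f;

    out.s = clamp01(p.s * dsat);      /* scale saturation, keep it valid */
    out.v = clamp01(p.v * dexp);      /* scale brightness, keep it valid */
    return out;
}
```

With `saturation = 1.5` and `exposure = 1.5` in the cfg, the factors would range over roughly [0.67, 1.5], which is why larger values expose the network to a wider spread of lighting conditions.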

  2. steps and scales values - steps are checkpoints (iteration counts) at which scales are applied; scales are the coefficients by which learning_rate is multiplied at those checkpoints.
    Together they determine how learning_rate changes as the number of training iterations increases.
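
The step schedule in item 2 can be sketched as a small function. This is an illustration under the description above, not darknet's own code; the name `step_learning_rate` is hypothetical, and `steps` is assumed to be sorted in increasing order.

```c
/*
 * Learning rate at iteration `iter` under a step policy.
 * steps[i] is the iteration count at which scales[i] starts to apply.
 * Assumes `steps` is sorted in increasing order and both arrays hold n entries.
 */
float step_learning_rate(float base_lr, const int *steps, const float *scales,
                         int n, int iter) {
    float lr = base_lr;
    for (int i = 0; i < n; ++i) {
        if (iter < steps[i]) {
            break;          /* this checkpoint has not been reached yet */
        }
        lr *= scales[i];    /* each passed checkpoint multiplies the rate */
    }
    return lr;
}
```

For example, with `learning_rate=0.0001`, `steps=100,25000,35000` and `scales=10,.1,.1` (values chosen here for illustration), the rate would be 0.0001 before iteration 100, 0.001 until iteration 25000, 0.0001 until iteration 35000, and 0.00001 afterwards.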

  3. anchors, bias_match
    anchors are the most frequent initial <width,height> of objects, expressed in units of the network's output resolution.
    bias_match is used only for training: if bias_match=1, a detected object takes exactly the <width,height> of one of the anchors; if bias_match=0, the anchor's <width,height> is refined by the neural network:

    darknet/src/region_layer.c

    Lines 275 to 283 in c190406

        box pred = get_region_box(l.output, l.biases, n, index, i, j, l.w, l.h);
        if(l.bias_match){
            pred.w = l.biases[2*n];
            pred.h = l.biases[2*n+1];
            if(DOABS){
                pred.w = l.biases[2*n]/l.w;
                pred.h = l.biases[2*n+1]/l.h;
            }
        }

    If you train with height=416,width=416,random=0, then max values of anchors will be 13,13.
    But if you train with random=1, then max input resolution can be 608x608, and max values of anchors can be 19,19.
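
The 13 and 19 figures above follow from dividing the input resolution by the network's total downsampling factor. A minimal sketch, assuming a factor of 32 (which is consistent with 416 -> 13 and 608 -> 19, but is an assumption about the cfg in use); `ASSUMED_STRIDE`, `output_grid_size` and `anchor_to_pixels` are hypothetical names:

```c
/* Assumed total downsampling factor between network input and output grid. */
#define ASSUMED_STRIDE 32

/* Side length of the output grid, and so the largest meaningful anchor value. */
int output_grid_size(int input_size) {
    return input_size / ASSUMED_STRIDE;
}

/* Convert an anchor dimension from output-grid units to input pixels. */
float anchor_to_pixels(float anchor) {
    return anchor * (float)ASSUMED_STRIDE;
}
```

Under this assumption an anchor of 13 on a 416x416 input spans the whole image, since 13 * 32 = 416.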

  4. jitter, rescore, thresh
    jitter can be in [0-1] and is used to crop images during training for data augmentation. The larger the value of jitter, the more invariant the neural network becomes to changes in object size and aspect ratio:

    darknet/src/data.c

    Lines 513 to 528 in c190406

        int dw = (ow*jitter);
        int dh = (oh*jitter);

        int pleft = rand_uniform(-dw, dw);
        int pright = rand_uniform(-dw, dw);
        int ptop = rand_uniform(-dh, dh);
        int pbot = rand_uniform(-dh, dh);

        int swidth = ow - pleft - pright;
        int sheight = oh - ptop - pbot;

        float sx = (float)swidth / ow;
        float sy = (float)sheight / oh;

        int flip = random_gen()%2;
        image cropped = crop_image(orig, pleft, ptop, swidth, sheight);
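
To see how much the crop size can vary for a given jitter, the bounds implied by the snippet above can be worked out directly. This is a sketch for illustration only; `crop_range` and `jitter_crop_range` are hypothetical names, and the bounds assume both random offsets reach their extreme values at the same time.

```c
/* Hypothetical result type: smallest and largest crop size along one axis. */
typedef struct {
    int min_size;
    int max_size;
} crop_range;

/*
 * Bounds on the crop size along one axis (width or height).
 * Mirrors `dw = ow*jitter` and `swidth = ow - pleft - pright`,
 * where pleft and pright each lie in [-dw, dw].
 */
crop_range jitter_crop_range(int orig_size, float jitter) {
    int d = (int)(orig_size * jitter);  /* truncated, as in the snippet */
    crop_range r;
    r.min_size = orig_size - 2 * d;     /* both offsets equal to +d */
    r.max_size = orig_size + 2 * d;     /* both offsets equal to -d */
    return r;
}
```

For a 400-pixel-wide image and `jitter=0.25`, the crop width would range from 200 to 600 pixels; a crop wider than the original extends past the image borders.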

    rescore determines which loss (delta, cost, ...) function is used - more about this: #185 (comment)

    darknet/src/region_layer.c

    Lines 302 to 305 in c190406

        l.delta[best_index + 4] = l.object_scale * (1 - l.output[best_index + 4]) * logistic_gradient(l.output[best_index + 4]);
        if (l.rescore) {
            l.delta[best_index + 4] = l.object_scale * (iou - l.output[best_index + 4]) * logistic_gradient(l.output[best_index + 4]);
        }
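
The effect of rescore on the objectness delta can be isolated in a small standalone function. This is a sketch based on the snippet above, not a drop-in replacement; `objectness_delta` is a hypothetical name, and `logistic_gradient` is re-declared here as (1 - x) * x, which is assumed to match darknet's definition.

```c
/* Derivative of the logistic function expressed via its output x. */
static float logistic_gradient(float x) {
    return (1.0f - x) * x;
}

/*
 * Objectness delta for the best-matching box.
 *   rescore == 0: the target objectness is 1
 *   rescore != 0: the target objectness is the IoU with the ground truth
 */
float objectness_delta(float object_scale, float output, float iou, int rescore) {
    float target = rescore ? iou : 1.0f;
    return object_scale * (target - output) * logistic_gradient(output);
}
```

With `object_scale=5`, an output of 0.5 and an IoU of 0.8, the delta is about 0.625 without rescore and about 0.375 with it, so rescore pulls the objectness score towards the actual overlap rather than towards 1.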

    thresh is the minimum IoU above which delta_region_class() is used during training:

    darknet/src/region_layer.c

    Line 235 in c190406

        if (best_iou > l.thresh) {


  5. object_scale, noobject_scale, class_scale, coord_scale values - all used for training

        void delta_region_class(float *output, float *delta, int index, int class, int classes, tree *hier, float scale, float *avg_cat)

        float delta_region_box(box truth, float *x, float *biases, int n, int index, int i, int j, int w, int h, float *delta, float scale)

  6. absolute - isn't used

Explanation from Stack Overflow

Here is my current understanding of some of the variables. Not necessarily correct though:

[net]

On the left we have a single channel with 4x4 pixels. The reorganization (reorg) layer halves the width and height, then creates 4 channels, placing adjacent pixels in different channels.

figure
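
The reorganization described above can be sketched as a generic space-to-depth operation. This is a minimal illustration rather than darknet's own reorg implementation, whose memory layout may differ; the function name `space_to_depth` and the channel ordering are assumptions.

```c
/*
 * Rearrange a single-channel h x w image into stride*stride channels of
 * size (h/stride) x (w/stride). Pixels that are adjacent inside one
 * stride x stride block end up in different output channels.
 * `out` must hold h*w floats; h and w are assumed divisible by stride.
 */
void space_to_depth(const float *in, int h, int w, int stride, float *out) {
    int oh = h / stride;
    int ow = w / stride;
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            /* position inside the block selects the output channel */
            int c = (y % stride) * stride + (x % stride);
            int oy = y / stride;
            int ox = x / stride;
            out[c * oh * ow + oy * ow + ox] = in[y * w + x];
        }
    }
}
```

For a 4x4 input holding the values 0 to 15 in row-major order and `stride = 2`, the first output channel contains 0, 2, 8, 10 and the second contains 1, 3, 9, 11: each 2x2 output channel takes one pixel from every 2x2 block of the input.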

layers

Many things are more or less self-explanatory (size, stride, batch_normalize, max_batches, width, height). If you have more questions, feel free to comment.

Again, please keep in mind that I am not 100% certain about many of those.

The content above is excerpted from:
https://github.com/AlexeyAB/darknet/issues/279
https://stackoverflow.com/questions/50390836/understanding-darknets-yolo-cfg-config-files
