[Repost] YOLOv2 / YOLOv3: How to Choose Prior Boxes (Priors)

2019-01-22 dopami

The YOLOv2 paper introduces Dimension Clusters. The purpose of this clustering is to find priors for the anchors (referred to below as prior boxes).

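What is a prior box? Put simply, in YOLOv1 the author ran into a problem: experiments showed that using two boxes is optimal, but how should the sizes of those two boxes be decided? The network can learn to adjust the box sizes by itself, but wouldn't it be better to provide one or more candidate sizes in advance? So the author decided to run k-means clustering on the training set bounding boxes to find the prior boxes (that is, the box sizes).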

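Standard k-means uses Euclidean distance, but with that metric larger boxes generate more error than smaller boxes. The priors we want are the ones that lead to higher IOU, and measuring with Euclidean distance would let large boxes dominate the clustering. So the author uses

d(box, centroid) = 1 - IOU(box, centroid)

as the "distance" in k-means.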

A smaller distance should mean a better match (that is, a larger IOU), which is why the distance is defined as 1 - IOU.
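As a minimal sketch of this distance, the snippet below computes 1 - IOU between boxes described only by width and height. It assumes the boxes are aligned at a common corner, which is a common simplification when clustering box sizes; the function names `iou_wh` and `iou_distance` are made up for illustration and are not taken from the paper or from Darknet.

```python
import numpy as np


def iou_wh(box, centroids):
    """IOU between one (w, h) box and an array of (w, h) centroids.

    Assumption: all boxes share a common corner, so only width and
    height matter for the overlap.
    """
    box = np.asarray(box, dtype=float)
    centroids = np.asarray(centroids, dtype=float).reshape(-1, 2)
    inter_w = np.minimum(box[0], centroids[:, 0])
    inter_h = np.minimum(box[1], centroids[:, 1])
    inter = inter_w * inter_h
    union = box[0] * box[1] + centroids[:, 0] * centroids[:, 1] - inter
    return inter / union


def iou_distance(box, centroids):
    """The k-means distance discussed above: 1 - IOU."""
    return 1.0 - iou_wh(box, centroids)


# Both pairs differ by 0.05 in width and height (same Euclidean gap),
# yet the small pair overlaps far less than the large pair.
small = iou_distance([0.10, 0.10], [[0.15, 0.15]])[0]
large = iou_distance([0.80, 0.80], [[0.85, 0.85]])[0]
print("small pair distance:", small)
print("large pair distance:", large)
```

The two pairs have the same Euclidean gap but very different IOU distances, which shows how the choice of metric changes which boxes count as close to a centroid.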

The discussion is here (a proxy may be needed to reach it from mainland China): https://groups.google.com/forum/#!topic/darknet/qrcGefJ6d5g

Several posters in that thread shared their k-means code:

Jumabek

PaulChongPeng

WillieMaddox

I tested PaulChongPeng's code on the VOC2007+2012 training set (4/46/2018); the results were as follows:

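I added an avg_IOU measurement myself. The final IOU is decent, but the anchors still differ somewhat from the author's values (reportedly the author clustered on VOC and COCO together).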
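For readers without access to the linked scripts, here is a rough sketch of what k-means with the 1 - IOU distance and an average-IOU check could look like. It is not the code from any of the posters above. The names `kmeans_iou` and `avg_iou` are hypothetical, the centroid update uses the cluster mean (some implementations use the median instead), and the input is synthetic random data standing in for real normalized (width, height) labels.

```python
import numpy as np


def iou_wh(boxes, centroids):
    """Pairwise IOU between (N, 2) boxes and (K, 2) centroids, all (w, h).

    Assumption: boxes are aligned at a common corner.
    """
    inter = (np.minimum(boxes[:, None, 0], centroids[None, :, 0])
             * np.minimum(boxes[:, None, 1], centroids[None, :, 1]))
    area_b = (boxes[:, 0] * boxes[:, 1])[:, None]
    area_c = (centroids[:, 0] * centroids[:, 1])[None, :]
    return inter / (area_b + area_c - inter)


def kmeans_iou(boxes, k, n_iter=100, seed=0):
    """Lloyd-style k-means using 1 - IOU as the distance."""
    rng = np.random.default_rng(seed)
    centroids = boxes[rng.choice(len(boxes), size=k, replace=False)]
    for _ in range(n_iter):
        # The smallest 1 - IOU is the same as the largest IOU.
        assign = np.argmax(iou_wh(boxes, centroids), axis=1)
        new_centroids = centroids.copy()
        for j in range(k):
            members = boxes[assign == j]
            if len(members):  # keep the old centroid if a cluster empties
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids


def avg_iou(boxes, centroids):
    """Mean over all boxes of the IOU with their best-matching centroid."""
    return iou_wh(boxes, centroids).max(axis=1).mean()


# Synthetic (w, h) pairs in place of real label data (an assumption).
rng = np.random.default_rng(0)
boxes = rng.uniform(0.05, 0.9, size=(500, 2))
anchors = kmeans_iou(boxes, k=5)
print("anchors (normalized):")
print(anchors)
print("avg IOU: %.4f" % avg_iou(boxes, anchors))
```

With real data, `boxes` would be loaded from the label files, and the resulting anchors would still need to be scaled to the units the cfg file expects.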

Another poster shared a standard k-means approach (which is actually not the right method here), using Euclidean distance rather than the "IOU distance":

# I wrote up a couple quick scripts to help with this: gen_boxes.sh and cluster_boxes.py.

# They operate within your directory of some_image_name.txt label files.

# Usage example is shown below:

ubuntu@host:~/data/labels$ ls *.txt | head -5

00RaKqC3eqjWCHQMIPKaeNMsdivO83GL.txt

016rMzBciA5V4SjFsCAQ1do8klvl4CWt.txt

01dfRsndtBz67TK80LCH0NAseYwh7md6.txt

03hptnlIR8YB0dNUqZ4AC9gVkwtDb5DZ.txt

04HPP8cRl0wFL0tPEpNCmwrDW74kByKB.txt

ubuntu@host:~/data/labels$ head 00RaKqC3eqjWCHQMIPKaeNMsdivO83GL.txt

0 0.502333 0.549333 0.144667 0.137333

ubuntu@host:~/data/labels$ cat gen_boxes.sh

cat *.txt | cut -d' ' -f 4,5 | sed 's/\([^ ]*\) \(.*\)/\1,\2/g' > boxes.csv

ubuntu@host:~/data/labels$ bash gen_boxes.sh

ubuntu@host:~/data/labels$ cat cluster_boxes.py

from sklearn.cluster import KMeans

import numpy as np

data = np.genfromtxt('boxes.csv', delimiter=',')

print("Example of data:")

print(data[0:10])

print("")

kmeans = KMeans(n_clusters=5, random_state=0).fit(data)

print("Cluster centers:")

print(kmeans.cluster_centers_)

print("")

print("Scaled to [0, 13]:")

print(kmeans.cluster_centers_ * 13)

print("")

print("In Darknet config format:")

def coords(x):

    return "%f,%f" % (x[0], x[1])

print("anchors= %s" % " ".join([coords(center) for center in kmeans.cluster_centers_ * 13]))

ubuntu@host:~/data/labels$ python cluster_boxes.py

Example of data:

[[ 0.144667 0.137333]

[ 0.135333 0.240667]

[ 0.145 0.146667]

[ 0.547 0.306667]

[ 0.4 0.241667]

[ 0.137 0.145 ]

[ 0.643 0.356667]

[ 0.147 0.086667]

[ 0.123 0.112 ]

[ 0.202 0.265 ]]

Cluster centers:

[[ 0.1377161 0.13268718]

[ 0.28492789 0.18958423]

[ 0.0663724 0.05359964]

[ 0.48530697 0.496173 ]

[ 0.18765588 0.27052479]]

Scaled to [0, 13]:

[[ 1.79030931 1.72493339]

[ 3.70406262 2.46459499]

[ 0.86284123 0.6967953 ]

[ 6.30899063 6.450249 ]

[ 2.43952643 3.51682231]]

# In Darknet config format:

# anchors= 1.790309,1.724933 3.704063,2.464595 0.862841,0.696795 6.308991,6.450249 2.439526,3.516822

# You can then copy that "anchors= ..." line in place of the existing one in your yolo-whatever.cfg file.

YOLOv3 keeps the prior boxes (anchors) introduced in YOLOv2, and they are computed the same way.

Why do the anchor sizes in YOLOv2 and YOLOv3 differ so noticeably?

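In YOLOv2, the author defines anchor sizes relative to the last feature map. The last feature map in YOLOv2 is 13x13, so anchor sizes fall in the range (0x0, 13x13]. An anchor of size 9x9 therefore corresponds to 288x288 pixels in the original image.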

In YOLOv3, the author switched to defining anchors relative to the original image, so anchor sizes fall in the range (0x0, input_w x input_h].

This is why the anchor sizes in the two cfg files differ so noticeably. Here is the author's own explanation:

So YOLOv2 I made some design choice errors, I made the anchor box size be relative to the feature size in the last layer. Since the network was down-sampling by 32. This means it was relative to 32 pixels so an anchor of 9x9 was actually 288px x 288px.

In YOLOv3 anchor sizes are actual pixel values. this simplifies a lot of stuff and was only a little bit harder to implement

https://github.com/pjreddie/darknet/issues/555#issuecomment-376190325
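The unit difference described in the quote can be sketched as a simple conversion. The helper names `grid_to_pixels` and `pixels_to_grid` are hypothetical, and the stride of 32 is taken from the explanation above; this is an illustration of the arithmetic, not code from Darknet.

```python
STRIDE = 32  # total down-sampling factor mentioned in the quote above


def grid_to_pixels(anchors, stride=STRIDE):
    """YOLOv2-style anchors (feature-map cells) -> sizes in pixels."""
    return [(w * stride, h * stride) for w, h in anchors]


def pixels_to_grid(anchors, stride=STRIDE):
    """Pixel-sized anchors (YOLOv3 style) -> feature-map cells."""
    return [(w / stride, h / stride) for w, h in anchors]


# The example from the quote: a 9x9 anchor is 288x288 pixels.
print(grid_to_pixels([(9, 9)]))  # [(288, 288)]

# Anchors from the k-means transcript above, scaled to [0, 13].
v2_anchors = [(1.790309, 1.724933), (6.308991, 6.450249)]
print(grid_to_pixels(v2_anchors))
```

Anchors clustered on normalized widths and heights in [0, 1] would be multiplied by 13 for a YOLOv2-style cfg, or by the input width and height for a YOLOv3-style cfg.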

---------------------

Author: Pattorio

Source: CSDN

Original: https://blog.csdn.net/Pattorio/article/details/80095511
