Generating anchors with k-means clustering
Reference: "YOLOv3 usage notes: computing anchor boxes with K-means clustering", Gotta-C's blog, https://blog.csdn.net/cgt19910923/article/details/82154401
Original project: https://github.com/lars76/kmeans-anchor-boxes
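The 9 anchors in YOLOv3 (in yolov3-voc.cfg, anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326) were obtained by the author by clustering the COCO dataset. The VOC dataset contains 20 object classes, ranging from large ones such as bicycle and bus to small ones such as bird and cat, so object sizes vary widely. When these anchors are used to detect objects on your own dataset, some of them are a poor fit. Clustering the anchors on your own dataset can therefore improve the bounding-box detection rate.
Joseph Redmon's paper reports an avg IoU of 67.2. The author of the original project verified that with k=9, over multiple runs on the VOC 2007 dataset, the avg IoU comes out at 67.13, which is nearly the same.
Code
In example.py, the data returned by load_dataset looks like this: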
data = load_dataset(ANNOTATIONS_PATH)
data:
[[0.41643059 0.262 ]
[0.97450425 0.972 ]
[0.20298507 0.202 ]
...
[0.498 0.992 ]
[0.284 0.424 ]
[0.99465241 0.994 ]]
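Explanation (where [0.41643059 0.262 ] comes from): [0.41643059 0.262 ] corresponds to the first object in 000001.jpg.
In the original annotation data, the image 000001.jpg is 353*500 (width*height), and the first object in it has this position: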
<bndbox>
<xmin>48</xmin>
<ymin>240</ymin>
<xmax>195</xmax>
<ymax>371</ymax>
</bndbox>
So the normalized width is (195-48) / 353 = 0.4164305949008499 and the normalized height is (371-240) / 500 = 0.262.
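To make the arithmetic above concrete, here is a minimal sketch of how one annotation can be turned into a normalized [width, height] row. It is not the project's load_dataset. The helper name normalized_sizes and the trimmed XML string are assumptions made up for illustration, modeled on the 000001.jpg values quoted above.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down annotation modeled on the 000001.jpg example.
SAMPLE_XML = """
<annotation>
  <size><width>353</width><height>500</height></size>
  <object>
    <bndbox>
      <xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax>
    </bndbox>
  </object>
</annotation>
"""


def normalized_sizes(xml_text):
    """Return one [width, height] pair per object, divided by the image size."""
    root = ET.fromstring(xml_text.strip())
    img_w = int(root.findtext("./size/width"))
    img_h = int(root.findtext("./size/height"))
    sizes = []
    for obj in root.iter("object"):
        xmin = int(obj.findtext("bndbox/xmin"))
        ymin = int(obj.findtext("bndbox/ymin"))
        xmax = int(obj.findtext("bndbox/xmax"))
        ymax = int(obj.findtext("bndbox/ymax"))
        # Only the box size matters for anchor clustering, not its position.
        sizes.append([(xmax - xmin) / img_w, (ymax - ymin) / img_h])
    return sizes


sizes = normalized_sizes(SAMPLE_XML)
print(sizes)
```

The position of the box is discarded on purpose: anchors only describe shapes, so each object is reduced to its width and height relative to the image.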
kmeans.py
import numpy as np


def iou(box, clusters):
    """
    Calculates the Intersection over Union (IoU) between a box and k clusters.
    :param box: tuple or array, shifted to the origin (i. e. width and height)
        zul: the box is moved to the origin, so its width and height are
        measured from the origin. See the explanation below for details.
    :param clusters: numpy array of shape (k, 2) where k is the number of clusters
    :return: numpy array of shape (k,) where k is the number of clusters
    """
    # Width and height of the overlap between the box and each cluster
    x = np.minimum(clusters[:, 0], box[0])
    y = np.minimum(clusters[:, 1], box[1])
    if np.count_nonzero(x == 0) > 0 or np.count_nonzero(y == 0) > 0:
        raise ValueError("Box has no area")

    intersection = x * y
    box_area = box[0] * box[1]
    cluster_area = clusters[:, 0] * clusters[:, 1]

    iou_ = intersection / (box_area + cluster_area - intersection)

    return iou_


def avg_iou(boxes, clusters):
    """
    Calculates the average Intersection over Union (IoU) between a numpy array of boxes and k clusters.
    :param boxes: numpy array of shape (r, 2), where r is the number of rows
    :param clusters: numpy array of shape (k, 2) where k is the number of clusters
    :return: average IoU as a single float
    """
    # For every box, keep the IoU with its best-matching cluster, then average
    return np.mean([np.max(iou(boxes[i], clusters)) for i in range(boxes.shape[0])])


def translate_boxes(boxes):
    """
    Translates all the boxes to the origin.
    :param boxes: numpy array of shape (r, 4)
    :return: numpy array of shape (r, 2)
    """
    new_boxes = boxes.copy()
    for row in range(new_boxes.shape[0]):
        new_boxes[row][2] = np.abs(new_boxes[row][2] - new_boxes[row][0])
        new_boxes[row][3] = np.abs(new_boxes[row][3] - new_boxes[row][1])
    # Drop the first two columns so only width and height remain
    return np.delete(new_boxes, [0, 1], axis=1)


def kmeans(boxes, k, dist=np.median):
    """
    Calculates k-means clustering with the Intersection over Union (IoU) metric.
    (~~? unresolved: how k-means clustering is computed with the IoU metric~~)
    :param boxes: numpy array of shape (r, 2), where r is the number of rows
    :param k: number of clusters
    :param dist: distance function (~~? what does the distance function represent~~)
    :return: numpy array of shape (k, 2)
    """
    rows = boxes.shape[0]

    distances = np.empty((rows, k))
    last_clusters = np.zeros((rows,))

    np.random.seed()

    # Pick k distinct boxes as the initial cluster centers
    clusters = boxes[np.random.choice(rows, k, replace=False)]

    while True:
        # Distance of every box to every cluster is 1 - IoU
        for row in range(rows):
            distances[row] = 1 - iou(boxes[row], clusters)

        nearest_clusters = np.argmin(distances, axis=1)

        # Stop once the assignments no longer change
        if (last_clusters == nearest_clusters).all():
            break

        # Recompute each center from its assigned boxes (median by default)
        for cluster in range(k):
            clusters[cluster] = dist(boxes[nearest_clusters == cluster], axis=0)

        last_clusters = nearest_clusters

    return clusters
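Explanation of the intersection computation
x = np.minimum(clusters[:, 0], box[0])
y = np.minimum(clusters[:, 1], box[1])
intersection = x * y  # compute the intersection
Because the box and every cluster are shifted to the origin, they all share one corner. The overlap is therefore itself a rectangle whose width is the smaller of the two widths and whose height is the smaller of the two heights.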
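A small worked example may help here. The sketch below repeats the same min-based intersection on hand-picked numbers. The function name iou_at_origin and the sample values are made up for illustration, and the zero-area check from kmeans.py is left out to keep it short.

```python
import numpy as np


def iou_at_origin(box, clusters):
    """IoU between one (w, h) box and k (w, h) clusters that share a corner at the origin."""
    inter_w = np.minimum(clusters[:, 0], box[0])  # overlap width per cluster
    inter_h = np.minimum(clusters[:, 1], box[1])  # overlap height per cluster
    intersection = inter_w * inter_h
    union = box[0] * box[1] + clusters[:, 0] * clusters[:, 1] - intersection
    return intersection / union


# Hypothetical sample values, chosen so the result is easy to check by hand.
box = np.array([0.4, 0.2])
clusters = np.array([
    [0.2, 0.2],  # fully inside the box: 0.04 / 0.08
    [0.4, 0.4],  # fully contains the box: 0.08 / 0.16
    [0.4, 0.2],  # identical to the box
])
ious = iou_at_origin(box, clusters)
print(ious)  # approximately [0.5, 0.5, 1.0]
```

Note that a cluster half the size of the box and a cluster twice its size get the same IoU, which is why 1 - IoU works as a scale-aware distance between box shapes.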
My experiments
Dataset: VOC 2007
Running zul_example.py
Accuracy: 67.32%
Boxes:
[[0.06 0.184 ]
[0.038 0.068 ]
[0.564 0.424 ]
[0.81333333 0.81791045]
[0.184 0.17866667]
[0.114 0.32 ]
[0.344 0.706 ]
[0.098 0.09866667]
[0.24 0.38666667]]
Ratios:
[0.33, 0.36, 0.49, 0.56, 0.62, 0.99, 0.99, 1.03, 1.33]
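Explanation: Ratios in the output are the width-to-height ratios of the boxes, rounded to two decimals and sorted.
For example: 0.06 / 0.184 = 0.32608695652173914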
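As a quick check, the Ratios line can be reproduced from the Boxes printed above. This is a sketch of the computation under the assumption that each ratio is width / height rounded to two decimals and then sorted. The variable names are my own.

```python
import numpy as np

# The nine cluster boxes (width, height) copied from the output above.
boxes = np.array([
    [0.06, 0.184],
    [0.038, 0.068],
    [0.564, 0.424],
    [0.81333333, 0.81791045],
    [0.184, 0.17866667],
    [0.114, 0.32],
    [0.344, 0.706],
    [0.098, 0.09866667],
    [0.24, 0.38666667],
])

# width / height for each box, rounded to two decimals, in ascending order
ratios = sorted(np.around(boxes[:, 0] / boxes[:, 1], decimals=2).tolist())
print(ratios)
```

Ratios below 1 describe boxes that are taller than they are wide, and ratios above 1 describe boxes that are wider than they are tall.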
Using my own data
Change the path of the Annotations label folder.
Result:
[Sensitive content redacted]
Testing how the original paper's clusters perform
clusters = [[10,13],[16,30],[33,23],[30,61],[62,45],[59,119],[116,90],[156,198],[373,326]]
out = np.array(clusters) / 416.0
Result:
[Sensitive content redacted]
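The check above can be sketched roughly as follows: scale the paper's anchors by the 416 input size and measure the average best IoU against the normalized boxes. This is a self-contained approximation, not zul_example.py itself. The function avg_best_iou is a hypothetical vectorized stand-in for avg_iou from kmeans.py, and the three boxes are just the first rows of the load_dataset listing shown earlier, used in place of a full dataset.

```python
import numpy as np


def avg_best_iou(boxes, anchors):
    """Mean over boxes of the best IoU with any anchor (all shapes are (w, h) at the origin)."""
    # Pairwise overlap: result has shape (number of boxes, number of anchors)
    inter = (np.minimum(boxes[:, None, 0], anchors[None, :, 0])
             * np.minimum(boxes[:, None, 1], anchors[None, :, 1]))
    box_area = (boxes[:, 0] * boxes[:, 1])[:, None]
    anchor_area = (anchors[:, 0] * anchors[:, 1])[None, :]
    iou = inter / (box_area + anchor_area - inter)
    return iou.max(axis=1).mean()


clusters = [[10, 13], [16, 30], [33, 23], [30, 61], [62, 45],
            [59, 119], [116, 90], [156, 198], [373, 326]]
anchors = np.array(clusters) / 416.0  # normalize to the same 0..1 scale as the data

# Stand-in for load_dataset(ANNOTATIONS_PATH): first rows of the listing above.
boxes = np.array([
    [0.41643059, 0.262],
    [0.97450425, 0.972],
    [0.20298507, 0.202],
])

score = avg_best_iou(boxes, anchors)
print("Accuracy: {:.2f}%".format(score * 100))
```

With a real dataset, the same number can be compared directly against the Accuracy printed for the anchors clustered on that dataset.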