Faster R-CNN network structure explained
Online visualization tool for Caffe networks (may require a VPN or proxy to access): http://ethereon.github.io/netscope/#/editor
Layer name + shape: data (1, 3, 600, 989)
Layer name + shape: im_info (1, 3)
Layer name + shape: conv1_1 (1, 64, 600, 989)
Layer name + shape: conv1_2 (1, 64, 600, 989)
Layer name + shape: pool1 (1, 64, 300, 495)
Layer name + shape: conv2_1 (1, 128, 300, 495)
Layer name + shape: conv2_2 (1, 128, 300, 495)
Layer name + shape: pool2 (1, 128, 150, 248)
Layer name + shape: conv3_1 (1, 256, 150, 248)
Layer name + shape: conv3_2 (1, 256, 150, 248)
Layer name + shape: conv3_3 (1, 256, 150, 248)
Layer name + shape: pool3 (1, 256, 75, 124)
Layer name + shape: conv4_1 (1, 512, 75, 124)
Layer name + shape: conv4_2 (1, 512, 75, 124)
Layer name + shape: conv4_3 (1, 512, 75, 124)
Layer name + shape: pool4 (1, 512, 38, 62)
Layer name + shape: conv5_1 (1, 512, 38, 62)
Layer name + shape: conv5_2 (1, 512, 38, 62)
Layer name + shape: conv5_3 (1, 512, 38, 62)
Layer name + shape: conv5_3_relu5_3_0_split_0 (1, 512, 38, 62)
Layer name + shape: conv5_3_relu5_3_0_split_1 (1, 512, 38, 62)
Layer name + shape: rpn/output (1, 512, 38, 62)
Layer name + shape: rpn/output_rpn_relu/3x3_0_split_0 (1, 512, 38, 62)
Layer name + shape: rpn/output_rpn_relu/3x3_0_split_1 (1, 512, 38, 62)
Layer name + shape: rpn_cls_score (1, 18, 38, 62)
Layer name + shape: rpn_bbox_pred (1, 36, 38, 62)
Layer name + shape: rpn_cls_score_reshape (1, 2, 342, 62)
Layer name + shape: rpn_cls_prob (1, 2, 342, 62)
Layer name + shape: rpn_cls_prob_reshape (1, 18, 38, 62)
Layer name + shape: rois (300, 5)
Layer name + shape: pool5 (300, 512, 7, 7)
Layer name + shape: fc6 (300, 4096)
Layer name + shape: fc7 (300, 4096)
Layer name + shape: fc7_relu7_0_split_0 (300, 4096)
Layer name + shape: fc7_relu7_0_split_1 (300, 4096)
Layer name + shape: cls_score (300, 21)
Layer name + shape: bbox_pred (300, 84)
Layer name + shape: cls_prob (300, 21)
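The spatial sizes in the listing above can be reproduced by hand. The sketch below assumes the usual VGG16 settings (3x3 convolutions with pad 1 and stride 1, 2x2 max pooling with stride 2) and Caffe's convention of rounding pooling output sizes up. The helper names `conv_out` and `pool_out` are made up for illustration.

```python
import math

def conv_out(size, kernel=3, stride=1, pad=1):
    # Convolution output size is rounded down; 3x3 with pad 1 keeps the size.
    return (size + 2 * pad - kernel) // stride + 1

def pool_out(size, kernel=2, stride=2, pad=0):
    # Caffe pooling rounds the output size up, which is why 989 -> 495.
    return int(math.ceil((size + 2 * pad - kernel) / stride)) + 1

h, w = 600, 989          # spatial size of the `data` blob
sizes = []
for _ in range(4):       # pool1 .. pool4 (the fifth pooling is ROI pooling)
    h, w = conv_out(h), conv_out(w)
    h, w = pool_out(h), pool_out(w)
    sizes.append((h, w))

print(sizes)
```

The last entry is the 38 x 62 grid seen by conv5_3 and the RPN, i.e. roughly 1/16 of the input resolution.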
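Layer name + weights W and bias b: conv1_1 (64, 3, 3, 3) (64,)
Layer name + weights W and bias b: conv1_2 (64, 64, 3, 3) (64,)
Layer name + weights W and bias b: conv2_1 (128, 64, 3, 3) (128,)
Layer name + weights W and bias b: conv2_2 (128, 128, 3, 3) (128,)
Layer name + weights W and bias b: conv3_1 (256, 128, 3, 3) (256,)
Layer name + weights W and bias b: conv3_2 (256, 256, 3, 3) (256,)
Layer name + weights W and bias b: conv3_3 (256, 256, 3, 3) (256,)
Layer name + weights W and bias b: conv4_1 (512, 256, 3, 3) (512,)
Layer name + weights W and bias b: conv4_2 (512, 512, 3, 3) (512,)
Layer name + weights W and bias b: conv4_3 (512, 512, 3, 3) (512,)
Layer name + weights W and bias b: conv5_1 (512, 512, 3, 3) (512,)
Layer name + weights W and bias b: conv5_2 (512, 512, 3, 3) (512,)
Layer name + weights W and bias b: conv5_3 (512, 512, 3, 3) (512,)
Layer name + weights W and bias b: rpn_conv/3x3 (512, 512, 3, 3) (512,)
Layer name + weights W and bias b: rpn_cls_score (18, 512, 1, 1) (18,)
Layer name + weights W and bias b: rpn_bbox_pred (36, 512, 1, 1) (36,)
Layer name + weights W and bias b: fc6 (4096, 25088) (4096,)
Layer name + weights W and bias b: fc7 (4096, 4096) (4096,)
Layer name + weights W and bias b: cls_score (21, 4096) (21,)
Layer name + weights W and bias b: bbox_pred (84, 4096) (84,)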
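As a quick sanity check on the W and b shapes above, the following sketch counts parameters per layer using only the shapes from the listing. The dictionary `weight_shapes` and the helper `param_count` are illustrative names, not part of py-faster-rcnn. It also shows where the 25088 input width of fc6 comes from: it is the flattened pool5 output, 512 x 7 x 7.

```python
from math import prod

# Weight shapes copied from the listing; the bias length equals shape[0].
weight_shapes = {
    "conv1_1": (64, 3, 3, 3), "conv1_2": (64, 64, 3, 3),
    "conv2_1": (128, 64, 3, 3), "conv2_2": (128, 128, 3, 3),
    "conv3_1": (256, 128, 3, 3), "conv3_2": (256, 256, 3, 3),
    "conv3_3": (256, 256, 3, 3),
    "conv4_1": (512, 256, 3, 3), "conv4_2": (512, 512, 3, 3),
    "conv4_3": (512, 512, 3, 3),
    "conv5_1": (512, 512, 3, 3), "conv5_2": (512, 512, 3, 3),
    "conv5_3": (512, 512, 3, 3),
    "rpn_conv/3x3": (512, 512, 3, 3),
    "rpn_cls_score": (18, 512, 1, 1), "rpn_bbox_pred": (36, 512, 1, 1),
    "fc6": (4096, 25088), "fc7": (4096, 4096),
    "cls_score": (21, 4096), "bbox_pred": (84, 4096),
}

def param_count(shape):
    # Number of weights plus one bias per output channel / unit.
    return prod(shape) + shape[0]

counts = {name: param_count(s) for name, s in weight_shapes.items()}
total = sum(counts.values())
fc6_inputs = 512 * 7 * 7   # flattened pool5 blob for one ROI

for name, n in counts.items():
    print(f"{name:15s} {n:>12,d}")
print(f"{'total':15s} {total:>12,d}")
```

Most of the parameters sit in fc6, which is typical for VGG16-based detectors.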
Forward pass result blobs_out:
['bbox_pred', 'cls_prob'] (300, 84) (300, 21)
Analysis of the key layers:
layer {
name: 'proposal'
type: 'Python'
bottom: 'rpn_cls_prob_reshape'
bottom: 'rpn_bbox_pred'
bottom: 'im_info'
top: 'rois'
python_param {
module: 'rpn.proposal_layer'
layer: 'ProposalLayer'
param_str: "'feat_stride': 16"
}
}
Inputs:
rpn_cls_prob_reshape (1, 18, 38, 62)
rpn_bbox_pred (1, 36, 38, 62)
im_info (1, 3)
Output:
rois (300, 5)
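The shape listing shows rpn_cls_score going from (1, 18, 38, 62) to (1, 2, 342, 62) and back to (1, 18, 38, 62). The NumPy sketch below imitates that reshape, softmax, reshape sequence on random data, so that the two-way softmax runs over background versus foreground for each of the 9 anchors at each location. It assumes, as in py-faster-rcnn's proposal layer, that the first 9 channels are background and the last 9 are foreground; the random scores are only placeholders.

```python
import numpy as np

A, H, W = 9, 38, 62                      # anchors per location, feature map size
rng = np.random.default_rng(0)
rpn_cls_score = rng.standard_normal((1, 2 * A, H, W))   # placeholder scores

# (1, 18, 38, 62) -> (1, 2, 9*38, 62): axis 1 now indexes bg/fg.
rpn_cls_score_reshape = rpn_cls_score.reshape(1, 2, A * H, W)

# Numerically stable softmax over the bg/fg axis.
e = np.exp(rpn_cls_score_reshape - rpn_cls_score_reshape.max(axis=1, keepdims=True))
rpn_cls_prob = e / e.sum(axis=1, keepdims=True)

# Back to (1, 18, 38, 62) so it lines up with rpn_bbox_pred (1, 36, 38, 62).
rpn_cls_prob_reshape = rpn_cls_prob.reshape(1, 2 * A, H, W)

fg_scores = rpn_cls_prob_reshape[:, A:, :, :]   # foreground probability per anchor
num_anchors = fg_scores.size                    # 9 * 38 * 62 candidate boxes
print(rpn_cls_score_reshape.shape, rpn_cls_prob_reshape.shape, num_anchors)
```

The proposal layer then keeps only a small subset of these candidates, which is why rois has just 300 rows of 5 values (batch index plus 4 box coordinates).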
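class ProposalLayer(caffe.Layer): internally generates 9 kinds of anchor boxes
generate_anchors(base_size=16, ratios=[0.5, 1, 2],
scales=np.array([8, 16, 32]))
ratios is the height-to-width ratio H/W (the aspect ratio):
# ws: the width is the square root of the ratio-adjusted area (size / ratio), since the box is not necessarily square
ws = np.round(np.sqrt(size_ratios))
# height = width × ratio; ratios is the height-to-width ratio; np.round rounds to the nearest integer
hs = np.round(ws * ratios)
scales are the scaling factors applied to the width and height:
ws = w * scales
hs = h * scales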
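To make the ratio and scale steps concrete, here is a compact sketch that combines them into one function. It is not the original generate_anchors code, only a re-derivation under the same assumptions as the fragments above: a 16 x 16 base box with corners [0, 0, 15, 15], widths and heights measured inclusively (x2 - x1 + 1), and every ratio box scaled by every scale. The name `make_anchors` is hypothetical.

```python
import numpy as np

def make_anchors(base_size=16, ratios=(0.5, 1, 2), scales=(8, 16, 32)):
    ratios = np.asarray(ratios, dtype=np.float64)
    scales = np.asarray(scales, dtype=np.float64)
    ctr = 0.5 * (base_size - 1)          # center of the base box [0, 0, 15, 15]
    size = base_size * base_size         # area of the base box

    # Ratio step: keep the area roughly fixed, change H/W.
    ws = np.round(np.sqrt(size / ratios))
    hs = np.round(ws * ratios)

    # Scale step: every ratio box is multiplied by every scale.
    ws = (ws[:, None] * scales[None, :]).ravel()
    hs = (hs[:, None] * scales[None, :]).ravel()

    # Convert (center, width, height) to (x1, y1, x2, y2).
    return np.stack([ctr - 0.5 * (ws - 1), ctr - 0.5 * (hs - 1),
                     ctr + 0.5 * (ws - 1), ctr + 0.5 * (hs - 1)], axis=1)

anchors = make_anchors()
print(anchors.shape)
print(anchors)
```

With 3 ratios and 3 scales this gives the 9 anchors per feature map location, which matches the 18 (= 2 x 9) and 36 (= 4 x 9) channels of rpn_cls_score and rpn_bbox_pred.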
layer {
name: "roi_pool5"
type: "ROIPooling"
bottom: "conv5_3"
bottom: "rois"
top: "pool5"
roi_pooling_param {
pooled_w: 7
pooled_h: 7
spatial_scale: 0.0625 # 1/16
}
}
Inputs:
conv5_3 (1, 512, 38, 62)
rois (300, 5)
Output:
pool5 (300, 512, 7, 7)
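The following is a minimal NumPy sketch of what ROI pooling does for a single ROI, assuming max pooling over a 7 x 7 grid of bins and the spatial_scale of 1/16 from the layer definition. It is a simplified illustration rather than Caffe's implementation: the function name `roi_pool` is made up, bin boundaries are computed with integer arithmetic, and the feature map is random placeholder data.

```python
import numpy as np

def roi_pool(feat, roi, pooled_h=7, pooled_w=7, spatial_scale=0.0625):
    """feat: (C, H, W) feature map; roi: (batch_idx, x1, y1, x2, y2) in image pixels."""
    C, H, W = feat.shape
    # Project the ROI onto the feature map (round half up for non-negative values).
    x1, y1, x2, y2 = [int(np.floor(v * spatial_scale + 0.5)) for v in roi[1:]]
    roi_h = max(y2 - y1 + 1, 1)
    roi_w = max(x2 - x1 + 1, 1)

    out = np.zeros((C, pooled_h, pooled_w), dtype=feat.dtype)
    for ph in range(pooled_h):
        # floor(ph * roi_h / pooled_h) and ceil((ph + 1) * roi_h / pooled_h)
        hs = min(max(y1 + (ph * roi_h) // pooled_h, 0), H)
        he = min(max(y1 - (-(ph + 1) * roi_h) // pooled_h, 0), H)
        for pw in range(pooled_w):
            ws = min(max(x1 + (pw * roi_w) // pooled_w, 0), W)
            we = min(max(x1 - (-(pw + 1) * roi_w) // pooled_w, 0), W)
            if he > hs and we > ws:      # empty bins stay at 0
                out[:, ph, pw] = feat[:, hs:he, ws:we].max(axis=(1, 2))
    return out

rng = np.random.default_rng(0)
conv5_3 = rng.standard_normal((512, 38, 62))     # placeholder for one image
roi = np.array([0, 0, 0, 208, 208], dtype=np.float64)
pooled = roi_pool(conv5_3, roi)
print(pooled.shape)
```

Running this for each of the 300 rows in rois and stacking the results gives a blob of shape (300, 512, 7, 7), the same as pool5. The convolutional features are computed once per image and only this cheap pooling step is repeated per ROI.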
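ROI pooling summary:
(1) It is used for object detection tasks;
(2) It lets us reuse the feature map computed by the CNN;
(3) It can significantly speed up both training and testing;
(4) It allows the object detection system to be trained end to end.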