Paper Notes: R-FCN: Object Detection via Region-based Fully Convolutional Networks

Background

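ResNet works very well for classification, but it cannot be applied directly to detection. The authors attribute this to a conflict: classification favors translation invariance, whereas detection requires translation variance.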

We propose position-sensitive score maps to address a dilemma between translation-invariance in image classification and translation-variance in object detection.

Network Architecture


In short, R-FCN is an RPN plus a classification network, where the classification network has the following structure:
ResNet + position-sensitive score maps + position-sensitive RoI pooling

The $k^2$ position-sensitive scores then vote on the RoI. In this paper we simply vote by averaging the scores, producing a $(C+1)$-dimensional vector for each RoI: $r_c(\theta) = \sum_{i,j} r_c(i,j \mid \theta)$. Then we compute the softmax responses across categories: $s_c(\theta) = e^{r_c(\theta)} / \sum_{c'=0}^{C} e^{r_{c'}(\theta)}$. They are used for evaluating the cross-entropy loss during training and for ranking the RoIs during inference.
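The pooling, voting, and softmax steps quoted above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `ps_roi_pool_vote`, the channel layout, the floor/ceil bin boundaries, and the integer RoI coordinates are all choices made for this sketch. The vote divides by k^2 to follow the "averaging" wording, which only rescales the sum in the formula by a constant.

```python
import math


def ps_roi_pool_vote(score_maps, roi, k, num_classes):
    """Return (votes, probs) for a single RoI.

    score_maps: nested lists indexed [channel][y][x], with k*k*(C+1) channels.
    roi: (x0, y0, w, h) in integer feature-map coordinates, assumed to lie
         fully inside the feature map.
    Assumed channel layout (hypothetical): channel = (i*k + j)*(C+1) + c
    for bin row i, bin column j, class c (c = 0 is background).
    """
    x0, y0, w, h = roi
    n_cls = num_classes + 1  # C object classes plus background
    votes = [0.0] * n_cls
    for i in range(k):
        # Rows of bin row i: floor(i*h/k) up to ceil((i+1)*h/k), exclusive.
        y_start = y0 + (i * h) // k
        y_end = y0 - ((-(i + 1) * h) // k)
        for j in range(k):
            x_start = x0 + (j * w) // k
            x_end = x0 - ((-(j + 1) * w) // k)
            n_pixels = (y_end - y_start) * (x_end - x_start)
            for c in range(n_cls):
                # Each bin reads only from its own dedicated score map.
                fmap = score_maps[(i * k + j) * n_cls + c]
                total = sum(fmap[y][x]
                            for y in range(y_start, y_end)
                            for x in range(x_start, x_end))
                votes[c] += total / n_pixels  # r_c(i, j | theta)
    # Vote by averaging the k*k bin scores for each class.
    votes = [v / (k * k) for v in votes]
    # Softmax across the C+1 categories (shifted by the max for stability).
    m = max(votes)
    exps = [math.exp(v - m) for v in votes]
    z = sum(exps)
    probs = [e / z for e in exps]
    return votes, probs
```

Because every bin pools from a different channel, an RoI that is shifted off the object reads the wrong part of each score map and its vote drops. That is how the design recovers translation variance without any per-RoI learnable layers.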

Advantages

  1. All learnable weight layers are convolutional and are computed on the entire image; the per-RoI computational cost is negligible.

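  2. Accepts input images of arbitrary size
    The fully connected layers are removed. This is a real improvement: I have always felt that SPPNet and RoI pooling still introduce some error, because they compress the features.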

  3. The position-sensitive score maps implicitly do something similar to CRAFT: pooling is performed separately for each class, which improves accuracy.

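  4. The 3x3 voting mechanism adds robustness. Because each vote is a binary decision (yes or no) about a single object class rather than a classification over all classes, 3x3 is already sufficient.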
