Fast R-CNN

The Fast R-CNN method has several advantages:

  1. Higher detection quality (mAP) than R-CNN and SPPnet
  2. Training is single-stage, using a multi-task loss
  3. Training can update all network layers
  4. No disk storage is required for feature caching
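
For reference, the multi-task loss mentioned in item 2 can be written as below, following the formulation in the Fast R-CNN paper. Here p is the predicted class distribution for an RoI, u is its ground-truth class (0 for background), t^u is the predicted box offset for class u, v is the regression target, λ is a balancing weight, and [u ≥ 1] equals 1 for non-background RoIs and 0 otherwise. This is a compact restatement, not a full derivation:

```latex
L(p, u, t^{u}, v) = L_{\text{cls}}(p, u) + \lambda \, [u \ge 1] \, L_{\text{loc}}(t^{u}, v)

L_{\text{cls}}(p, u) = -\log p_{u}

L_{\text{loc}}(t^{u}, v) = \sum_{i \in \{x, y, w, h\}} \operatorname{smooth}_{L_1}\!\left(t^{u}_{i} - v_{i}\right)

\operatorname{smooth}_{L_1}(x) =
\begin{cases}
  0.5\,x^{2} & \text{if } |x| < 1 \\
  |x| - 0.5  & \text{otherwise}
\end{cases}
```

Because classification and box regression share one objective, a single backward pass can update all layers, which is what makes single-stage training possible.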

Architecture

[Figure: Fast R-CNN architecture]

Region of interest (RoI) pooling: description

The layer takes two inputs:

  • A fixed-size feature map obtained from a deep convolutional network with several convolution and max-pooling layers.
  • An N × 5 matrix representing a list of regions of interest, where N is the number of RoIs. The first column holds the image index and the remaining four hold the coordinates of the top-left and bottom-right corners of the region.

For every region of interest from the input list, it takes a section of the input feature map that corresponds to it and scales it to some pre-defined size (e.g., 7×7). The scaling is done by:

  1. Dividing the region proposal into roughly equal-sized sections, one per output cell (so a 7×7 output means 49 sections)
  2. Finding the largest value in each section
  3. Copying these max values to the output buffer
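
The three steps above can be sketched in NumPy as follows. This is a minimal illustration, not the implementation used in Fast R-CNN: the function name `roi_max_pool`, the (num_images, C, H, W) layout, and the assumption that RoI corners are inclusive integers already expressed in feature-map coordinates are all choices made here for the example. When a region does not divide evenly, the sections overlap slightly so that none is empty.

```python
import numpy as np


def roi_max_pool(feature_map, rois, output_size=(7, 7)):
    """Max-pool every RoI to a fixed output size.

    feature_map: array of shape (num_images, C, H, W).
    rois: (N, 5) array of [image_index, x1, y1, x2, y2]; corners are
          inclusive and given in feature-map coordinates (assumption).
    Returns an array of shape (N, C, out_h, out_w).
    """
    rois = np.asarray(rois, dtype=int)
    out_h, out_w = output_size
    num_channels = feature_map.shape[1]
    pooled = np.zeros((len(rois), num_channels, out_h, out_w),
                      dtype=feature_map.dtype)

    for n, (img, x1, y1, x2, y2) in enumerate(rois):
        # Take the section of the feature map that corresponds to the RoI.
        region = feature_map[img, :, y1:y2 + 1, x1:x2 + 1]
        h, w = region.shape[1:]

        # Step 1: split the region into out_h x out_w sections.
        y_edges = np.linspace(0, h, out_h + 1)
        x_edges = np.linspace(0, w, out_w + 1)

        for i in range(out_h):
            ys = int(np.floor(y_edges[i]))
            ye = max(int(np.ceil(y_edges[i + 1])), ys + 1)  # never empty
            for j in range(out_w):
                xs = int(np.floor(x_edges[j]))
                xe = max(int(np.ceil(x_edges[j + 1])), xs + 1)
                # Steps 2 and 3: take the max of the section, per channel,
                # and copy it to the output buffer.
                pooled[n, :, i, j] = region[:, ys:ye, xs:xe].max(axis=(1, 2))
    return pooled


# One image, one channel, 4x4 feature map holding the values 0..15.
fmap = np.arange(16, dtype=np.float32).reshape(1, 1, 4, 4)
rois = np.array([[0, 0, 0, 3, 3]])  # the whole map
print(roi_max_pool(fmap, rois, output_size=(2, 2))[0, 0])
# [[ 5.  7.]
#  [13. 15.]]
```

Whatever the size of the input region, the output has the same fixed shape, which is what lets the following fully connected layers accept proposals of any size.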

refs

  1. https://deepsense.ai/region-of-interest-pooling-explained/
  2. https://leonardoaraujosantos.gitbooks.io/artificial-inteligence/content/object_localization_and_detection.html
  3. https://arxiv.org/pdf/1504.08083.pdf
