Rich feature hierarchies for accurate object detection and semantic segmentation
R-CNN: Regions with CNN features
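Problems addressed: localizing objects with a deep network; training a high-capacity model with only a small amount of annotated detection data
Ways to frame object localization: as a regression problem; with a sliding-window approach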
Object detection with R-CNN
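The object detection system has three modules: generation of category-independent region proposals; a CNN that extracts a fixed-length feature vector from each region; a set of class-specific linear SVMs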
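The way the three modules chain together can be sketched as follows. This is a minimal illustration, not the authors' implementation: `rcnn_scores`, `propose`, `extract`, `svm_w`, and `svm_b` are hypothetical names, and the proposal generator and feature extractor are passed in as plain callables so the sketch stays self-contained.

```python
import numpy as np

def rcnn_scores(image, propose, extract, svm_w, svm_b):
    """Score every region proposal against every class.

    propose(image)      -> iterable of N boxes (x1, y1, x2, y2)   [hypothetical callable]
    extract(image, box) -> 1-D feature vector of length D          [hypothetical callable]
    svm_w: (K, D) weights, svm_b: (K,) biases, one linear SVM per class.
    """
    # Module 1: category-independent region proposals.
    boxes = list(propose(image))
    # Module 2: one fixed-length feature vector per region.
    feats = np.stack([extract(image, box) for box in boxes])  # shape (N, D)
    # Module 3: class-specific linear SVMs, applied as one matrix product.
    scores = feats @ svm_w.T + svm_b                          # shape (N, K)
    return boxes, scores
```

With 4096-dimensional features, `svm_w` would have shape (K, 4096), so scoring all proposals in an image reduces to a single matrix multiplication.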
Module design
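Region proposals: generated with selective search
Feature extraction: AlexNet (5 convolutional layers, 2 fully connected layers) extracts a 4096-dimensional feature vector from each region.
- All regions are brought to a common size of 227 x 227: regardless of the candidate region's size or aspect ratio, all pixels in a tight bounding box around it are warped to that size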
Features are computed by forward propagating a
mean-subtracted 227 x 227 RGB image through five convolutional
layers and two fully connected layers.
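The warping and mean-subtraction step can be illustrated with a small NumPy sketch. It makes several assumptions that are not in the notes: nearest-neighbour sampling stands in for whatever interpolation the original code uses, no extra context is added around the box, and `warp_region`, `preprocess`, and `mean_pixel` are hypothetical names.

```python
import numpy as np

def warp_region(image, box, out_size=227):
    """Crop box (x1, y1, x2, y2) from an H x W x 3 image and stretch it to
    out_size x out_size, ignoring the original aspect ratio."""
    x1, y1, x2, y2 = box
    crop = image[y1:y2, x1:x2]
    h, w = crop.shape[:2]
    # Nearest-neighbour sampling (an assumption): map each output row and
    # column back to a source row and column inside the crop.
    rows = np.arange(out_size) * h // out_size
    cols = np.arange(out_size) * w // out_size
    return crop[rows[:, None], cols[None, :]]

def preprocess(image, box, mean_pixel):
    """Warp the region, then subtract a per-channel mean (assumed to be given)."""
    warped = warp_region(image, box).astype(np.float32)
    return warped - np.asarray(mean_pixel, dtype=np.float32)
```

The result is a 227 x 227 x 3 float array, which is the input shape the network expects before the five convolutional and two fully connected layers are applied.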
Test-time detection
- Run selective search on the test image to extract around 2000 region proposals
- - code