Libra R-CNN: Towards Balanced Learning for Object Detection

Paper information
Title: Libra R-CNN: Towards Balanced Learning for Object Detection
Authors: Jiangmiao Pang, Kai Chen, Jianping Shi, Huajun Feng, Wanli Ouyang, Dahua Lin
Affiliations: Zhejiang University, The Chinese University of Hong Kong, SenseTime Research, The University of Sydney
Venue: CVPR 2019
Date: 2019/04/04
Link: https://arxiv.org/abs/1904.02701
Code: https://github.com/open-mmlab/mmdetection (official code), https://github.com/OceanPang/Libra_R-CNN

Background / problem
Most current detectors (one-stage and two-stage) follow the same training pattern:
1 sampling regions
2 extracting features from those regions
3 jointly recognizing categories and refining locations under a multi-task objective function
This training paradigm has three key aspects:
1 whether the selected region samples are representative
2 whether the extracted visual features are fully utilized
3 whether the designed objective function is optimal

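As the figure shows, three kinds of imbalance affect performance during training, one for each of the three key aspects.

Sample level
Motivation: hard examples are effective at improving detector performance, but randomly sampled examples are usually dominated by easy ones.
-------- Existing methods and their problems:
OHEM: selects hard samples by confidence, but it is sensitive to noisy labels and incurs considerable memory and computation costs.
Focal loss: addresses the foreground-background class imbalance in one-stage detectors. It only suits one-stage detectors and brings little gain to R-CNN, because most easy negatives are already filtered out by the two-stage procedure.

Feature level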
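Motivation: high-resolution low-level feature maps give accurate regression but are hard to classify from, while high-level semantic features classify well but are only suited to detecting large objects. Using the descriptive information of shallow layers together with the semantic information of deep layers helps object detection, so what is the best way to integrate the two?
-------- Existing methods and their problems:
FPN and PANet: both integrate features in a sequential manner, so the integrated features focus more on adjacent levels and less on the others. The semantic information in non-adjacent levels is diluted once at every fusion step.

Objective level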
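Motivation:
① During training, the small gradients produced by easy samples may be overwhelmed by the large gradients produced by hard examples, which hurts model performance.
② Detection consists of two tasks, classification and localization. If they are not properly balanced, one of them may suffer, which hurts overall performance.

Method / contributions
Three components are proposed, each reducing the imbalance at one of the three levels above:
IoU-balanced sampling: mines hard examples according to the IoU between a sample and the GT it is assigned to.
Balanced feature pyramid: a variant of the feature pyramid that aggregates the features of all levels at once (not just adjacent levels) and uses the result to strengthen each level.
Balanced L1 loss: outlier gradients are large and inlier gradients are small, so the large gradients produced by outliers (samples with regression error greater than or equal to 1) are clipped (to a maximum gradient of 1), and the gradients of inliers (samples with regression error below 1, i.e. accurate samples) are promoted, so that classification, overall localization and accurate localization are balanced.

Results
On COCO, AP is 2.5 points higher than FPN Faster R-CNN and 2.0 points higher than RetinaNet.
Built on FPN Faster R-CNN with the 1× schedule from Detectron, AP is 38.7 with a ResNet-50 backbone and 43.0 with a ResNeXt-101-64x4d backbone.

Algorithm details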

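IoU-balanced Sampling
Hard negatives: in object detection the GT bounding boxes are known and the algorithm generates a set of proposals, some overlapping a GT box heavily and some only slightly. Proposals whose overlap (IoU) with a GT box exceeds a threshold (usually 0.5) are treated as positive examples, those below the threshold as negative examples, and both are fed to the network for training. One problem is that positive examples are far fewer than negative examples, so the resulting classifier is always limited and produces many false positives. The high-scoring false positives, for example a false detection with confidence 0.9, are the so-called hard negative examples.
Goal: hard negative examples are the main problem, so we want to sample more of them.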
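The labelling rule described above can be sketched in a few lines of plain Python. This is only an illustration: the function names `iou` and `label_proposals`, the `(x1, y1, x2, y2)` box layout and the choice of `>=` at the threshold are assumptions, not something the post specifies.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_proposals(proposals, gt_boxes, pos_thr=0.5):
    """Label each proposal by its best IoU with any GT box (assumed rule)."""
    labels = []
    for p in proposals:
        best = max((iou(p, g) for g in gt_boxes), default=0.0)
        labels.append(("positive" if best >= pos_thr else "negative", best))
    return labels
```

The best IoU returned alongside each label is the quantity that IoU-balanced sampling later bins on.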
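Random sampling:

① The orange bars show that more than 60% of the hard negative examples have an overlap greater than 0.05 with their corresponding GT box.
② The blue bars show that only a little over 30% of the randomly sampled training samples have an overlap greater than 0.05 with their corresponding GT box.
③ The distribution of randomly sampled training samples differs from the true distribution of hard negative examples, so there is only one hard example among thousands of easy ones.
IoU-balanced sampling: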
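In the paper's formulation, drawing N negatives at random from M candidates selects each candidate with probability p = N / M. IoU-balanced sampling instead splits the IoU range into K bins and spreads the N samples evenly over them, so a candidate in bin k (which holds M_k candidates) is selected with probability p_k = (N / K) * (1 / M_k), for k in [0, K). A minimal sketch of that idea follows. The function name, the negative IoU range [0, 0.5) and the random top-up when a bin runs short are assumptions made for illustration.

```python
import random


def iou_balanced_sample(neg_ious, num_samples, num_bins=3, max_iou=0.5, rng=None):
    """Pick indices of negatives so that each IoU bin contributes equally.

    neg_ious: max IoU of each negative candidate with any GT box.
    Returns a list of selected candidate indices.
    """
    if rng is None:
        rng = random.Random(0)
    width = max_iou / num_bins
    bins = [[] for _ in range(num_bins)]
    for idx, value in enumerate(neg_ious):
        k = min(int(value / width), num_bins - 1)  # clamp the top edge
        bins[k].append(idx)

    per_bin = num_samples // num_bins
    chosen = []
    for members in bins:
        chosen.extend(rng.sample(members, min(per_bin, len(members))))

    # Assumed fallback: if some bins were short, fill up at random.
    if len(chosen) < num_samples:
        taken = set(chosen)
        rest = [i for i in range(len(neg_ious)) if i not in taken]
        extra = min(num_samples - len(chosen), len(rest))
        chosen.extend(rng.sample(rest, extra))
    return chosen
```

With 90 candidates near IoU 0.01 and only 6 each near 0.2 and 0.4, asking for 9 samples returns 3 from every bin, whereas uniform random sampling would draw almost all of them from the lowest bin.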

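As the green bars in Figure 3 show, IoU-balanced sampling brings the distribution of training samples close to the true distribution of hard negative examples. K defaults to 3, and experiments show that performance is not sensitive to K as long as negatives with higher IoU are more likely to be selected.
How positive examples are handled:
IoU-balanced sampling also applies to hard positive examples in principle, but in practice there are usually not enough sampling candidates to extend it to them. As an alternative, the paper samples an equal number of positive samples for each GT box.

Balanced Feature Pyramid
Core idea: aggregate the features of all levels at once and use the result to strengthen each level.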

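① Integrate:

1 First rescale the features of every level to the size of C4 using interpolation and max pooling.
2 Then take the average of the features over all levels as the integrated feature.
② Refine:
1 Refine the integrated feature with the embedded Gaussian non-local attention from the paper "Non-local Neural Networks".
2 Then use interpolation and max pooling to map the integrated feature back to the scale of each level, giving the same outputs as FPN.
3 Balanced Feature Pyramid is compatible with FPN and PAFPN (PANet).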
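The integrate and refine steps above can be sketched with NumPy. Several simplifications here are assumptions and not part of the original: square feature maps whose sizes differ by integer factors, nearest-neighbour upsampling standing in for interpolation, identity embeddings in place of the learned 1x1 convolutions of the non-local block, and a residual add as the way each level is "strengthened". All function names are illustrative.

```python
import numpy as np


def max_pool(x, k):
    """Downsample a (C, H, W) map by factor k with k x k max pooling."""
    c, h, w = x.shape
    return x.reshape(c, h // k, k, w // k, k).max(axis=(2, 4))


def upsample(x, k):
    """Upsample a (C, H, W) map by factor k (nearest neighbour)."""
    return x.repeat(k, axis=1).repeat(k, axis=2)


def resize(x, size):
    """Resize a square (C, H, H) map to (C, size, size)."""
    h = x.shape[1]
    if h > size:
        return max_pool(x, h // size)
    if h < size:
        return upsample(x, size // h)
    return x


def non_local(x):
    """Embedded-Gaussian-style attention with identity embeddings (assumed)."""
    c, h, w = x.shape
    flat = x.reshape(c, h * w)                  # (C, N)
    logits = flat.T @ flat                      # (N, N) pairwise similarity
    logits = logits - logits.max(axis=1, keepdims=True)
    attn = np.exp(logits)
    attn = attn / attn.sum(axis=1, keepdims=True)  # softmax over positions
    out = flat @ attn.T                         # weighted sum per position
    return x + out.reshape(c, h, w)             # residual connection


def balanced_feature_pyramid(feats, ref_level=2):
    """feats: list of (C, H, H) maps, e.g. [C2, C3, C4, C5]; ref_level picks C4."""
    ref_size = feats[ref_level].shape[1]
    gathered = [resize(f, ref_size) for f in feats]      # integrate: rescale
    integrated = sum(gathered) / len(gathered)           # integrate: average
    refined = non_local(integrated)                      # refine
    # Map back to every level and strengthen the original features.
    return [f + resize(refined, f.shape[1]) for f in feats]
```

Because every output level is built from the same averaged feature, each level receives an equal contribution from all the others, which is the point of the non-sequential design.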

Balanced L1 Loss:
smooth L1 loss: smooth L1 loss is a localization loss that comes from Fast R-CNN:
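For reference, the smooth L1 loss of Fast R-CNN is 0.5 * x^2 for |x| < 1 and |x| - 0.5 otherwise, so its gradient magnitude is |x| for inliers and exactly 1 for outliers. A small sketch, with illustrative function names:

```python
def smooth_l1(x):
    """Smooth L1 loss for a scalar regression error x."""
    ax = abs(x)
    return 0.5 * ax * ax if ax < 1.0 else ax - 0.5


def smooth_l1_grad_magnitude(x):
    """Magnitude of d(smooth_l1)/dx: |x| for inliers, clipped to 1 for outliers."""
    ax = abs(x)
    return ax if ax < 1.0 else 1.0
```

An inlier with error 0.1 therefore contributes a gradient ten times smaller than any outlier.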

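The problem: outliers (which can be seen as hard examples) have large gradients, while inliers (which can be seen as easy examples) have small gradients.
In detail:
Samples with error greater than or equal to 1 are called outliers, and samples with error below 1 are called inliers.
Because the regression targets are unbounded, directly raising the weight of the localization loss makes the model more sensitive to outliers.
These outliers, which can be seen as hard examples, produce excessively large gradients that harm training.
Compared with outliers, inliers can be seen as easy examples and contribute little to the overall gradient.
More specifically, an inlier contributes on average only 30% of the gradient per sample compared with an outlier.
Effect of balanced L1 loss: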

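Core idea of balanced L1 loss:
Outlier gradients are large and inlier gradients are small. As in smooth L1, the large gradients produced by outliers (samples with error greater than or equal to 1) are clipped (to a maximum of 1, the dashed line in Figure 5(a)), while the gradients of inliers (samples with error below 1, i.e. accurate samples) are promoted, so that classification, overall localization and accurate localization are balanced.
Localization loss using balanced L1 loss: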

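It is the sum of the balanced L1 losses over x, y, w and h, i.e. the balanced L1 loss of x + the balanced L1 loss of y + ...
From Eq. (5) above, the gradient of the localization loss with respect to the model parameters is proportional to the gradient of the balanced L1 loss with respect to x (or y, w, h), i.e.: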

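Gradient design of balanced L1 loss:

The parameter α controls the gradient of inliers: the smaller α is, the larger the inlier gradient.
The parameter γ controls the upper bound of the regression error and helps balance the two tasks, classification and localization.
The parameter b handles the case x = 1: it makes the two branches of the loss take the same value there.
Definition of balanced L1 loss: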
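Following the paper's formulation, the designed gradient is α * ln(b|x| + 1) for |x| < 1 and γ otherwise. Integrating it gives the loss L_b(x) = (α / b) * (b|x| + 1) * ln(b|x| + 1) - α|x| for |x| < 1 and γ|x| + C otherwise, with the constraint α * ln(b + 1) = γ tying the three parameters together. The sketch below uses the paper's defaults α = 0.5 and γ = 1.5. The function names and the closed form C = γ / b - α (obtained by requiring continuity at |x| = 1) are my own working, so treat them as assumptions.

```python
import math


def balanced_l1(x, alpha=0.5, gamma=1.5):
    """Balanced L1 loss for a scalar regression error x."""
    b = math.exp(gamma / alpha) - 1.0   # from alpha * ln(b + 1) = gamma
    ax = abs(x)
    if ax < 1.0:
        return alpha / b * (b * ax + 1.0) * math.log(b * ax + 1.0) - alpha * ax
    # Constant chosen so both branches agree at |x| = 1 (assumed derivation).
    return gamma * ax + gamma / b - alpha


def balanced_l1_grad_magnitude(x, alpha=0.5, gamma=1.5):
    """Magnitude of d(balanced_l1)/dx: promoted for inliers, capped at gamma."""
    b = math.exp(gamma / alpha) - 1.0
    ax = abs(x)
    return alpha * math.log(b * ax + 1.0) if ax < 1.0 else gamma
```

At an error of 0.3 the smooth L1 gradient magnitude is 0.3, while this gradient is roughly 0.95, which shows how inliers are promoted without letting outliers grow beyond γ.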
Experiments:


Original post: https://outofmemory.cn/zaji/1498972.html
