Point-Level Region Contrast for Object Detection Pre-Training
Original paper: https://arxiv.org/abs/2202.04639
CVPR 2022 (Oral)
In this work we present point-level region contrast, a self-supervised pre-training approach for the task of object detection. This approach is motivated by the two key factors in detection: localization and recognition. While accurate localization favors models that operate at the pixel- or point-level, correct recognition typically relies on a more holistic, region-level view of objects. Incorporating this perspective in pre-training, our approach performs contrastive learning by directly sampling individual point pairs from different regions. Compared to an aggregated representation per region, our approach is more robust to the change in input region quality, and further enables us to implicitly improve initial region assignments via online knowledge distillation during training. Both advantages are important when dealing with imperfect regions encountered in the unsupervised setting. Experiments show point-level region contrast improves on state-of-the-art pre-training methods for object detection and segmentation across multiple tasks and datasets, and we provide extensive ablation studies and visualizations to aid understanding. Code will be made available.
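To make the core idea concrete, here is a minimal NumPy sketch of contrastive learning over individual point pairs, where points that fall in the same region across two views act as positives and points from different regions act as negatives. This is an illustration under assumptions, not the authors' implementation: the function name `point_region_contrast_loss`, the InfoNCE-style formulation, the `temperature` value, and the input layout are all hypothetical choices made for this example, and the online knowledge distillation step mentioned in the abstract is not covered.

```python
import numpy as np


def point_region_contrast_loss(feats_a, feats_b, regions_a, regions_b, temperature=0.2):
    """InfoNCE-style loss over individual point pairs (illustrative sketch).

    feats_a, feats_b: (N, D) features of N sampled points in two augmented views.
    regions_a, regions_b: (N,) integer region id assigned to each sampled point.
    All names and the temperature default are assumptions for this example.
    """
    # L2-normalize so the dot product is a cosine similarity.
    a = feats_a / np.linalg.norm(feats_a, axis=1, keepdims=True)
    b = feats_b / np.linalg.norm(feats_b, axis=1, keepdims=True)

    # Pairwise similarities between every point in view A and every point in view B.
    logits = a @ b.T / temperature

    # A pair is positive when both points carry the same region id.
    positive = regions_a[:, None] == regions_b[None, :]

    # Numerically stable log-softmax over all candidate points in view B.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Average the negative log-probability over each anchor's positives,
    # skipping anchors whose region has no point in the other view.
    pos_count = positive.sum(axis=1)
    valid = pos_count > 0
    if not valid.any():
        return 0.0
    per_anchor = -(log_prob * positive).sum(axis=1)[valid] / pos_count[valid]
    return float(per_anchor.mean())


# Small demo: six points, three regions, features clustered by region.
rng = np.random.default_rng(0)
regions = np.array([0, 0, 1, 1, 2, 2])
feats = np.eye(3)[regions] + 0.01 * rng.normal(size=(6, 3))
print(point_region_contrast_loss(feats, feats, regions, regions))
```

Because the loss is computed per point pair rather than on one pooled feature per region, a few mislabeled points only affect their own pairs, which is one way to read the abstract's claim of robustness to imperfect regions.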