Paper | Detecting Twenty-thousan

2023-12-13  与阳光共进早餐

Preface

1 Introduction

  1. Object detection (OD) has two subtasks: 1) finding boxes (localization); 2) naming the boxes (classification).

  2. Previous works couple these two subtasks.

  3. However, detection benchmarks are much smaller than classification benchmarks.

As the figure shows, LVIS (OD) has far fewer images and far fewer categories than ImageNet (CLS).

image.png

This paper:

Proposes Detic (a detector with image classes), which uses image-level supervision in addition to detection supervision.

illustration:

image.png

Standard OD: needs ground-truth boxes and labels.

Weakly supervised OD: assigns image-level labels to predicted boxes, which is error-prone.

This paper: assigns the image-level labels to the max-size proposal of each image.
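The max-size rule can be sketched in a few lines. This is only a minimal illustration, not the paper's implementation: boxes are assumed to be `(x1, y1, x2, y2)` tuples, "size" is assumed to mean box area, and `box_area` and `max_size_proposal` are hypothetical helper names.

```python
def box_area(box):
    """Area of an (x1, y1, x2, y2) box; degenerate boxes count as zero."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def max_size_proposal(proposals):
    """Return the index of the largest-area proposal."""
    if not proposals:
        raise ValueError("need at least one proposal")
    return max(range(len(proposals)), key=lambda i: box_area(proposals[i]))


# Toy proposals for one image (made-up numbers).
proposals = [
    (10, 10, 50, 60),    # small box
    (0, 0, 200, 150),    # covers most of the image
    (120, 40, 160, 90),  # small box
]
idx = max_size_proposal(proposals)
print(idx, proposals[idx])
```

Unlike prediction-based assignment, this choice does not depend on the classifier's current scores, so an early wrong prediction cannot steer which box receives the image-level label.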

2 Method

2.1 preliminary

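Notation: $C_{test}$ is the test vocabulary, $C_{det}$ is the set of classes with box annotations, $D_{det}$ is the detection dataset, and $D_{cls}$ is the image-labeled classification dataset.

Traditional OD: $C_{test} = C_{det}$, $D_{cls} = \emptyset$

Open-vocabulary detection (OVD): allows $C_{test} \neq C_{det}$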

2.2 Detic

the whole idea is quite simple.

image.png
  1. Sample a batch from both $D_{det}$ and $D_{cls}$.

  2. If an image belongs to $D_{det}$, the loss is the typical OD loss: RPN loss + box regression loss + classification loss.

  3. If an image belongs to $D_{cls}$, the loss is the max-size loss: the proposal with the largest size is taken as the region for the image-level labels, and it is used to calculate the classification loss.
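The three steps above can be sketched as a per-image loss dispatch. This is a rough sketch under stated assumptions rather than the actual Detic code: the classification loss is assumed to be a sigmoid binary cross-entropy (the notes only say "cls loss"), "size" is assumed to mean box area, the detection loss terms are assumed to be computed elsewhere, and the dictionary keys and the names `bce_with_logits` and `image_loss` are hypothetical.

```python
import math


def bce_with_logits(logits, targets):
    """Mean sigmoid binary cross-entropy over classes (assumed cls loss)."""
    total = 0.0
    for z, t in zip(logits, targets):
        # Numerically stable form: max(z, 0) - z * t + log(1 + exp(-|z|))
        total += max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


def box_area(box):
    """Area of an (x1, y1, x2, y2) box."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def image_loss(sample):
    """Pick the loss for one image based on the dataset it came from."""
    if sample["source"] == "det":
        # Step 2: typical detection loss; the three terms are assumed to be
        # produced by the detector and are simply summed here.
        return sample["rpn_loss"] + sample["reg_loss"] + sample["cls_loss"]
    # Step 3: only image-level labels exist, so supervise the largest proposal.
    largest = max(sample["proposals"], key=lambda p: box_area(p["box"]))
    return bce_with_logits(largest["logits"], sample["image_labels"])


# Step 1: a mixed batch with one detection image and one classification image
# (all numbers are made up for illustration).
batch = [
    {"source": "det", "rpn_loss": 0.3, "reg_loss": 0.2, "cls_loss": 0.5},
    {
        "source": "cls",
        "image_labels": [1.0, 0.0],
        "proposals": [
            {"box": (0, 0, 10, 10), "logits": [-2.0, 2.0]},
            {"box": (0, 0, 100, 80), "logits": [3.0, -3.0]},
        ],
    },
]
losses = [image_loss(s) for s in batch]
print(losses)
```

In the classification branch only the largest proposal contributes to the loss; the smaller proposal's logits are ignored entirely, which is what makes the rule so simple.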

image.png