GhostNet

2020-03-07 · 海盗船长_coco

Paper: https://arxiv.org/abs/1911.11907
PyTorch implementation: https://github.com/iamhankai/ghostnet.pytorch

Motivation

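The authors visualized the feature maps produced by the first residual group of ResNet-50 and found many redundant, highly similar feature maps. This redundancy helps the network form a comprehensive understanding of the input. The paired feature maps in the figure below are very close to each other. In a standard network both maps of a pair are computed by ordinary convolution, even though one of them (feature map B) looks like a simple transformation of the other (feature map A).
The idea is to start from a small set of intrinsic feature maps (feature map A) and, instead of running another convolution, apply much cheaper linear operations to them to generate the many feature maps B. These generated "ghost" feature maps help reveal the information contained in the intrinsic features.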

Visualized feature maps

GhostModule

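The paper proposes a new module, the Ghost module (GhostModule in the code). Since the feature maps extracted by an ordinary convolution are largely redundant, it is enough to generate a few intrinsic feature maps A and then derive the many feature maps B from them with cheap linear transformations.
In the figure, the red part is the extraction of the intrinsic feature maps A. The green part is the linear transformation that turns the intrinsic features into the many feature maps B. It operates on one channel at a time, much like the depthwise convolution in a depthwise separable convolution. Finally, in the spirit of ResNet's skip connection, the intrinsic features are passed through unchanged and concatenated with the generated ones along the channel dimension. The output has s times as many channels as the intrinsic features, so the generated part has (s-1) times as many. s is a hyperparameter; the figure below shows the parameter count and accuracy for different values of s.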

GhostModule

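Weights (parameter count) and accuracy for different s
Implementation. In the code comments, the "yellow part" is the intrinsic features inside the red box, and the "red part" is the feature maps at the bottom of Output.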

import math

import torch
import torch.nn as nn


class GhostModule(nn.Module):
    def __init__(self, in_channel, out_channel, kernel_size=1, ratio=2, dw_size=3, stride=1, relu=True):
        super(GhostModule, self).__init__()
        self.out_channel = out_channel
        mid_channel = math.ceil(out_channel / ratio)  # yellow part: number of intrinsic channels
        rest_channels = mid_channel * (ratio - 1)  # red part: number of remaining (ghost) channels
        # Generate the yellow part with an ordinary convolution
        self.primary_conv = nn.Sequential(
            nn.Conv2d(in_channel, mid_channel, kernel_size, stride, kernel_size // 2, bias=False),
            nn.BatchNorm2d(mid_channel),
            nn.ReLU6(inplace=True)
        )
        # Derive the red part from the yellow part with a cheap depthwise convolution
        self.cheap_operation = nn.Sequential(
            nn.Conv2d(mid_channel, rest_channels, dw_size, 1, dw_size // 2, groups=mid_channel, bias=False),
            nn.BatchNorm2d(rest_channels)
        )
        self.bn = nn.BatchNorm2d(out_channel)
        if relu:
            self.relu = nn.ReLU6(inplace=True)
        else:
            self.relu = None

    def forward(self, x):
        x1 = self.primary_conv(x)
        x2 = self.cheap_operation(x1)
        out = torch.cat([x1, x2], dim=1)
        # The concatenation can exceed out_channel when it is not divisible by ratio
        out = out[:, :self.out_channel, :, :]
        out = self.bn(out)
        if self.relu is not None:
            out = self.relu(out)
        return out
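
The channel bookkeeping in GhostModule can be checked without PyTorch. The following is a minimal sketch of that arithmetic only; the helper name ghost_channels is hypothetical and does not appear in the referenced repository. It shows why the final slice in forward is needed when out_channel is not divisible by ratio.

```python
import math


def ghost_channels(out_channel, ratio=2):
    """Return (intrinsic, generated, kept) channel counts for one Ghost module.

    Hypothetical helper that mirrors the arithmetic in GhostModule.__init__.
    """
    intrinsic = math.ceil(out_channel / ratio)  # channels from the primary convolution
    generated = intrinsic * (ratio - 1)         # channels from the cheap depthwise operation
    kept = out_channel - intrinsic              # generated channels that survive the final slice
    return intrinsic, generated, kept


for out_channel, ratio in [(16, 2), (5, 2), (10, 3)]:
    print(out_channel, ratio, ghost_channels(out_channel, ratio))
```

With out_channel=10 and ratio=3, for example, the module produces 4 intrinsic and 8 generated channels (12 in total), and the slice keeps the first 10.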

Parameter count and computation compared with a standard convolution:

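Ghost module vs. ordinary convolution
Here the k×k kernel of the ordinary convolution and the d×d kernel of the linear operation are of similar size (k ≈ d), and s is much smaller than the number of input channels c, so a Ghost module needs roughly 1/s of the parameters of the ordinary convolution, as the comparison below shows.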
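
Written out in the notation of the GhostNet paper (c input channels, n output channels, h'×w' output size; these symbols come from the paper rather than from the text above), the speed-up ratio r_s and the compression ratio r_c are approximately:

```latex
r_s = \frac{n \cdot h' \cdot w' \cdot c \cdot k \cdot k}
           {\frac{n}{s} \cdot h' \cdot w' \cdot c \cdot k \cdot k
            + (s-1) \cdot \frac{n}{s} \cdot h' \cdot w' \cdot d \cdot d}
    = \frac{c \cdot k \cdot k}{\frac{1}{s} \cdot c \cdot k \cdot k + \frac{s-1}{s} \cdot d \cdot d}
    \approx \frac{s \cdot c}{s + c - 1} \approx s

r_c = \frac{n \cdot c \cdot k \cdot k}
           {\frac{n}{s} \cdot c \cdot k \cdot k + (s-1) \cdot \frac{n}{s} \cdot d \cdot d}
    \approx \frac{s \cdot c}{s + c - 1} \approx s
```

The first approximation uses k ≈ d and the second uses s ≪ c.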

Parameter-count comparison between the Ghost module and an ordinary convolution
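
The comparison can also be checked numerically. The sketch below counts weights only (no bias, no BatchNorm) and assumes the number of output channels n is divisible by s; the function names are illustrative and not taken from the paper or the repository.

```python
def standard_conv_params(c, n, k):
    """Weights of a standard k x k convolution with c input and n output channels."""
    return c * n * k * k


def ghost_module_params(c, n, k, d, s):
    """Weights of a Ghost module; assumes n is divisible by s."""
    m = n // s                   # intrinsic feature maps from the primary convolution
    primary = c * m * k * k      # ordinary convolution producing the m intrinsic maps
    cheap = m * (s - 1) * d * d  # one depthwise d x d filter per generated map
    return primary + cheap


c, n, k, d, s = 256, 256, 3, 3, 2
ratio = standard_conv_params(c, n, k) / ghost_module_params(c, n, k, d, s)
print(ratio)                     # close to s
print(s * c / (s + c - 1))       # the closed-form approximation
```

When k equals d, the measured ratio coincides with s·c/(s+c-1), which approaches s as c grows.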

Ghost bottleneck

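A Ghost bottleneck is built from two GhostModules and, like a residual block, has a skip connection. Its downsample (shortcut) branch, however, uses a depthwise separable convolution instead of the ordinary convolution used in ResNet.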

Ghost bottleneck
Code:

# depthwise_conv and SE_block are helper definitions that are not shown in this post.
class GhostBottleneck(nn.Module):
    def __init__(self, in_channel, mid_channel, out_channel, kernel_size, stride, use_se):
        super(GhostBottleneck, self).__init__()
        assert stride in [1, 2]

        # Main path: expand -> (strided depthwise conv) -> (SE) -> project
        self.conv = nn.Sequential()
        self.conv.add_module('GhostModule1', GhostModule(in_channel, mid_channel, kernel_size=1, relu=True))
        if stride == 2:
            self.conv.add_module('DWconv', depthwise_conv(mid_channel, mid_channel, kernel_size, stride,
                                                          relu=False))
        if use_se:
            self.conv.add_module('se block', SE_block(mid_channel))
        self.conv.add_module('GhostModule2', GhostModule(mid_channel, out_channel, kernel_size=1, relu=False))

        if stride == 1 and in_channel == out_channel:
            self.downsample = None  # identity shortcut
        else:
            # Depthwise separable shortcut. Only the depthwise conv is strided; the
            # pointwise 1x1 conv keeps the spatial size, so both paths match when stride == 2.
            self.downsample = nn.Sequential(
                depthwise_conv(in_channel, in_channel, 3, stride, relu=True),
                nn.Conv2d(in_channel, out_channel, kernel_size=1, stride=1, padding=0, bias=False),
                nn.BatchNorm2d(out_channel),
            )

    def forward(self, x):
        if self.downsample is None:
            return self.conv(x) + x
        return self.conv(x) + self.downsample(x)
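
GhostBottleneck relies on two helpers, depthwise_conv and SE_block, whose definitions are not included above. The following is one plausible sketch of them, not the code from the referenced repository: the default arguments, the choice of ReLU6, and the SE reduction factor of 4 are all assumptions made for illustration.

```python
import torch
import torch.nn as nn


def depthwise_conv(in_channel, out_channel, kernel_size=3, stride=1, relu=False):
    """Depthwise convolution + BatchNorm, optionally followed by ReLU6 (assumed activation)."""
    layers = [
        # groups=in_channel gives every input channel its own filter
        nn.Conv2d(in_channel, out_channel, kernel_size, stride, kernel_size // 2,
                  groups=in_channel, bias=False),
        nn.BatchNorm2d(out_channel),
    ]
    if relu:
        layers.append(nn.ReLU6(inplace=True))
    return nn.Sequential(*layers)


class SE_block(nn.Module):
    """Squeeze-and-Excitation: rescale channels using globally pooled statistics."""

    def __init__(self, channel, reduction=4):  # reduction=4 is an assumed default
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channel, channel // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channel // reduction, channel),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        weights = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * weights


x = torch.randn(2, 16, 32, 32)
y = depthwise_conv(16, 16, kernel_size=3, stride=2)(x)  # halves the spatial size
z = SE_block(16)(x)                                     # keeps the input shape
print(y.shape, z.shape)
```

With padding of kernel_size // 2, a stride-2 depthwise convolution halves an even spatial size for both 3×3 and 5×5 kernels, which is what lets the main path and the shortcut be added together.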

The final backbone network is simply a stack of these Ghost bottlenecks.