
What is grouped convolution in ResNeXt?

2019-12-29  Jinglever

Please credit the source when reposting: https://www.jianshu.com/p/328b29d20403 If you find this useful, please give it a like~

ResNeXt uses a structure called grouped convolution, shown in the figure below:

blocks of ResNeXt

In PyTorch's convolution modules (e.g. Conv2d) there is a parameter called groups, and it implements exactly this grouped-convolution logic. Here is the official documentation for the parameter:

  • :attr:`groups` controls the connections between inputs and outputs.
    :attr:`in_channels` and :attr:`out_channels` must both be divisible by
    :attr:`groups`. For example,
    * At groups=1, all inputs are convolved to all outputs.
    * At groups=2, the operation becomes equivalent to having two conv
      layers side by side, each seeing half the input channels,
      and producing half the output channels, and both subsequently
      concatenated.
    * At groups= :attr:`in_channels`, each input channel is convolved with
      its own set of filters, of size:
      :math:`\left\lfloor\frac{out\_channels}{in\_channels}\right\rfloor`.
Roughly, this means: the input channels are split into groups, each group is convolved with its own set of filters, and the results are concatenated along the channel dimension to form the output. Each group produces out_channels // groups output channels, and each filter within a group only sees in_channels // groups input channels.
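This can be checked directly in PyTorch. The sketch below (channel sizes chosen arbitrarily for illustration, not taken from the article) shows that the weight tensor of a grouped Conv2d has shape (out_channels, in_channels // groups, kH, kW):

```python
import torch
import torch.nn as nn

# A grouped convolution: 64 input channels, 128 output channels, 4 groups.
conv = nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3,
                 padding=1, groups=4, bias=False)

# Each group sees 64 // 4 = 16 input channels and produces 128 // 4 = 32
# output channels, so the weight shape is (128, 16, 3, 3) rather than
# the (128, 64, 3, 3) of an ordinary convolution.
print(conv.weight.shape)   # torch.Size([128, 16, 3, 3])

x = torch.randn(1, 64, 56, 56)
print(conv(x).shape)       # torch.Size([1, 128, 56, 56])
```

Note that grouping cuts the parameter count by a factor of groups, which is part of why ResNeXt can widen the bottleneck without a cost explosion.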

In ResNeXt's terminology, the number of groups in the grouped convolution is the cardinality C, the channel count of each group's filters is the bottleneck width d, and the output channel count of the grouped convolution is the width of the group conv. See the figure below:

(out_channels = C * d)
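The documentation's claim that groups=C is equivalent to C convolution branches side by side can also be verified numerically. This sketch uses the C = 32, d = 4 setting of ResNeXt-50 (32x4d) and compares a groups=C convolution against manually splitting the input and weight into C chunks:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

C, d = 32, 4                 # cardinality and bottleneck width, ResNeXt-50 (32x4d)
ch = C * d                   # 128 channels in the first-stage bottleneck

conv = nn.Conv2d(ch, ch, kernel_size=3, padding=1, groups=C, bias=False)

x = torch.randn(1, ch, 56, 56)
x_chunks = x.chunk(C, dim=1)            # C inputs of d channels each
w_chunks = conv.weight.chunk(C, dim=0)  # C weights of shape (d, d, 3, 3)

# Run the C "branches" independently and concatenate, as the docs describe.
branches = [F.conv2d(xi, wi, padding=1) for xi, wi in zip(x_chunks, w_chunks)]
y_manual = torch.cat(branches, dim=1)

print(torch.allclose(conv(x), y_manual, atol=1e-5))  # True
```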

Now look at the ResNeXt-50 (32x4d) architecture shown in the paper:

As you can see, the channel counts of ResNeXt's bottleneck convolutions (128 -> 256 -> 512 -> 1024) grow by the same factors as ResNet's (64 -> 128 -> 256 -> 512). How is this written in code?

Looking at the source of the resnet models in torchvision, you can find this expression:

 width = int(planes * (base_width / 64.)) * groups

width is what gets passed as out_channels to the convolutions inside the bottleneck. At first glance, this expression is not easy to understand, so I rewrote it as:

width = int(base_width * groups * (planes/64.))

Is the meaning clearer now? For the first bottleneck of ResNeXt-50 (32x4d), the convolution channel count is 128, i.e. width = base_width * groups = 4 * 32 = 128. From the second bottleneck stage onward, width doubles at each stage, by the same factor as in ResNet-50, namely planes / 64., which gives the formula above for computing width.
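The per-stage widths can be checked with the rewritten formula (the helper name resnext_width is mine, not torchvision's):

```python
def resnext_width(planes, base_width=4, groups=32):
    # torchvision's formula, rearranged as in the text:
    # base case (planes=64) gives base_width * groups = 128,
    # and planes / 64. supplies the same doubling factor as ResNet-50.
    return int(base_width * groups * (planes / 64.))

# Bottleneck widths for the four stages of ResNeXt-50 (32x4d):
print([resnext_width(p) for p in (64, 128, 256, 512)])  # [128, 256, 512, 1024]
```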
