
Real 3D / Volumetric CNN for me

2017-01-09  MrGiovanni

Author: Zongwei Zhou | 周纵苇
Weibo: @MrGiovanni
Email: zongweiz@asu.edu
Original post: http://zongwei.leanote.com/post/3D


Reviews

[1] Automatic Detection of Cerebral Microbleeds From MR Images via 3D Convolutional Neural Networks. paper

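1. Screening strategy > conventional sliding window strategy. The screening stage is effectively a 3D fully convolutional network: a 3D volume goes in and a 3D score map comes out. This gives a first set of candidate coordinates (Region of Interest, ROI). The set still contains many false positives, but it is far more efficient than exhaustive scanning. The problem is that, judging from TABLE 1, the architecture does not look like a Fully Convolutional Network at all; it looks like an ordinary classification network. It is not clear to me how the authors obtain the score map.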

THE ARCHITECTURE OF 3D FCN SCREENING MODEL

2. Discrimination stage removes a large number of false positive candidates. This is effectively a 3D CNN that classifies 3D patches. ReLU is used in the C and FC layers.
3D CNN architecture details: The 3D convolution kernels are randomly initialized from a Gaussian distribution (learning from scratch), the optimizer is SGD, and the loss function is cross-entropy loss. Dropout is also used. lr=0.03, momentum=0.9, dropout rate=0.3, batch size=100.
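
On the question raised in item 1 of how a classification-style architecture can yield a score map: one standard trick is to reinterpret a fully connected layer as a convolution whose kernel covers its whole input, so that the trained patch classifier can be applied to the full volume in one pass. Whether the authors do exactly this is an assumption here. Below is a minimal pure-Python sketch with a single linear scoring layer; the names `patch_score`, `sliding_window_scores`, and `conv3d_valid` are hypothetical, and the toy sizes are made up.

```python
import random

def patch_score(patch_flat, weights, bias):
    """A 'fully connected' scorer: dot product over a flattened patch."""
    return sum(p * w for p, w in zip(patch_flat, weights)) + bias

def sliding_window_scores(vol, k, weights, bias):
    """Crop every k*k*k patch, flatten it, and score it separately."""
    d, h, w = len(vol), len(vol[0]), len(vol[0][0])
    out = {}
    for z in range(d - k + 1):
        for y in range(h - k + 1):
            for x in range(w - k + 1):
                flat = [vol[z + a][y + b][x + c]
                        for a in range(k) for b in range(k) for c in range(k)]
                out[(z, y, x)] = patch_score(flat, weights, bias)
    return out

def conv3d_valid(vol, kernel, bias):
    """Valid 3D cross-correlation of the whole volume with one kernel."""
    k = len(kernel)
    d, h, w = len(vol), len(vol[0]), len(vol[0][0])
    out = {}
    for z in range(d - k + 1):
        for y in range(h - k + 1):
            for x in range(w - k + 1):
                acc = bias
                for a in range(k):
                    for b in range(k):
                        for c in range(k):
                            acc += vol[z + a][y + b][x + c] * kernel[a][b][c]
                out[(z, y, x)] = acc
    return out

random.seed(0)
k = 2  # toy patch size (assumption, not the paper's)
vol = [[[random.random() for _ in range(4)] for _ in range(4)] for _ in range(4)]
weights = [random.uniform(-1, 1) for _ in range(k ** 3)]
bias = 0.1

# Reshape the flat FC weights into a k*k*k convolution kernel,
# using the same (a, b, c) order as the flattening above.
kernel = [[[weights[(a * k + b) * k + c] for c in range(k)]
           for b in range(k)] for a in range(k)]

fc_map = sliding_window_scores(vol, k, weights, bias)
conv_map = conv3d_valid(vol, kernel, bias)
max_diff = max(abs(fc_map[p] - conv_map[p]) for p in fc_map)
```

The two score maps agree up to floating-point rounding. In this toy both routes do the same amount of work; in a real network the saving comes from sharing the convolutional features of overlapping patches instead of recomputing them per window.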

512 $\times$ 512 $\times$ 150 image $\longrightarrow$ 3D FCN $\longrightarrow$ 512 $\times$ 512 $\times$ 150 score map $\longrightarrow$ threshold ($\mathcal{T}$ = 0.64) $\longrightarrow$ 20 $\times$ 20 $\times$ 16 patch $\longrightarrow$ 3D CNN $\longrightarrow$ labeled.
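
The hand-off from screening to discrimination in the pipeline above can be sketched as follows. This is a minimal illustration, not the authors' code: the helper names, the strict `>` comparison, the (z, y, x) axis order, and the policy of shifting patches that would cross the volume border are all assumptions. Only the threshold 0.64, the volume size, and the 20 x 20 x 16 patch size come from the text.

```python
def candidate_voxels(score_map, threshold=0.64):
    """Return (z, y, x) of voxels whose score exceeds the threshold."""
    return [(z, y, x)
            for z, plane in enumerate(score_map)
            for y, row in enumerate(plane)
            for x, s in enumerate(row)
            if s > threshold]

def patch_bounds(center, shape, size):
    """Half-open [start, stop) bounds of a patch centred on `center`.

    The patch is shifted (not shrunk) when it would cross the border,
    so it keeps the full `size` as long as size <= shape on every axis.
    """
    bounds = []
    for c, n, s in zip(center, shape, size):
        start = min(max(c - s // 2, 0), n - s)
        bounds.append((start, start + s))
    return bounds

# Toy 4 x 4 x 4 score map with two voxels above the threshold.
score_map = [[[0.0] * 4 for _ in range(4)] for _ in range(4)]
score_map[1][2][3] = 0.9
score_map[3][0][0] = 0.7
score_map[2][2][2] = 0.5  # below the threshold, ignored

candidates = candidate_voxels(score_map)

# With (z, y, x) ordering, a 512 x 512 x 150 volume has shape (150, 512, 512)
# and a 20 x 20 x 16 patch has size (16, 20, 20).
bounds = patch_bounds((5, 10, 510), shape=(150, 512, 512), size=(16, 20, 20))
```

Each candidate's patch would then be cropped with these bounds and passed to the 3D CNN for the final label.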

1. 3D FCN is better than two other methods: Barnes et al. and Chen et al.

COMPARISON OF DIFFERENT SCREENING METHODS

2. Good detection performance

EVALUATION OF DETECTION RESULTS
FROC COMPARISON
The baselines compared against are Barnes et al., random forest, and 2D-CNN-SVM.

3. Better intermediate feature representation.

FEATURE REPRESENTATION
This comparison is quite novel; the tool used is the t-SNE toolbox.

[2] Multi-level Contextual 3D CNNs for False Positive Reduction in Pulmonary Nodule Detection. paper

1. Multi-level contextual receptive field.

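FUSION OF THREE 3D CNNs
In essence this fuses the predictions of three different 3D CNNs, each trained on input patches of a different size, i.e. a "multi-scale" CNN. In theory the advantage is that both local detail and global context are used. We had considered this approach ourselves, and many researchers have done the same in 2D. A multi-scale method has to define what the "scales" are, so the authors ran a statistical analysis of the dataset, shown in the figure below.
DISTRIBUTION ANALYSIS OF THE SIZES OF PULMONARY NODULES FOR DETERMINING RECEPTIVE FIELDS.
This way of choosing the scales feels rather primitive and is hard to reuse in practice: it requires statistics over the dataset, and it is uncertain whether the chosen samples are statistically representative or whether the scales still hold for new data. The authors define the sizes in voxels; as a first step, I think they could be given as absolute physical sizes (mm) instead.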
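
To make the voxels-versus-mm point concrete, here is a small sketch of how a receptive field fixed in millimetres maps to different voxel counts depending on scan spacing. The function name `physical_to_voxels`, the 30 mm cube, and the two spacings are hypothetical values chosen only for illustration.

```python
import math

def physical_to_voxels(size_mm, spacing_mm):
    """Voxels needed per axis to cover `size_mm` millimetres."""
    return tuple(math.ceil(s / sp) for s, sp in zip(size_mm, spacing_mm))

# Hypothetical scans: the same 30 mm cube under two voxel spacings,
# given as (z, y, x) in mm. These numbers are not from the paper.
thin_slices = physical_to_voxels((30.0, 30.0, 30.0), (1.25, 0.7, 0.7))
thick_slices = physical_to_voxels((30.0, 30.0, 30.0), (2.5, 0.7, 0.7))
```

The same physical extent spans 24 slices in the first scan and 12 in the second, so a patch size fixed in voxels covers different anatomy unless the data are resampled to a common spacing.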

2. Multi-model fusion
Next, the fusion of the three 3D networks. Their architectures are listed in the table:

THE ARCHITECTURE OF DIFFERENT RECEPTIVE FIELD 3D CNN
Fuse the softmax regression outputs (probabilities) from all networks. The fused posterior probability $P_{fusion}$ is estimated by weighted linear combination:
$$P_{fusion}=\sum_{i\in\{1,2,3\}}\gamma_i\cdot P_i$$
The constant weights $\gamma_i$ were determined using grid search on a small subset of the training data in our experiments ($\gamma_1=0.3$, $\gamma_2=0.4$, $\gamma_3=0.3$).
This fusion does not actually happen inside the network; it is just a simple combination of the output probabilities, a surface-level "fusion". There are other ways to fuse, such as concatenating the fully connected layers of the three CNNs. The idea is to make the fusion part of back propagation, which is the kind of fusion I find more convincing.
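
The weighted linear combination and the grid search over $\gamma_i$ can be sketched as below. This is an illustration under assumptions: the paper does not say (in the passage quoted here) which criterion the grid search optimizes, so plain accuracy at a 0.5 cutoff is used, the weights are constrained to sum to 1, and the validation samples are made up.

```python
from itertools import product

def fuse(probs, gammas):
    """Weighted linear combination of per-network probabilities."""
    return sum(g * p for g, p in zip(gammas, probs))

def accuracy(samples, gammas, cutoff=0.5):
    """Fraction of samples whose fused probability matches the label."""
    hits = sum((fuse(p, gammas) >= cutoff) == bool(y) for p, y in samples)
    return hits / len(samples)

def grid_search(samples, step=0.1):
    """Try all (g1, g2, g3) on a grid with g1 + g2 + g3 = 1."""
    n = round(1 / step)
    best = None
    for i, j in product(range(n + 1), repeat=2):
        if i + j > n:
            continue
        gammas = (i / n, j / n, (n - i - j) / n)
        score = accuracy(samples, gammas)
        # Ties keep the first setting found; several settings may tie.
        if best is None or score > best[0]:
            best = (score, gammas)
    return best

# Toy validation set: ((P_1, P_2, P_3), label). Values are made up so that
# no single network classifies all four samples correctly.
samples = [((0.9, 0.8, 0.7), 1), ((0.2, 0.6, 0.1), 0),
           ((0.4, 0.7, 0.4), 1), ((0.6, 0.2, 0.3), 0)]

p_fusion = fuse((0.9, 0.8, 0.7), (0.3, 0.4, 0.3))
best_score, best_gammas = grid_search(samples)
```

Because the weights are searched after training, nothing is back-propagated through the fusion, which is the limitation noted above.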

I think this part is a useful reference:
The challenge evaluated detection results by measuring the detection sensitivity and average false positive rate per scan. A predicted candidate location was counted as a true positive if it was located within the radius of a true nodule center. (How a true positive is defined is critical for plotting the FROC curve.) Detections of irrelevant findings were ignored (i.e., considered as neither false positives nor true positives) in the evaluation. The challenge organizers performed the free receiver operation characteristic (FROC) analysis by setting different thresholds on the raw prediction probabilities submitted by the participating teams. The evaluation also computed the 95% confidence interval using the bootstrapping [36]. A competition performance metric (CPM) score [37], which was calculated as the average sensitivity at seven predefined false positive rates: 1/8, 1/4, 1/2, 1, 2, 4 and 8 false positives per scan, was produced for each algorithm. The ten-fold cross validation on the dataset was specified.
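
The FROC sweep and the CPM score described above can be sketched as follows. This is not the official evaluation script. It assumes that irrelevant findings have already been removed, that every true-positive detection hits a distinct nodule, that no two detections share a probability, and that sensitivity at a given rate is read off as a step function rather than interpolated. The function names and the toy detections are made up.

```python
def froc_points(detections, n_nodules, n_scans):
    """Sweep the threshold over detections sorted by probability.

    `detections` is a list of (probability, is_true_positive).
    Returns (false positives per scan, sensitivity) pairs.
    """
    points = []
    tp = fp = 0
    for _prob, is_tp in sorted(detections, key=lambda d: -d[0]):
        if is_tp:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_scans, tp / n_nodules))
    return points

def cpm(points, rates=(0.125, 0.25, 0.5, 1, 2, 4, 8)):
    """Average sensitivity at the seven predefined FP-per-scan rates."""
    sens = []
    for r in rates:
        reachable = [s for f, s in points if f <= r]
        sens.append(max(reachable) if reachable else 0.0)
    return sum(sens) / len(sens)

# Toy example: 2 scans containing 4 nodules in total.
detections = [(0.95, True), (0.90, True), (0.80, False), (0.70, True),
              (0.60, False), (0.50, False), (0.40, True), (0.30, False)]

points = froc_points(detections, n_nodules=4, n_scans=2)
score = cpm(points)
```

In this toy the sensitivities at the seven rates are 0.5, 0.5, 0.75, 0.75, 1, 1, 1, so the CPM is 5.5 / 7.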

1. 3D > 2D

3D vs 2D CNN detection

2. Fusion multi-level > single level

FROC ANALYSIS FOR DIFFERENT LEVEL

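At the end of the paper the authors show a visualization of the 3D convolution kernels. I am not sure what purpose it serves or what result it is supposed to demonstrate.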

[3] 3D Deeply Supervised Network for Automatic Liver Segmentation from CT Volumes. paper

This paper feels to me like a 3D HED (paper), or a 3D Fully Convolutional Network (paper). Compare their architectures:

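3D DSN
HED
FCN
All of them combine the output maps of intermediate layers to make the final segmentation prediction. The questions this structure raised for me were how back propagation is designed, how the intermediate layers are combined, how the combination weights are learned, and whether those weights are also part of back propagation.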

1. vanishing gradients problem
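The paper raises the vanishing gradient problem, which may be even more severe in 3D networks. The solution is to build the loss from the prediction outputs of several intermediate layers: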
$$\mathcal{L}=\mathcal{L}_{o}(\mathcal{X};W)+\sum_{h}\eta_h\cdot\mathcal{L}_{h}(\mathcal{X};W_h,w_h)+[regularization]$$
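The weights $\eta_h$ control the importance of each hidden layer, which is meant to counter vanishing gradients in the early layers. I personally do not find this very convincing: once the gradient vanishes it is tiny, effectively 0, so it would have to be multiplied by a very large weight to bring it back up, and even then the vanishing gradient is not fundamentally solved. Also, ReLU was apparently proposed to address this very problem; I am not sure whether vanishing gradients still need to be considered in 3D when this activation is used.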
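
A minimal sketch of how the deeply supervised objective could be assembled from per-layer predictions. The binary cross-entropy, the L2 form of the regularization term, the function names, and all numbers (including the $\eta_h$ values) are assumptions for illustration; the source only gives the general form of $\mathcal{L}$.

```python
import math

def cross_entropy(prob_fg, labels):
    """Mean binary cross-entropy over voxels."""
    eps = 1e-12
    total = 0.0
    for p, y in zip(prob_fg, labels):
        p = min(max(p, eps), 1 - eps)  # guard against log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(labels)

def deeply_supervised_loss(out_probs, hidden_probs, labels, etas,
                           weights=(), weight_decay=0.0):
    """L = L_o + sum_h eta_h * L_h + weight_decay * ||W||^2 (L2 assumed)."""
    loss = cross_entropy(out_probs, labels)
    for eta, probs in zip(etas, hidden_probs):
        loss += eta * cross_entropy(probs, labels)
    loss += weight_decay * sum(w * w for w in weights)
    return loss

# Toy voxel labels and predictions (made up): the hidden-layer
# predictions are deliberately less confident than the output.
labels = [1, 0, 1, 0]
out_probs = [0.9, 0.1, 0.8, 0.2]
hidden_probs = [[0.6, 0.4, 0.6, 0.4], [0.7, 0.3, 0.7, 0.3]]

main_loss = cross_entropy(out_probs, labels)
total_loss = deeply_supervised_loss(out_probs, hidden_probs, labels,
                                    etas=(0.3, 0.4))
```

One way to read the auxiliary terms is that each $\mathcal{L}_h$ is computed from a hidden layer's own prediction, so its gradient reaches the early layers through a shorter path rather than by rescaling the gradient that arrives from the output layer.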

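2. Conditional random field (CRF) model
This is where academic depth really shows, and it is a major reason I feel my undergraduate training falls short: normally it would never occur to me to refine the results with this model. The paper gives it very little space, so it needs further study. What I do know is that the authors introduce several parameters ($\mu_1$, $\mu_2$, $\theta_{\alpha}$, $\theta_{\beta}$, $\theta_{\gamma}$) to solve an energy function, and these are again set by grid search.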
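
The five parameter names match the pairwise term of the widely used fully connected CRF formulation (an appearance kernel plus a smoothness kernel). Assuming the paper follows that formulation, which is not confirmed by the notes above, the pairwise kernel between two voxels could be sketched like this; the function name and every numeric value are hypothetical.

```python
import math

def pairwise_kernel(p_i, p_j, I_i, I_j, mu1, mu2,
                    theta_alpha, theta_beta, theta_gamma):
    """Appearance kernel + smoothness kernel between voxels i and j.

    p_* are voxel positions, I_* are intensities. Assumed form:
      mu1 * exp(-|p_i-p_j|^2 / (2*theta_alpha^2) - (I_i-I_j)^2 / (2*theta_beta^2))
    + mu2 * exp(-|p_i-p_j|^2 / (2*theta_gamma^2))
    """
    d2 = sum((a - b) ** 2 for a, b in zip(p_i, p_j))
    appearance = mu1 * math.exp(-d2 / (2 * theta_alpha ** 2)
                                - (I_i - I_j) ** 2 / (2 * theta_beta ** 2))
    smoothness = mu2 * math.exp(-d2 / (2 * theta_gamma ** 2))
    return appearance + smoothness

# Made-up parameter values; in the paper they are chosen by grid search.
params = dict(mu1=5.0, mu2=3.0, theta_alpha=20.0, theta_beta=30.0, theta_gamma=3.0)

same_voxel = pairwise_kernel((0, 0, 0), (0, 0, 0), 100, 100, **params)
similar = pairwise_kernel((0, 0, 0), (0, 0, 2), 100, 105, **params)
different = pairwise_kernel((0, 0, 0), (0, 0, 2), 100, 200, **params)
```

Nearby voxels with similar intensity get a larger kernel value than nearby voxels with very different intensity, which is what encourages label boundaries to follow intensity edges.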

1. 3D DSN > 3D CNN | CRF works well

EVALUATION
VISUALIZATION

2. Shorter runtime - 5s for 3D DSN and 87s for CRF.

COMPARISON WITH OTHER TEAM
As can be seen, the 3D network itself runs very quickly, while the CRF processing is time-consuming.

[4] 3D Fully Convolutional Networks for Intervertebral Disc Localization and Segmentation. paper

Algorithmically, this paper simply turns the 2D FCN into a 3D FCN with no other improvements, and applies it to an intervertebral disc segmentation dataset.

1. 3D FCN > 2D FCN

TEST1
TEST2

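Overall, the argument of this paper is simple; the method is new (2D$\longrightarrow$3D) but fairly routine, and the conclusion is simple too. From my point of view it is still well worth studying, because getting published under these conditions is a real test of writing skill. For example, if I had to write up the experimental results, it would be one sentence: 3D FCN performs better than 2D FCN both in IVD localization and segmentation. Done. :-)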

[5] VoxResNet: Deep Voxelwise Residual Networks for Volumetric Brain Segmentation. H Chen, Q Dou, L Yu, P Heng [CUHK] (2016). paper.

The architecture of VoxResNet
auto-context
Comparison of VoxResNet, Auto-context VoxResNet and Ground truth

[6] Evaluation and comparison of 3D intervertebral disc localization and segmentation methods for 3D T2 MR data: A grand challenge. paper

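This journal paper is a fairly detailed account of intervertebral disc detection and segmentation [Review.4]. It also gave me a direct feel for the difference between conference and journal papers: a journal paper reads like a conference paper with every point expanded. Once the CVPR, IPMI, and MICCAI submissions are done, we should start submitting to journals too, fleshing out the conference work into a full journal paper. No time to read it closely! The review ends here.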


Related works

[1] V-Net: Fully Convolutional Neural Networks for Volumetric Medical Image Segmentation. Fausto Milletari, Nassir Navab, Seyed-Ahmad Ahmadi [Johns Hopkins University]. paper.

The architecture of V-Net


[2] 3D U-Net: Learning Dense Volumetric Segmentation from Sparse Annotation. Ahmed Abdulkadir, Soeren S. Lienkamp, Thomas Brox [University of Freiburg, Google Deepmind]. paper, code (Caffe).

2D U-Net Architecture
3D U-Net Architecture

[3] Deep MRI brain extraction: A 3D convolutional neural network for skull stripping. Jens Kleesiek, Gregor Urban, Alexander Hubert [Heidelberg University Hospital]. paper.

CNN architecture details

[4] Integrating Online and Offline 3D Deep Learning for Automated Polyp Detection in Colonoscopy Videos. Lequan Yu, Hao Chen, Qi Dou [CUHK] (2016). paper

Offline 3D FCN 1
Offline 3D FCN 2
Offline 3D FCN 3
Comparison

Discussions online

1. Are there any deep learning libraries that have 3D volumetric/spatial convolutions running on a CPU or a GPU?

A recent addition, but Keras now supports 3D convolution. It should work for voxels and video sequences.

2. 3D CNN in Keras - Action Recognition

3. Software: https://github.com/facebook/C3D


Separable 3D CNN

1. Reference papers

[1] Learning Separable Filters. Amos Sironi, Bugra Tekin, Roberto Rigamonti [EPFL] 2014. paper -- check Section 5.5.

2. Try on

Examine the separability of the kernels in the pre-trained CNNs, check http://www.mathworks.com/matlabcentral/fileexchange/28238-kernel-decomposition
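
As a starting point for that experiment, here is a small pure-Python check of whether a 3D kernel is exactly separable, i.e. whether it can be written as an outer product of three 1D filters. It is only a sketch: the function name `rank1_factors` and the tolerance are hypothetical, it tests exact rank-1 structure only, and learned kernels are usually only approximately separable, in which case an SVD-style low-rank approximation (as in the MATLAB tool linked above) would be the more realistic choice.

```python
def rank1_factors(kernel, tol=1e-9):
    """Return (a, b, c) with kernel[i][j][k] == a[i]*b[j]*c[k], else None."""
    n1, n2, n3 = len(kernel), len(kernel[0]), len(kernel[0][0])
    # Pivot on the largest-magnitude entry for numerical stability.
    p, q, r = max(((i, j, k) for i in range(n1)
                   for j in range(n2) for k in range(n3)),
                  key=lambda t: abs(kernel[t[0]][t[1]][t[2]]))
    pivot = kernel[p][q][r]
    if abs(pivot) <= tol:
        return None  # all-zero kernel: nothing useful to factor
    # Read the three 1D filters off the fibres through the pivot.
    a = [kernel[i][q][r] for i in range(n1)]
    b = [kernel[p][j][r] / pivot for j in range(n2)]
    c = [kernel[p][q][k] / pivot for k in range(n3)]
    # Accept only if the outer product reproduces every entry.
    for i in range(n1):
        for j in range(n2):
            for k in range(n3):
                if abs(a[i] * b[j] * c[k] - kernel[i][j][k]) > tol:
                    return None
    return a, b, c

# A 3 x 3 x 3 smoothing kernel built as an outer product of [1, 2, 1].
v = [1.0, 2.0, 1.0]
separable = [[[x * y * z for z in v] for y in v] for x in v]

# The same kernel with its centre perturbed is no longer rank 1.
perturbed = [[row[:] for row in plane] for plane in separable]
perturbed[1][1][1] += 1.0

factors = rank1_factors(separable)
not_separable = rank1_factors(perturbed)
```

For a separable k x k x k kernel, one 3D convolution can be replaced by three 1D convolutions, which cuts the multiplications per output voxel from k^3 to 3k (27 versus 9 for k = 3).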


Some Questions


Best wishes!
