Person Re-Identification by Deep

2017-06-03 __Vision

introduction

The paper says: "we formulate a method for joint learning of local and global feature selection losses designed to optimise person re-id when using only generic matching metrics such as the L2 distance." In other words, local and global features are learned jointly. The authors argue that "learning any matching distance metric is intrinsically learning a global feature transformation across domains", so a simple metric such as L2 is enough for matching, and the effort should go into feature extraction and representation.
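To make "matching with only a generic L2 metric" concrete, here is a minimal sketch of ranking a gallery by Euclidean distance to a probe feature. The function names, the identity labels and the toy 3-dimensional vectors are all made up for illustration; in practice the features would come from the trained network.

```python
import math

def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank_gallery(probe, gallery):
    """Return gallery ids sorted from nearest to farthest from the probe."""
    return sorted(gallery, key=lambda pid: l2_distance(probe, gallery[pid]))

# Hypothetical toy features; no learned metric is involved anywhere.
probe = [1.0, 0.0, 0.0]
gallery = {
    "id_a": [0.9, 0.1, 0.0],
    "id_b": [0.0, 1.0, 0.0],
    "id_c": [0.5, 0.5, 0.0],
}
ranking = rank_gallery(probe, gallery)
print(ranking)
```

All of the discriminative power has to sit in the features themselves, which is the point the authors are making.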
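Traditional hand-crafted approaches mainly extract local features, for example by cutting the image into horizontal stripes and processing each one. Deep learning (DL) methods mainly extract global features of the whole image. The authors argue that neither kind of feature is optimal on its own and that the two should be combined, because the human visual system processes both at once: global (contextual) and local (saliency) information. On reflection, that seems reasonable.
The network is designed from this angle. It has two branches, one extracting local features and one extracting global features, but the branches are not independent: they influence each other and are learned jointly. The benefit is that the network extracts local and global features at the same time and also learns how they relate, so the two complement each other in handling typical re-id problems such as local misalignment.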
In addition, the authors "introduce a structured sparsity based feature selection learning mechanism for improving multi-loss joint feature learning robustness w.r.t. noise and data covariance between local and global representations." Roughly, this is a sparsity-based regularisation intended to reduce the effect of noise.
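These notes do not spell out the regulariser, so the following is only a sketch of what "structured sparsity" usually means, assuming a group-lasso style ℓ2,1 penalty on a weight matrix. Whether the paper uses exactly this form is an assumption here, and the function names and toy matrices are hypothetical.

```python
import math

def l21_norm(weights):
    """Sum over rows of each row's L2 norm (group-lasso style penalty)."""
    return sum(math.sqrt(sum(w * w for w in row)) for row in weights)

def l1_norm(weights):
    """Plain L1 penalty: sum of absolute values, ignoring any grouping."""
    return sum(abs(w) for row in weights for w in row)

# Two toy weight matrices holding the same non-zero values.
one_active_row = [[3.0, 4.0], [0.0, 0.0]]   # all weight in a single row
two_active_rows = [[3.0, 0.0], [0.0, 4.0]]  # weight spread over both rows

print(l1_norm(one_active_row), l1_norm(two_active_rows))
print(l21_norm(one_active_row), l21_norm(two_active_rows))
```

The plain L1 penalty cannot tell the two matrices apart, while the ℓ2,1 penalty is lower when whole rows are switched off. That is the "structured" part: entire feature dimensions are selected or dropped together.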

related work

1. Saliency learning based models. These methods do not consider global features and mainly focus on "modelling localised part importance. However, these existing methods consider only the patch appearance statistics within individual locations but no global feature representation learning, let alone the correlation and complementary information discovery between local and global features as modelled by the JLML."
2. The Spatially Constrained Similarity (SCS) model and the Multi-Channel Parts (MCP) network. These two methods do consider global features as well as local ones. SCS focuses on supervised metric learning, but it does not model the relationship between hand-crafted local and global features. MCP is optimised mainly with a triplet ranking loss (which I am not familiar with), whereas JLML uses multiple classification losses. The former has a drawback: "Critically, this one-loss model learning is likely to impose negative influence on the discriminative feature learning behaviour for both branches due to potential over-low pre-branch independence and over-high inter-branch correlation. This may lead to sub-optimal joint learning of local and global feature selections in model optimisation, as suggested by our evaluation in Section 4.3."
3. The HER model. It mainly uses a regression loss, whereas JLML uses a classification loss.
4. DGD. I have read this paper closely; it also uses a classification loss. The difference from JLML is that DGD is one-loss classification while JLML is multi-loss classification.

model design

image.png

(Note that the ReLU rectification non-linearity [Krizhevsky et al., 2012] after each conv layer is omitted for brevity.)

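The two branches extract local and global features respectively. Joint learning shows up in two ways:
1. Sharing low-level features. This has two benefits: both branches reuse the same low-level features, and the parameter count drops, which guards against overfitting. That matters especially for re-id, where datasets are relatively small.
2. At the end, the two 512-dimensional feature vectors (local and global) are concatenated.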
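The data flow described above can be sketched with toy stand-ins. This is not the paper's network: the real branches are convolutional and each outputs a 512-dimensional vector, while here a "feature" is just a mean value and every function name, the stripe count and the 8x2 "image" are hypothetical. The sketch only shows the shape of the computation: one shared low-level step, a global branch over the whole map, a local branch over horizontal stripes, and a final concatenation.

```python
def shared_low_level(image):
    """Toy shared layer: scale pixel values to [0, 1]; computed once for both branches."""
    return [[p / 255.0 for p in row] for row in image]

def mean_pool(rows):
    """Average every value in a block of rows (stand-in for a conv branch)."""
    values = [v for row in rows for v in row]
    return sum(values) / len(values)

def split_into_stripes(rows, num_stripes):
    """Split a feature map (list of rows) into equal-height horizontal stripes."""
    if len(rows) % num_stripes != 0:
        raise ValueError("row count must be divisible by num_stripes")
    height = len(rows) // num_stripes
    return [rows[i * height:(i + 1) * height] for i in range(num_stripes)]

def global_branch(fmap):
    """Global feature: one descriptor computed from the whole map."""
    return [mean_pool(fmap)]

def local_branch(fmap, num_stripes):
    """Local feature: one descriptor per horizontal stripe."""
    return [mean_pool(s) for s in split_into_stripes(fmap, num_stripes)]

def joint_feature(image, num_stripes=4):
    """Concatenate global and local descriptors built on the shared map."""
    fmap = shared_low_level(image)
    return global_branch(fmap) + local_branch(fmap, num_stripes)

# Hypothetical 8x2 image with alternating dark and bright bands.
image = [[0, 0], [0, 0], [255, 255], [255, 255],
         [0, 0], [0, 0], [255, 255], [255, 255]]
feature = joint_feature(image)
print(feature)
```

The global descriptor averages the bands away, while the local descriptors keep them apart, which is the complementarity the authors are after.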

loss function

Their choice of loss function differs from most existing deep re-id methods: they mainly use a cross-entropy classification loss function. Existing deep re-id methods mainly use a contrastive loss, "designed to exploit pairwise re-id labels defined by both positive and negative pairs, such as the pairwise verification". One representative is "An improved deep learning architecture for person re-identification", CVPR 2015.
The reasons for this choice are as follows (quoted as is, since the argument is quite convincing): "The motivations for our JLML classification loss based learning are: (i) Significantly simplified training data batch construction, e.g. random sampling with no notorious tricks required, as shown by other deep classification methods [Krizhevsky et al., 2012]. This makes our JLML model more scalable in real-world applications with very large training population sizes when available. This also eliminates the undesirable need for carefully forming pairs and/or triplets in preparing re-id training splits, as in most existing methods, due to the inherent imbalanced negative and positive pair size distributions. (ii) Visual psychophysical findings suggest that representations optimised for classification tasks generalise well to novel categories [Edelman, 1998]. We consider that re-id tasks are about model generalisation to unseen test identity classes given training data on independent seen identity classes. Our JLML model learning exploits this general classification learning principle beyond the strict pair-wise relative verification loss in existing re-id models." The gist is to skip building positive and negative pairs and train directly on single images with their identity labels. The DGD paper follows the same idea.
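As a rough illustration of the multi-loss idea, the sketch below gives each branch its own softmax cross-entropy loss against the same identity label and adds the two. Summing with equal weight is an assumption made here for simplicity, and the function names and toy logits are hypothetical. Note that each training sample is a single labelled image, so no pairs or triplets have to be constructed.

```python
import math

def softmax_cross_entropy(logits, label):
    """Cross-entropy of softmax(logits) against an integer identity label."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

def multi_loss(global_logits, local_logits, label):
    """One classification loss per branch, summed (equal weighting assumed)."""
    return (softmax_cross_entropy(global_logits, label)
            + softmax_cross_entropy(local_logits, label))

# Toy example with 3 training identities; the true identity is index 0.
global_logits = [2.0, 0.5, 0.1]  # hypothetical global-branch classifier output
local_logits = [1.5, 1.0, 0.2]   # hypothetical local-branch classifier output
print(multi_loss(global_logits, local_logits, label=0))
```

A one-loss variant would instead compute a single cross-entropy on a classifier fed with the fused feature, which is the design the notes contrast JLML against.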

other

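The rest covers training details and ablations that compare the model with and without each component, showing that the components help. The clearest gain comes from combining global and local features: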

image.png

Also, letting each branch learn under its own loss works better than training the two together under a single loss:

image.png

The other factors, such as whether low-level features are shared, the choice of metric learning, and selective feature learning (the regularisation I could not follow), make little difference.
