Lecture 8 | (1/3) Convolutional

2019-10-21  Ysgc

We shouldn't have a single MLP over the entire input with one output; instead, scan the input with a smaller shared MLP.

At least one of the outputs is close to 1 if "welcome" is anywhere in the recording. Because the scanning networks share weights (w_ij = w_mn = w^s), the gradient for a shared weight is just the sum of the derivatives from every position where it is used, so a step on any one shared weight affects all the others.

Sum them up to get the total gradient for the shared weight.
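A minimal sketch of this summing rule, with hypothetical shapes: one shared 1-D filter `w` is scanned over an input `x`, and because the same `w` is applied at every position, the gradient of the loss with respect to `w` is the sum of the per-position gradients.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(8)   # input signal (hypothetical length)
w = rng.standard_normal(3)   # shared filter weights
T = len(x) - len(w) + 1      # number of scan positions

# forward: z_t = w . x[t:t+3]; toy loss L = sum_t z_t, so dL/dz_t = 1
z = np.array([w @ x[t:t+3] for t in range(T)])

# each use of w contributes its own gradient term dz_t/dw = x[t:t+3]
per_position = [x[t:t+3] * 1.0 for t in range(T)]

# total gradient for the shared weight = sum over positions
grad_w = sum(x[t:t+3] for t in range(T))
assert np.allclose(grad_w, np.sum(per_position, axis=0))
```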

All the convolution layers (i.e., all layers except the input layer and the softmax layer) can be computed in parallel.

Changing the order of computation parallelizes the algorithm: as soon as the first layer has processed a region, the second layer can start on it.

input: 3 channels -> first layer: 4 channels -> second layer: 2 channels -> output: 1 channel (flattened)
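The channel progression above can be sketched concretely. This is a hypothetical example with 3×3 filters and a 12×12 input; each layer's filter bank has shape (out_ch, in_ch, k, k), and a stride-1 "valid" convolution shrinks each spatial side by k−1.

```python
import numpy as np

def conv2d(x, w):
    """x: (in_ch, H, W), w: (out_ch, in_ch, k, k) -> (out_ch, H-k+1, W-k+1)."""
    out_ch, in_ch, k, _ = w.shape
    H, W = x.shape[1] - k + 1, x.shape[2] - k + 1
    y = np.zeros((out_ch, H, W))
    for i in range(H):
        for j in range(W):
            patch = x[:, i:i+k, j:j+k]                   # (in_ch, k, k)
            y[:, i, j] = np.tensordot(w, patch, axes=3)  # inner product per filter
    return y

x  = np.random.randn(3, 12, 12)                 # input: 3 channels
y1 = conv2d(x,  np.random.randn(4, 3, 3, 3))    # first layer: 4 channels
y2 = conv2d(y1, np.random.randn(2, 4, 3, 3))    # second layer: 2 channels
y3 = conv2d(y2, np.random.randn(1, 2, 3, 3))    # output: 1 channel
out = y3.flatten()                              # flatten before the final layer
print(y1.shape, y2.shape, y3.shape)             # (4, 10, 10) (2, 8, 8) (1, 6, 6)
```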

Layers first, then dimension: distributing the patterns through layers gives a better representation. The first layer looks only at a small region; the second layer behaves like the first layer, but one dot in the second layer looks at a much larger region of the input.

Higher layers implicitly learn the arrangement of the sub-patterns.

Extending this process to three layers: the first layer's region becomes even smaller.
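The growth of the region a unit "sees" can be computed with simple receptive-field arithmetic. This sketch assumes hypothetical 3×3 filters with stride 1 at every layer, in which case each extra layer adds k−1 input positions per side.

```python
def receptive_field(num_layers, k=3):
    """Receptive field (in input positions) of one unit after num_layers
    stride-1 convolution layers with kernel size k."""
    rf = 1
    for _ in range(num_layers):
        rf += k - 1  # each layer widens the seen region by k-1
    return rf

print([receptive_field(n) for n in (1, 2, 3)])  # [3, 5, 7]
```

So a unit in the third layer already responds to a 7×7 patch of the input even though every filter is only 3×3.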



Isn't this an elementwise multiplication rather than a tensor inner product? (In fact, elementwise multiplication followed by summing over all elements is exactly the tensor inner product.)


e.g.:
W: 9×3×5×5
Y: 3×5×5
W·Y: 9×1
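The example above, concretely: W holds 9 filters of shape (3, 5, 5), Y is one (3, 5, 5) patch, and "W·Y" is an elementwise multiply followed by a full sum per filter, i.e. a tensor inner product, giving 9 numbers.

```python
import numpy as np

W = np.random.randn(9, 3, 5, 5)   # 9 filters, each 3x5x5
Y = np.random.randn(3, 5, 5)      # one input patch

# tensor inner product: contract all three trailing axes of W against Y
out = np.tensordot(W, Y, axes=3)  # shape (9,)

# the same thing via the elementwise-multiplication view
ref = np.array([(W[f] * Y).sum() for f in range(9)])
assert np.allclose(out, ref)
```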

Max pooling is also a filter: a window scanned over the input, like convolution.
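A minimal sketch of max pooling as a scanned window, with a hypothetical 2×2 window and stride 2: each output value is the max of one window.

```python
import numpy as np

def max_pool(x, k=2, s=2):
    """Slide a k x k window with stride s over x and take the max of each."""
    H, W = x.shape
    return np.array([[x[i:i+k, j:j+k].max()
                      for j in range(0, W - k + 1, s)]
                     for i in range(0, H - k + 1, s)])

x = np.arange(16).reshape(4, 4)
print(max_pool(x))  # [[ 5  7]
                    #  [13 15]]
```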

An NN trained at one scale cannot be applied to a different scale -> the NN implicitly learns the size of the object (say, a flower) -> given flowers at different scales in the dataset -> it finds the task much harder to learn.
