2018-08-09 deep NN


preface:

deep learning: a class of highly complex data-modeling algorithms built from multiple layers of nonlinear transformations
in a sense, deep learning is equivalent to deep neural networks (DNNs)

linear models have severe limitations

stacking multiple linear layers is equivalent to a single linear layer
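A minimal NumPy sketch (my own illustration, not from the original note) of why stacking purely linear layers adds no expressive power:

import numpy as np

np.random.seed(0)
x = np.random.randn(4, 3)        # a batch of 4 samples with 3 features
W1 = np.random.randn(3, 5)       # first linear layer
W2 = np.random.randn(5, 2)       # second linear layer

two_layers = x.dot(W1).dot(W2)   # two stacked linear layers
one_layer = x.dot(W1.dot(W2))    # one linear layer with weight W1 @ W2
print(np.allclose(two_layers, one_layer))   # True: same model class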

activation functions are what introduce non-linearity

TensorFlow provides 7 activation functions, such as tf.nn.relu, tf.nn.sigmoid, and tf.nn.tanh
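As a hedged TF 1.x-style sketch (constants and shapes are my own), wrapping a linear output in one of these activations is what breaks the collapse shown above:

import tensorflow as tf

x = tf.constant([[-1.0, 0.5, 2.0]])
W = tf.constant([[1.0], [2.0], [-1.0]])
b = tf.constant([0.1])
# non-linear activation applied to the linear combination;
# tf.nn.sigmoid or tf.nn.tanh can be substituted for tf.nn.relu
y = tf.nn.relu(tf.matmul(x, W) + b)
with tf.Session() as sess:
    print(sess.run(y))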

multiple layers can solve the exclusive-OR (XOR) problem

key point: extraction of compound (combined) features
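A small illustrative sketch (hand-picked weights of my own, not learned ones) showing how a hidden layer extracts two compound features that make XOR separable:

import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
# hidden layer extracts two compound features: (x1 AND NOT x2) and (x2 AND NOT x1)
W1 = np.array([[1.0, -1.0],
               [-1.0, 1.0]])
h = relu(X.dot(W1))
# output layer simply sums (ORs) the two features
y = h.dot(np.array([[1.0], [1.0]]))
print(y.ravel())   # [0. 1. 1. 0.] -> XOR, impossible for a single linear layer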

loss function
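The note gives no detail here; as one common TF 1.x example (my own sketch, the placeholder names y_ and y and the shape are assumptions), cross-entropy between predicted probabilities and one-hot labels:

import tensorflow as tf

y_ = tf.placeholder(tf.float32, shape=(None, 10))   # one-hot labels
y = tf.placeholder(tf.float32, shape=(None, 10))    # predicted probabilities
# clipping avoids log(0); reduce_mean averages the per-sample cross-entropy over the batch
cross_entropy = -tf.reduce_mean(
    tf.reduce_sum(y_ * tf.log(tf.clip_by_value(y, 1e-10, 1.0)), axis=1))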

optimization algorithms

about the learning rate

# tf.train.exponential_decay
# formula it implements:
decayed_learning_rate = learning_rate * decay_rate ^ (global_step / decay_steps)
# ---------- usage
learning_rate = tf.train.exponential_decay(0.1, global_step, 100, 0.96, staircase=True)
# with staircase=True, global_step / decay_steps is truncated to an integer,
# so the rate is multiplied by 0.96 once every 100 steps and the schedule is stair-shaped
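A hedged sketch of how this decayed rate is typically wired into a training op (loss is assumed to be defined elsewhere):

global_step = tf.Variable(0, trainable=False)   # must exist before the exponential_decay call above
# passing global_step to minimize() makes the optimizer increment it each training step,
# which in turn drives the decay schedule
train_op = tf.train.GradientDescentOptimizer(learning_rate).minimize(
    loss, global_step=global_step)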

over-fitting

definition: the model memorizes the random noise in the training data instead of learning the overall trend
way to avoid it: regularization
regularization adds to the loss a term that measures model complexity in terms of the weights (coefficients)
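A hedged TF 1.x sketch of adding an L2 penalty on the weights to the loss (base_loss and the weight shape are assumptions of mine):

import tensorflow as tf

w = tf.Variable(tf.random_normal([2, 1], stddev=1.0))
# l2_regularizer(0.001) returns a function that computes 0.001 * sum(w^2) / 2
regularizer = tf.contrib.layers.l2_regularizer(0.001)
loss = base_loss + regularizer(w)   # base_loss assumed defined elsewhere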

# tf.train.ExponentialMovingAverage is an object
# constructor parameters:
def __init__(self, decay, num_updates=None, zero_debias=False)
# args: decay controls how the shadow variable, i.e. object.average(variable), is computed;
#       num_updates dynamically adjusts decay and is usually set to global_step
# global_step is an auxiliary variable that is incremented by 1 every training step
# member function apply() creates the shadow variables and returns the op that keeps them updated
# object.apply(self, var_list=None)
# algorithm: decay = min{DECAY, (1.0 + num_updates) / (10.0 + num_updates)}, where DECAY is the fixed value passed in
# object.average() reads the shadow value: shadow_variable = decay * shadow_variable + (1 - decay) * variable
# a decay close to 1 keeps the shadow value close to its past values and slows down its change
# this mechanism does not change the parameters that gradient descent updates; it only adjusts
# the values used in forward propagation (the shadow variables), typically at evaluation time
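A hedged usage sketch for the class described above (variable names are my own):

import tensorflow as tf

global_step = tf.Variable(0, trainable=False)
# DECAY is capped at 0.99; while global_step is small, the min(...) rule keeps the
# effective decay lower so the shadow variables catch up with the real variables faster
ema = tf.train.ExponentialMovingAverage(0.99, num_updates=global_step)
maintain_averages_op = ema.apply(tf.trainable_variables())
# run maintain_averages_op together with the training op;
# ema.average(var) then returns the shadow value, typically used for evaluation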