A Summary of Activation Function Formulas in Deep Learning
2019-11-21
晨光523152
Reading through the TensorFlow 2.0 official API (mainly tf.keras.activations), I summarize the activation functions below. Link: https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/keras/activations
tf.keras.activations.elu
elu: Exponential Linear Unit (ELU)
The elu function is defined in the source as follows:
@keras_export('keras.activations.elu')
def elu(x, alpha=1.0):
"""Exponential linear unit.
Arguments:
x: Input tensor.
alpha: A scalar, slope of negative section.
Returns:
The exponential linear activation: `x` if `x > 0` and
`alpha * (exp(x)-1)` if `x < 0`.
Reference:
- [Fast and Accurate Deep Network Learning by Exponential
Linear Units (ELUs)](https://arxiv.org/abs/1511.07289)
"""
return K.elu(x, alpha)
The elu formula is:
elu(x) = x                        if x > 0
elu(x) = alpha * (exp(x) - 1)     if x <= 0
[Figure: plot of the ELU curve]
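As a quick sanity check, here is a minimal sketch (assuming TensorFlow 2.x with eager execution and NumPy available) that compares the built-in elu against the piecewise formula above; the sample values are chosen arbitrarily for illustration.
import numpy as np
import tensorflow as tf

x = tf.constant([-2.0, -0.5, 0.0, 0.5, 2.0], dtype=tf.float32)
y = tf.keras.activations.elu(x, alpha=1.0)                              # built-in ELU
y_manual = np.where(x.numpy() > 0, x.numpy(), np.exp(x.numpy()) - 1.0)  # piecewise formula, alpha = 1
print(y.numpy())                          # approximately [-0.8647 -0.3935  0.  0.5  2.]
print(np.allclose(y.numpy(), y_manual))   # True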
tf.keras.activations.exponential
exponential: exponential activation function
@keras_export('keras.activations.exponential')
def exponential(x):
"""Exponential activation function.
Arguments:
x: Input tensor.
Returns:
The exponential activation: `exp(x)`.
"""
return math_ops.exp(x)
The exponential formula is: f(x) = exp(x).
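A tiny usage sketch (assuming TensorFlow 2.x, eager mode). Note that exp(x) grows very fast and overflows float32 to inf once the input exceeds roughly 88.
import tensorflow as tf

x = tf.constant([-1.0, 0.0, 1.0], dtype=tf.float32)
print(tf.keras.activations.exponential(x).numpy())   # approximately [0.3679 1. 2.7183]
# exp(x) overflows float32 to inf for large inputs:
print(tf.keras.activations.exponential(tf.constant([100.0])).numpy())  # [inf]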
tf.keras.activations.hard_sigmoid
hard_sigmoid: Hard sigmoid activation function
@keras_export('keras.activations.hard_sigmoid')
def hard_sigmoid(x):
"""Hard sigmoid activation function.
Faster to compute than sigmoid activation.
Arguments:
x: Input tensor.
Returns:
Hard sigmoid activation:
- `0` if `x < -2.5`
- `1` if `x > 2.5`
- `0.2 * x + 0.5` if `-2.5 <= x <= 2.5`.
"""
return K.hard_sigmoid(x)
The hard_sigmoid formula is:
hard_sigmoid(x) = 0                 if x < -2.5
hard_sigmoid(x) = 1                 if x > 2.5
hard_sigmoid(x) = 0.2 * x + 0.5     if -2.5 <= x <= 2.5
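To make the piecewise approximation concrete, here is a small sketch (assuming TensorFlow 2.x) that evaluates hard_sigmoid next to the ordinary sigmoid on the same points.
import tensorflow as tf

x = tf.constant([-3.0, -1.0, 0.0, 1.0, 3.0], dtype=tf.float32)
print(tf.keras.activations.hard_sigmoid(x).numpy())  # [0.  0.3 0.5 0.7 1. ]  (piecewise linear)
print(tf.keras.activations.sigmoid(x).numpy())       # approximately [0.0474 0.2689 0.5 0.7311 0.9526]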
tf.keras.activations.linear
@keras_export('keras.activations.linear')
def linear(x):
"""Linear activation function.
Arguments:
x: Input tensor.
Returns:
The linear activation: `x`.
"""
return x
The linear formula is: f(x) = x (the identity function).
tf.keras.activations.relu
relu: Rectified Linear Unit (ReLU)
@keras_export('keras.activations.relu')
def relu(x, alpha=0., max_value=None, threshold=0):
"""Rectified Linear Unit.
With default values, it returns element-wise `max(x, 0)`.
Otherwise, it follows:
`f(x) = max_value` for `x >= max_value`,
`f(x) = x` for `threshold <= x < max_value`,
`f(x) = alpha * (x - threshold)` otherwise.
Arguments:
x: A tensor or variable.
alpha: A scalar, slope of negative section (default=`0.`).
max_value: float. Saturation threshold.
threshold: float. Threshold value for thresholded activation.
Returns:
A tensor.
"""
return K.relu(x, alpha=alpha, max_value=max_value, threshold=threshold)
The relu formula with default arguments is f(x) = max(x, 0). In the general case:
f(x) = max_value                    for x >= max_value
f(x) = x                            for threshold <= x < max_value
f(x) = alpha * (x - threshold)      otherwise
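The extra arguments cover several relu variants in one function. Below is a minimal sketch (assuming TensorFlow 2.x) illustrating alpha, max_value, and threshold on the same arbitrary input.
import tensorflow as tf

x = tf.constant([-10.0, -1.0, 0.0, 1.0, 10.0], dtype=tf.float32)
print(tf.keras.activations.relu(x).numpy())                 # [ 0.  0.  0.  1. 10.]  default: max(x, 0)
print(tf.keras.activations.relu(x, alpha=0.1).numpy())      # [-1.  -0.1  0.  1.  10.]  leaky slope on the negative side
print(tf.keras.activations.relu(x, max_value=6.0).numpy())  # [0. 0. 0. 1. 6.]  saturates at max_value ("ReLU6"-style)
print(tf.keras.activations.relu(x, threshold=2.0).numpy())  # [0. 0. 0. 0. 10.]  values below the threshold are zeroed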
tf.keras.activations.selu
selu: Scaled Exponential Linear Unit (SELU)
@keras_export('keras.activations.selu')
def selu(x):
"""Scaled Exponential Linear Unit (SELU).
The Scaled Exponential Linear Unit (SELU) activation function is:
`scale * x` if `x > 0` and `scale * alpha * (exp(x) - 1)` if `x < 0`
where `alpha` and `scale` are pre-defined constants
(`alpha = 1.67326324`
and `scale = 1.05070098`).
The SELU activation function multiplies `scale` > 1 with the
`[elu](https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/keras/activations/elu)`
(Exponential Linear Unit (ELU)) to ensure a slope larger than one
for positive net inputs.
The values of `alpha` and `scale` are
chosen so that the mean and variance of the inputs are preserved
between two consecutive layers as long as the weights are initialized
correctly (see [`lecun_normal` initialization]
(https://www.tensorflow.org/api_docs/python/tf/keras/initializers/lecun_normal))
and the number of inputs is "large enough"
(see references for more information).
![](https://cdn-images-1.medium.com/max/1600/1*m0e8lZU_Zrkh4ESfQkY2Pw.png)
(Courtesy: Blog on Towards DataScience at
https://towardsdatascience.com/selu-make-fnns-great-again-snn-8d61526802a9)
Example Usage:
n_classes = 10 #10-class problem
model = models.Sequential()
model.add(Dense(64, kernel_initializer='lecun_normal', activation='selu',
input_shape=(28, 28, 1)))
model.add(Dense(32, kernel_initializer='lecun_normal', activation='selu'))
model.add(Dense(16, kernel_initializer='lecun_normal', activation='selu'))
model.add(Dense(n_classes, activation='softmax'))
Arguments:
x: A tensor or variable to compute the activation function for.
Returns:
The scaled exponential unit activation: `scale * elu(x, alpha)`.
# Note
- To be used together with the initialization "[lecun_normal]
(https://www.tensorflow.org/api_docs/python/tf/keras/initializers/lecun_normal)".
- To be used together with the dropout variant "[AlphaDropout]
(https://www.tensorflow.org/api_docs/python/tf/keras/layers/AlphaDropout)".
References:
[Self-Normalizing Neural Networks (Klambauer et al, 2017)]
(https://arxiv.org/abs/1706.02515)
"""
alpha = 1.6732632423543772848170429916717
scale = 1.0507009873554804934193349852946
return scale * K.elu(x, alpha)
The selu formula is:
selu(x) = scale * x                        if x > 0
selu(x) = scale * alpha * (exp(x) - 1)     if x <= 0
with the fixed constants alpha ≈ 1.67326324 and scale ≈ 1.05070098.
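Matching the source above, selu is simply a scaled elu with fixed constants. A minimal sketch (assuming TensorFlow 2.x and NumPy) verifying that relationship:
import numpy as np
import tensorflow as tf

alpha = 1.6732632423543772
scale = 1.0507009873554805
x = tf.constant([-2.0, -0.5, 0.0, 0.5, 2.0], dtype=tf.float32)
y_selu = tf.keras.activations.selu(x)
y_ref = scale * tf.keras.activations.elu(x, alpha)   # scale * elu(x, alpha), as in the source
print(np.allclose(y_selu.numpy(), y_ref.numpy()))    # True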
tf.keras.activations.sigmoid
sigmoid: Sigmoid activation function
@keras_export('keras.activations.sigmoid')
def sigmoid(x):
"""Sigmoid.
Applies the sigmoid activation function. The sigmoid function is defined as
1 divided by (1 + exp(-x)). Its curve is S-shaped, like a smoothed
version of the Heaviside (unit step) function. For small values
(<-5) the sigmoid returns a value close to zero and for larger values (>5)
the result of the function gets close to 1.
Arguments:
x: Input tensor.
Returns:
The sigmoid activation: `(1.0 / (1.0 + exp(-x)))`.
"""
return nn.sigmoid(x)
The sigmoid formula is: sigmoid(x) = 1 / (1 + exp(-x)).
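A small sketch (assuming TensorFlow 2.x and NumPy) checking the closed form above on a few points:
import numpy as np
import tensorflow as tf

x = tf.constant([-5.0, 0.0, 5.0], dtype=tf.float32)
y = tf.keras.activations.sigmoid(x)
print(y.numpy())                                                  # approximately [0.0067 0.5 0.9933]
print(np.allclose(y.numpy(), 1.0 / (1.0 + np.exp(-x.numpy()))))   # True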
tf.keras.activations.softmax
softmax: Softmax activation function
@keras_export('keras.activations.softmax')
def softmax(x, axis=-1):
"""Softmax converts a real vector to a vector of categorical probabilities.
The elements of the output vector are in range (0, 1) and sum to 1.
Each vector is handled independently. The `axis` argument sets which axis
of the input the function is applied along.
Softmax is often used as the activation for the last
layer of a classification network because the result could be interpreted as
a probability distribution.
The softmax of each vector x is calculated by `exp(x)/tf.reduce_sum(exp(x))`.
The input values are the log-odds of the resulting probabilities.
Arguments:
x : Input tensor.
axis: Integer, axis along which the softmax normalization is applied.
Returns:
Tensor, output of softmax transformation (all values are non-negative
and sum to 1).
Raises:
ValueError: In case `dim(x) == 1`.
"""
ndim = K.ndim(x)
if ndim == 2:
return nn.softmax(x)
elif ndim > 2:
e = math_ops.exp(x - math_ops.reduce_max(x, axis=axis, keepdims=True))
s = math_ops.reduce_sum(e, axis=axis, keepdims=True)
return e / s
else:
raise ValueError('Cannot apply softmax to a tensor that is 1D. '
'Received input: %s' % (x,))
The softmax formula (along the chosen axis) is: softmax(x_i) = exp(x_i) / sum_j exp(x_j).
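A minimal sketch (assuming TensorFlow 2.x) showing that each row of a 2-D input becomes a probability distribution; as the source above shows, a 1-D input raises ValueError in this version.
import tensorflow as tf

logits = tf.constant([[1.0, 2.0, 3.0],
                      [1.0, 1.0, 1.0]], dtype=tf.float32)
probs = tf.keras.activations.softmax(logits, axis=-1)
print(probs.numpy())
# approximately [[0.0900 0.2447 0.6652]
#                [0.3333 0.3333 0.3333]]
print(tf.reduce_sum(probs, axis=-1).numpy())   # each row sums to 1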
tf.keras.activations.softplus
softplus: Softplus activation function
@keras_export('keras.activations.softplus')
def softplus(x):
"""Softplus activation function.
Arguments:
x: Input tensor.
Returns:
The softplus activation: `log(exp(x) + 1)`.
"""
return nn.softplus(x)
The softplus formula is: softplus(x) = log(exp(x) + 1).
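A tiny sketch (assuming TensorFlow 2.x); softplus can be viewed as a smooth version of relu, which the comparison below illustrates.
import tensorflow as tf

x = tf.constant([-2.0, 0.0, 2.0], dtype=tf.float32)
print(tf.keras.activations.softplus(x).numpy())  # approximately [0.1269 0.6931 2.1269]
print(tf.keras.activations.relu(x).numpy())      # [0. 0. 2.]  softplus smooths the kink at 0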
tf.keras.activations.softsign
softsign: Softsign activation function
@keras_export('keras.activations.softsign')
def softsign(x):
"""Softsign activation function.
Arguments:
x: Input tensor.
Returns:
The softsign activation: `x / (abs(x) + 1)`.
"""
return nn.softsign(x)
The softsign formula is: softsign(x) = x / (abs(x) + 1).
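Like tanh, softsign squashes its input into (-1, 1), but it approaches the asymptotes more slowly. A small comparison sketch (assuming TensorFlow 2.x):
import tensorflow as tf

x = tf.constant([-10.0, -1.0, 0.0, 1.0, 10.0], dtype=tf.float32)
print(tf.keras.activations.softsign(x).numpy())  # approximately [-0.9091 -0.5  0.  0.5  0.9091]
print(tf.keras.activations.tanh(x).numpy())      # approximately [-1. -0.7616  0.  0.7616  1.]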
tf.keras.activations.tanh
tanh: Hyperbolic tangent activation function
@keras_export('keras.activations.tanh')
def tanh(x):
"""Hyperbolic tangent activation function.
For example:
>>> a = tf.constant([-3.0,-1.0, 0.0,1.0,3.0], dtype = tf.float32)
>>> b = tf.keras.activations.tanh(a)
>>> b.numpy()
array([-0.9950547, -0.7615942, 0. , 0.7615942, 0.9950547],
dtype=float32)
Arguments:
x: Input tensor.
Returns:
Tensor of same shape and dtype of input `x`, with tanh activation:
`tanh(x) = sinh(x)/cosh(x) = ((exp(x) - exp(-x))/(exp(x) + exp(-x)))`.
"""
return nn.tanh(x)
The tanh formula is: tanh(x) = sinh(x) / cosh(x) = (exp(x) - exp(-x)) / (exp(x) + exp(-x)).
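As a final check, tanh can be written in terms of sigmoid: tanh(x) = 2 * sigmoid(2x) - 1. A minimal sketch (assuming TensorFlow 2.x and NumPy) verifying this numerically:
import numpy as np
import tensorflow as tf

x = tf.constant([-2.0, -0.5, 0.0, 0.5, 2.0], dtype=tf.float32)
lhs = tf.keras.activations.tanh(x)
rhs = 2.0 * tf.keras.activations.sigmoid(2.0 * x) - 1.0   # rescaled, shifted sigmoid
print(np.allclose(lhs.numpy(), rhs.numpy()))               # True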