A Summary of Activation Function Formulas in Deep Learning
2019-11-21
晨光523152
Reading through the TensorFlow 2.0 official API (mainly tf.keras.activations), I summarize the activation functions below. Link: https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/keras/activations
tf.keras.activations.elu
elu: Exponential Linear Unit (ELU)
The elu function is defined in the source as follows:
@keras_export('keras.activations.elu')
def elu(x, alpha=1.0):
"""Exponential linear unit.
Arguments:
x: Input tensor.
alpha: A scalar, slope of negative section.
Returns:
The exponential linear activation: `x` if `x > 0` and
`alpha * (exp(x)-1)` if `x < 0`.
Reference:
- [Fast and Accurate Deep Network Learning by Exponential
Linear Units (ELUs)](https://arxiv.org/abs/1511.07289)
"""
return K.elu(x, alpha)
The elu formula is:
elu(x) = x                        if x > 0
elu(x) = alpha * (exp(x) - 1)     if x <= 0
[Figure: plot of the ELU curve]
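As a quick sanity check, here is a minimal sketch (assuming TensorFlow 2.x with eager execution and NumPy available) that compares the built-in elu against the piecewise formula above; the sample values are chosen arbitrarily for illustration.
import numpy as np
import tensorflow as tf

x = tf.constant([-2.0, -0.5, 0.0, 0.5, 2.0], dtype=tf.float32)
y = tf.keras.activations.elu(x, alpha=1.0)                              # built-in ELU
y_manual = np.where(x.numpy() > 0, x.numpy(), np.exp(x.numpy()) - 1.0)  # piecewise formula, alpha = 1
print(y.numpy())                          # approximately [-0.8647 -0.3935  0.  0.5  2.]
print(np.allclose(y.numpy(), y_manual))   # True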
tf.keras.activations.exponential
exponential: exponential activation function
@keras_export('keras.activations.exponential')
def exponential(x):
"""Exponential activation function.
Arguments:
x: Input tensor.
Returns:
The exponential activation: `exp(x)`.
"""
return math_ops.exp(x)
The exponential formula is: f(x) = exp(x).
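A tiny usage sketch (assuming TensorFlow 2.x, eager mode). Note that exp(x) grows very fast and overflows float32 to inf once the input exceeds roughly 88.
import tensorflow as tf

x = tf.constant([-1.0, 0.0, 1.0], dtype=tf.float32)
print(tf.keras.activations.exponential(x).numpy())   # approximately [0.3679 1. 2.7183]
# exp(x) overflows float32 to inf for large inputs:
print(tf.keras.activations.exponential(tf.constant([100.0])).numpy())  # [inf]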
tf.keras.activations.hard_sigmoid
hard_sigmoid: Hard sigmoid activation function
@keras_export('keras.activations.hard_sigmoid')
def hard_sigmoid(x):
"""Hard sigmoid activation function.
Faster to compute than sigmoid activation.
Arguments:
x: Input tensor.
Returns:
Hard sigmoid activation:
- `0` if `x < -2.5`
- `1` if `x > 2.5`
- `0.2 * x + 0.5` if `-2.5 <= x <= 2.5`.
"""
return K.hard_sigmoid(x)
The hard_sigmoid formula is:
hard_sigmoid(x) = 0                 if x < -2.5
hard_sigmoid(x) = 1                 if x > 2.5
hard_sigmoid(x) = 0.2 * x + 0.5     if -2.5 <= x <= 2.5
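To make the piecewise approximation concrete, here is a small sketch (assuming TensorFlow 2.x) that evaluates hard_sigmoid next to the ordinary sigmoid on the same points.
import tensorflow as tf

x = tf.constant([-3.0, -1.0, 0.0, 1.0, 3.0], dtype=tf.float32)
print(tf.keras.activations.hard_sigmoid(x).numpy())  # [0.  0.3 0.5 0.7 1. ]  (piecewise linear)
print(tf.keras.activations.sigmoid(x).numpy())       # approximately [0.0474 0.2689 0.5 0.7311 0.9526]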
tf.keras.activations.linear
@keras_export('keras.activations.linear')
def linear(x):
"""Linear activation function.
Arguments:
x: Input tensor.
Returns:
The linear activation: `x`.
"""
return x
The linear formula is: f(x) = x (the identity function).
tf.keras.activations.relu
relu: Rectified Linear Unit (ReLU)
@keras_export('keras.activations.relu')
def relu(x, alpha=0., max_value=None, threshold=0):
"""Rectified Linear Unit.
With default values, it returns element-wise `max(x, 0)`.
Otherwise, it follows:
`f(x) = max_value` for `x >= max_value`,
`f(x) = x` for `threshold <= x < max_value`,
`f(x) = alpha * (x - threshold)` otherwise.
Arguments:
x: A tensor or variable.
alpha: A scalar, slope of negative section (default=`0.`).
max_value: float. Saturation threshold.
threshold: float. Threshold value for thresholded activation.
Returns:
A tensor.
"""
return K.relu(x, alpha=alpha, max_value=max_value, threshold=threshold)
The relu formula with default arguments is f(x) = max(x, 0). In the general case:
f(x) = max_value                    for x >= max_value
f(x) = x                            for threshold <= x < max_value
f(x) = alpha * (x - threshold)      otherwise
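The extra arguments cover several relu variants in one function. Below is a minimal sketch (assuming TensorFlow 2.x) illustrating alpha, max_value, and threshold on the same arbitrary input.
import tensorflow as tf

x = tf.constant([-10.0, -1.0, 0.0, 1.0, 10.0], dtype=tf.float32)
print(tf.keras.activations.relu(x).numpy())                 # [ 0.  0.  0.  1. 10.]  default: max(x, 0)
print(tf.keras.activations.relu(x, alpha=0.1).numpy())      # [-1.  -0.1  0.  1.  10.]  leaky slope on the negative side
print(tf.keras.activations.relu(x, max_value=6.0).numpy())  # [0. 0. 0. 1. 6.]  saturates at max_value ("ReLU6"-style)
print(tf.keras.activations.relu(x, threshold=2.0).numpy())  # [0. 0. 0. 0. 10.]  values below the threshold are zeroed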
tf.keras.activations.selu
selu: Scaled Exponential Linear Unit (SELU)
@keras_export('keras.activations.selu')
def selu(x):
"""Scaled Exponential Linear Unit (SELU).
The Scaled Exponential Linear Unit (SELU) activation function is:
`scale * x` if `x > 0` and `scale * alpha * (exp(x) - 1)` if `x < 0`
where `alpha` and `scale` are pre-defined constants
(`alpha = 1.67326324`
and `scale = 1.05070098`).
The SELU activation function multiplies `scale` > 1 with the
`[elu](https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/keras/activations/elu)`
(Exponential Linear Unit (ELU)) to ensure a slope larger than one
for positive net inputs.
The values of `alpha` and `scale` are
chosen so that the mean and variance of the inputs are preserved
between two consecutive layers as long as the weights are initialized
correctly (see [`lecun_normal` initialization]
(https://www.tensorflow.org/api_docs/python/tf/keras/initializers/lecun_normal))
and the number of inputs is "large enough"
(see references for more information).
![](https://cdn-images-1.medium.com/max/1600/1*m0e8lZU_Zrkh4ESfQkY2Pw.png)
(Courtesy: Blog on Towards DataScience at
https://towardsdatascience.com/selu-make-fnns-great-again-snn-8d61526802a9)
Example Usage:
n_classes = 10 #10-class problem
model = models.Sequential()
model.add(Dense(64, kernel_initializer='lecun_normal', activation='selu',
input_shape=(28, 28, 1)))
model.add(Dense(32, kernel_initializer='lecun_normal', activation='selu'))
model.add(Dense(16, kernel_initializer='lecun_normal', activation='selu'))
model.add(Dense(n_classes, activation='softmax'))
Arguments:
x: A tensor or variable to compute the activation function for.
Returns:
The scaled exponential unit activation: `scale * elu(x, alpha)`.
# Note
- To be used together with the initialization "[lecun_normal]
(https://www.tensorflow.org/api_docs/python/tf/keras/initializers/lecun_normal)".
- To be used together with the dropout variant "[AlphaDropout]
(https://www.tensorflow.org/api_docs/python/tf/keras/layers/AlphaDropout)".
References:
[Self-Normalizing Neural Networks (Klambauer et al, 2017)]
(https://arxiv.org/abs/1706.02515)
"""
alpha = 1.6732632423543772848170429916717
scale = 1.0507009873554804934193349852946
return scale * K.elu(x, alpha)
The selu formula is:
selu(x) = scale * x                        if x > 0
selu(x) = scale * alpha * (exp(x) - 1)     if x <= 0
with the fixed constants alpha ≈ 1.67326324 and scale ≈ 1.05070098.
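Matching the source above, selu is simply a scaled elu with fixed constants. A minimal sketch (assuming TensorFlow 2.x and NumPy) verifying that relationship:
import numpy as np
import tensorflow as tf

alpha = 1.6732632423543772
scale = 1.0507009873554805
x = tf.constant([-2.0, -0.5, 0.0, 0.5, 2.0], dtype=tf.float32)
y_selu = tf.keras.activations.selu(x)
y_ref = scale * tf.keras.activations.elu(x, alpha)   # scale * elu(x, alpha), as in the source
print(np.allclose(y_selu.numpy(), y_ref.numpy()))    # True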
tf.keras.activations.sigmoid
sigmoid: Sigmoid activation function
@keras_export('keras.activations.sigmoid')
def sigmoid(x):
"""Sigmoid.
Applies the sigmoid activation function. The sigmoid function is defined as
1 divided by (1 + exp(-x)). Its curve is S-shaped, like a smoothed
version of the Heaviside (unit step) function. For small values
(<-5) the sigmoid returns a value close to zero and for larger values (>5)
the result of the function gets close to 1.
Arguments:
x: Input tensor.
Returns:
The sigmoid activation: `(1.0 / (1.0 + exp(-x)))`.
"""
return nn.sigmoid(x)
The sigmoid formula is: sigmoid(x) = 1 / (1 + exp(-x)).
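A small sketch (assuming TensorFlow 2.x and NumPy) checking the closed form above on a few points:
import numpy as np
import tensorflow as tf

x = tf.constant([-5.0, 0.0, 5.0], dtype=tf.float32)
y = tf.keras.activations.sigmoid(x)
print(y.numpy())                                                  # approximately [0.0067 0.5 0.9933]
print(np.allclose(y.numpy(), 1.0 / (1.0 + np.exp(-x.numpy()))))   # True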
tf.keras.activations.softmax
softmax: Softmax activation function
@keras_export('keras.activations.softmax')
def softmax(x, axis=-1):
"""Softmax converts a real vector to a vector of categorical probabilities.
The elements of the output vector are in range (0, 1) and sum to 1.
Each vector is handled independently. The `axis` argument sets which axis
of the input the function is applied along.
Softmax is often used as the activation for the last
layer of a classification network because the result could be interpreted as
a probability distribution.
The softmax of each vector x is calculated by `exp(x)/tf.reduce_sum(exp(x))`.
The input values are the log-odds of the resulting probabilities.
Arguments:
x : Input tensor.
axis: Integer, axis along which the softmax normalization is applied.
Returns:
Tensor, output of softmax transformation (all values are non-negative
and sum to 1).
Raises:
ValueError: In case `dim(x) == 1`.
"""
ndim = K.ndim(x)
if ndim == 2:
return nn.softmax(x)
elif ndim > 2:
e = math_ops.exp(x - math_ops.reduce_max(x, axis=axis, keepdims=True))
s = math_ops.reduce_sum(e, axis=axis, keepdims=True)
return e / s
else:
raise ValueError('Cannot apply softmax to a tensor that is 1D. '
'Received input: %s' % (x,))
The softmax formula (along the chosen axis) is: softmax(x_i) = exp(x_i) / sum_j exp(x_j).
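A minimal sketch (assuming TensorFlow 2.x) showing that each row of a 2-D input becomes a probability distribution; as the source above shows, a 1-D input raises ValueError in this version.
import tensorflow as tf

logits = tf.constant([[1.0, 2.0, 3.0],
                      [1.0, 1.0, 1.0]], dtype=tf.float32)
probs = tf.keras.activations.softmax(logits, axis=-1)
print(probs.numpy())
# approximately [[0.0900 0.2447 0.6652]
#                [0.3333 0.3333 0.3333]]
print(tf.reduce_sum(probs, axis=-1).numpy())   # each row sums to 1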
tf.keras.activations.softplus
softplus: Softplus activation function
@keras_export('keras.activations.softplus')
def softplus(x):
"""Softplus activation function.
Arguments:
x: Input tensor.
Returns:
The softplus activation: `log(exp(x) + 1)`.
"""
return nn.softplus(x)
The softplus formula is: softplus(x) = log(exp(x) + 1).
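A tiny sketch (assuming TensorFlow 2.x); softplus can be viewed as a smooth version of relu, which the comparison below illustrates.
import tensorflow as tf

x = tf.constant([-2.0, 0.0, 2.0], dtype=tf.float32)
print(tf.keras.activations.softplus(x).numpy())  # approximately [0.1269 0.6931 2.1269]
print(tf.keras.activations.relu(x).numpy())      # [0. 0. 2.]  softplus smooths the kink at 0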
tf.keras.activations.softsign
softsign: Softsign activation function
@keras_export('keras.activations.softsign')
def softsign(x):
"""Softsign activation function.
Arguments:
x: Input tensor.
Returns:
The softsign activation: `x / (abs(x) + 1)`.
"""
return nn.softsign(x)
The softsign formula is: softsign(x) = x / (abs(x) + 1).
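Like tanh, softsign squashes its input into (-1, 1), but it approaches the asymptotes more slowly. A small comparison sketch (assuming TensorFlow 2.x):
import tensorflow as tf

x = tf.constant([-10.0, -1.0, 0.0, 1.0, 10.0], dtype=tf.float32)
print(tf.keras.activations.softsign(x).numpy())  # approximately [-0.9091 -0.5  0.  0.5  0.9091]
print(tf.keras.activations.tanh(x).numpy())      # approximately [-1. -0.7616  0.  0.7616  1.]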
tf.keras.activations.tanh
tanh: Hyperbolic tangent activation function
@keras_export('keras.activations.tanh')
def tanh(x):
"""Hyperbolic tangent activation function.
For example:
>>> a = tf.constant([-3.0,-1.0, 0.0,1.0,3.0], dtype = tf.float32)
>>> b = tf.keras.activations.tanh(a)
>>> b.numpy()
array([-0.9950547, -0.7615942, 0. , 0.7615942, 0.9950547],
dtype=float32)
Arguments:
x: Input tensor.
Returns:
Tensor of same shape and dtype of input `x`, with tanh activation:
`tanh(x) = sinh(x)/cosh(x) = ((exp(x) - exp(-x))/(exp(x) + exp(-x)))`.
"""
return nn.tanh(x)
The tanh formula is: tanh(x) = sinh(x) / cosh(x) = (exp(x) - exp(-x)) / (exp(x) + exp(-x)).
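As a final check, tanh can be written in terms of sigmoid: tanh(x) = 2 * sigmoid(2x) - 1. A minimal sketch (assuming TensorFlow 2.x and NumPy) verifying this numerically:
import numpy as np
import tensorflow as tf

x = tf.constant([-2.0, -0.5, 0.0, 0.5, 2.0], dtype=tf.float32)
lhs = tf.keras.activations.tanh(x)
rhs = 2.0 * tf.keras.activations.sigmoid(2.0 * x) - 1.0   # rescaled, shifted sigmoid
print(np.allclose(lhs.numpy(), rhs.numpy()))               # True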