Summary of Common Activation Functions for BNNs

1. Common Activation Functions

1.1 Threshold Activation Function

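The output is 1 when the activation value is greater than 0, and 0 otherwise.
The function is discontinuous at x = 0 and its derivative is 0 everywhere else, so it cannot be used for gradient-descent training.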

import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt


# Custom threshold activation function: 1 where x > 0, otherwise 0
def threshold(x):
    x = tf.convert_to_tensor(x)
    return tf.where(x > 0, tf.ones_like(x), tf.zeros_like(x))


# plot
x = np.linspace(-1, 1, 50)
# TensorFlow 2.x executes eagerly, so no Session is needed
y = threshold(x).numpy()
plt.xlabel('X')
plt.ylabel('Y')
plt.title('Threshold Activation Function')
plt.plot(x, y)
plt.show()

Graph
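
To make the point about gradient descent concrete, here is a minimal sketch in plain Python (no TensorFlow) that estimates the slope of the threshold function with a central finite difference. The helper name `numerical_derivative` and the step size `eps` are assumptions chosen for illustration, not part of the original code.

```python
def threshold(x):
    """Return 1.0 when x > 0, otherwise 0.0."""
    return 1.0 if x > 0 else 0.0


def numerical_derivative(f, x, eps=1e-6):
    """Central finite-difference estimate of f'(x) (illustrative helper)."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)


# Away from 0 the slope is exactly zero, so a gradient step never moves a weight.
slopes = [numerical_derivative(threshold, x) for x in (-2.0, -0.5, 0.5, 2.0)]

# At 0 the estimate is huge and grows as eps shrinks, reflecting the jump.
slope_at_zero = numerical_derivative(threshold, 0.0)

print(slopes)
print(slope_at_zero)
```

The zero slope everywhere except at the jump is why a gradient-based optimizer receives no useful signal from this function.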

1.2 Sigmoid

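TensorFlow provides a built-in implementation, **tf.sigmoid()**.
Output range: (0, 1)
The derivative approaches 0 at both ends of the curve, which causes the vanishing-gradient problem and makes training and optimization increasingly difficult.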

import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

# plot
x = np.linspace(-10, 10, 50)
# TensorFlow 2.x executes eagerly, so no Session is needed
y = tf.sigmoid(x).numpy()
plt.xlabel('X')
plt.ylabel('Y')
plt.title('Sigmoid Activation Function')
plt.plot(x, y)
plt.show()

Graph
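
The vanishing-gradient behaviour can be checked numerically. The sketch below, in plain Python, uses the standard identity sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x)); the function names `sigmoid` and `sigmoid_grad` are hypothetical helpers written for this illustration and are not TensorFlow APIs.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Derivative of the sigmoid, using s * (1 - s)."""
    s = sigmoid(x)
    return s * (1.0 - s)


# The gradient peaks at x = 0 and shrinks quickly as |x| grows.
for x in (0.0, 2.0, 5.0, 10.0):
    print(f"x={x:5.1f}  sigmoid={sigmoid(x):.6f}  grad={sigmoid_grad(x):.6f}")
```

Because the gradient is at most 0.25, multiplying it across many layers during backpropagation shrinks the signal further, which is one way to see why deep sigmoid networks are hard to train.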

1.3 ReLU

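The output is 0 for negative inputs and equal to the input for positive inputs.
Zeroing out negative inputs (sparse activation) greatly reduces the amount of computation.
For positive inputs the gradient is constant, which avoids vanishing gradients; ReLU is the first-choice activation function for most network models.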

import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

# plot
x = np.linspace(-10, 10, 50)
# TensorFlow 2.x executes eagerly, so no Session is needed
y = tf.nn.relu(x).numpy()
plt.xlabel('X')
plt.ylabel('Y')
plt.title('ReLU Activation Function')
plt.plot(x, y)
plt.show()

Graph
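
Both properties described above, sparse outputs and a non-shrinking gradient for positive inputs, can be seen in a small plain-Python sketch. The helper names and the sample inputs are assumptions for illustration; the gradient at exactly x = 0 is set to 0 here by convention, and frameworks may choose differently.

```python
def relu(x):
    """ReLU: 0 for negative inputs, identity for positive inputs."""
    return max(0.0, x)


def relu_grad(x):
    """Gradient of ReLU; the value at x == 0 is a convention (assumed 0 here)."""
    return 1.0 if x > 0 else 0.0


inputs = [-3.0, -1.0, 0.0, 0.5, 2.0, 10.0]  # illustrative values
outputs = [relu(x) for x in inputs]
grads = [relu_grad(x) for x in inputs]

# Fraction of outputs that are exactly zero (the "sparse activation").
sparsity = sum(1 for y in outputs if y == 0.0) / len(outputs)

print(outputs)
print(grads)
print(sparsity)
```

Unlike the sigmoid, the gradient for a positive input stays at 1 however large the input becomes.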

1.4 Softmax

Output range: [0, 1]
All output values sum to 1, so this activation function is widely used in the output layer of classification models.

import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

# plot
x = np.linspace(-10, 10, 50)
# softmax treats the 50 values as one vector and normalizes over it;
# TensorFlow 2.x executes eagerly, so no Session is needed
y = tf.nn.softmax(x).numpy()
plt.xlabel('X')
plt.ylabel('Y')
plt.title('Softmax Activation Function')
plt.plot(x, y)
plt.show()

Graph
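
As a small illustration of why softmax suits a classification output layer, the sketch below implements it in plain Python and turns three made-up logits into class probabilities. Subtracting the maximum before exponentiating is a common numerical-stability trick, not something the original code does explicitly; the `softmax` helper and the logits are assumptions for this example.

```python
import math


def softmax(logits):
    """Softmax over a list of floats, shifted by the max for stability."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]  # hypothetical scores for three classes
probs = softmax(logits)

# The class with the largest probability is the prediction.
predicted_class = probs.index(max(probs))

print(probs)
print(sum(probs))
print(predicted_class)
```

The probabilities sum to 1 (up to floating-point rounding), and the largest logit always maps to the largest probability.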

2. Notes

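Because of its shape, ReLU suppresses one side of its input (negative values) while avoiding vanishing gradients, so it performs well in multi-layer network models. For this reason, ReLU remains the most commonly used activation function in deep networks.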