The softmax function is differentiable, yet if you compute its gradient with tf.gradients(), you will get 0. In this tutorial, we explain the reason for TensorFlow beginners.

Look at the example code below:

    import tensorflow as tf
    import numpy as np

    z = tf.Variable(np.array([[1, 2], [3, 2]]), dtype=tf.float32)
    y = tf.nn.softmax(z, axis=1)
    r = tf.gradients(y, z)

    init = tf.global_variables_initializer()
    init_local = tf.local_variables_initializer()
    with tf.Session() as sess:
        sess.run([init, init_local])
        print(sess.run([y]))
        print(sess.run([r]))

Run this Python code and you will get a result like:

    [array([[0.26894143, 0.7310586 ],
           [0.7310586 , 0.26894143]], dtype=float32)]
    [[array([[0., 0.],
           [0., 0.]], dtype=float32)]]

The value of tf.gradients() is 0.

**Why is the value of tf.gradients() 0?**

To understand the reason, you should understand these two topics:

How to compute the gradient of the softmax function.

How tf.gradients() computes its return value in TensorFlow.
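The second topic can be illustrated without TensorFlow. tf.gradients(ys, xs) returns the derivative of the sum of ys with respect to each x, not a full Jacobian. The snippet below is a minimal sketch of that summing behaviour using a finite-difference estimate; the function names `f` and `grad_of_summed_outputs` are made up for illustration and are not TensorFlow APIs.

```python
def grad_of_summed_outputs(f, x, eps=1e-6):
    """Central-difference estimate of d(sum(f(x)))/dx for a scalar x."""
    return (sum(f(x + eps)) - sum(f(x - eps))) / (2 * eps)


def f(x):
    # Two outputs that both depend on the same input x.
    return [x * x, 3.0 * x]


# The contributions of both outputs are added together:
# d(x^2)/dx + d(3x)/dx = 2x + 3.
g = grad_of_summed_outputs(f, 2.0)
print(g)
```

With several outputs, the individual derivatives are summed into one number per input, so contributions with opposite signs can cancel.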

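In the code above, softmax is applied to each row of z, so the output can be expressed as:

$$y_{ij} = \frac{e^{z_{ij}}}{\sum_{k} e^{z_{ik}}}$$

tf.gradients(y, z) does not return the full Jacobian. For each element of z, it sums the partial derivatives of every element of y with respect to that element. Because z_{00} only affects the first row of y, the gradient for z_{00} is computed by tf.gradients() as:

$$\frac{\partial y_{00}}{\partial z_{00}} + \frac{\partial y_{01}}{\partial z_{00}} = y_{00}(1 - y_{00}) - y_{01} y_{00} = y_{00}(1 - y_{00} - y_{01}) = 0$$

The last step holds because y_{00} + y_{01} = 1. The same cancellation happens for every other element of z, so the value of tf.gradients() on tf.nn.softmax() is 0.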
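This conclusion can be checked numerically without TensorFlow. The following is a minimal sketch in plain Python, assuming the standard softmax derivative d y_i / d z_j = y_i (delta_ij - y_j). The helper names are illustrative and not part of any library.

```python
import math


def softmax(row):
    """Numerically stable softmax of a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_jacobian(row):
    """J[i][j] = d y_i / d z_j = y_i * (delta_ij - y_j)."""
    y = softmax(row)
    n = len(y)
    return [[y[i] * ((1.0 if i == j else 0.0) - y[j]) for j in range(n)]
            for i in range(n)]


def summed_gradient(row):
    """For each z_j, add d y_i / d z_j over all outputs i."""
    jac = softmax_jacobian(row)
    n = len(row)
    return [sum(jac[i][j] for i in range(n)) for j in range(n)]


z = [[1.0, 2.0], [3.0, 2.0]]
for row in z:
    print("y =", softmax(row))
    print("individual derivatives =", softmax_jacobian(row))
    print("summed over outputs =", summed_gradient(row))
```

The individual derivatives are non-zero, but in each column they cancel when added, up to floating-point rounding. To obtain a non-zero gradient, differentiate a single output element or a scalar loss built from y instead of the whole tensor y.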