# Compute SVD Gradient in TensorFlow After Replacing tf.svd() with numpy.linalg.svd() – TensorFlow Tutorial

January 7, 2020

Computing the SVD gradient is easy when we use tf.svd() to compute the singular value decomposition of a tensor. However, we often have to replace tf.svd() with numpy.linalg.svd(), for two main reasons:

• TensorFlow tf.svd() runs very slowly
• TensorFlow tf.svd() may return NaN values

Read More: Solve tf.svd NaN bug with np.linalg.svd

We can use tf.py_func() to replace tf.svd() with numpy.linalg.svd(). However, the gradient of the SVD is then None, because tf.py_func() does not define a gradient for the Python function it wraps.

Here is an example:

    u, s, v = tf.py_func(np.linalg.svd, [tensor, full_matrices, compute_uv], [dtype, dtype, dtype])
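For orientation, the following NumPy-only sketch shows what np.linalg.svd returns for the arguments that are passed through tf.py_func() above. The matrix `a` is an arbitrary illustrative value, not taken from the tutorial. Note that NumPy returns the third factor already transposed (`vh`), whereas tf.svd() returns `s, u, v` with `v` not transposed.

```python
import numpy as np

# Arbitrary 2x3 example matrix (an assumption for illustration).
a = np.array([[3.0, 0.0, 0.0],
              [0.0, 2.0, 0.0]], dtype=np.float32)

# Reduced SVD: u is 2x2, s has 2 entries, vh is 2x3.
u, s, vh = np.linalg.svd(a, full_matrices=False, compute_uv=True)

# NumPy returns V already transposed, so a is recovered as u @ diag(s) @ vh.
reconstructed = u @ np.diag(s) @ vh

# With compute_uv=False only the singular values are returned.
s_only = np.linalg.svd(a, compute_uv=False)

print(u.shape, s.shape, vh.shape)
print(np.allclose(reconstructed, a, atol=1e-5))
print(s_only)
```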

So how can we compute the gradient of the SVD after replacing tf.svd() with numpy.linalg.svd()?

There are three difficulties:

• numpy.linalg.svd() is called through tf.py_func(), which converts the tensor to a numpy.ndarray. The gradient of the SVD therefore has to be computed in NumPy as well.

To understand tf.py_func(), you should read:

TensorFlow tf.py_func(): Run Python Function in TensorFlow Graph

• The formula for the gradient of the SVD is very complex.
• Even if the gradient has been computed in NumPy, how do we pass it back into the TensorFlow graph at run time?
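To give a feel for the second difficulty, here is a minimal NumPy sketch of the simplest case only: the gradient of a single singular value. For a matrix with distinct singular values, the derivative of the k-th singular value with respect to the matrix is the outer product of the k-th left and right singular vectors. The matrix `a`, the helper name `singular_value_grad`, and the finite-difference check are illustrative assumptions; gradients that flow through `u` and `v` are considerably more involved and are not shown.

```python
import numpy as np

def singular_value_grad(a, k):
    """Gradient of the k-th singular value of `a` with respect to `a`.

    Valid when the singular values are distinct: d s_k / d a = u_k v_k^T.
    """
    u, s, vh = np.linalg.svd(a, full_matrices=False)
    return np.outer(u[:, k], vh[k, :])

# Arbitrary example matrix with distinct singular values (illustrative).
a = np.array([[2.0, 0.5, 1.0],
              [0.3, 1.5, 0.2]])

analytic = singular_value_grad(a, 0)

# Central finite differences as an independent check.
eps = 1e-6
numeric = np.zeros_like(a)
for i in range(a.shape[0]):
    for j in range(a.shape[1]):
        step = np.zeros_like(a)
        step[i, j] = eps
        s_plus = np.linalg.svd(a + step, compute_uv=False)[0]
        s_minus = np.linalg.svd(a - step, compute_uv=False)[0]
        numeric[i, j] = (s_plus - s_minus) / (2 * eps)

print(np.allclose(analytic, numeric, atol=1e-5))
```

Writing this by hand for every output of the SVD is error-prone, which is why an automatic differentiation tool is attractive here.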

To solve these problems, we can use the Python autograd package:

    pip install autograd

autograd can compute the gradient of a NumPy function automatically.

## Use np_svd_in_tf() to compute the SVD

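    import numpy as np
    import tensorflow as tf
    from tensorflow.python.framework import function

    def np_svd_in_tf(w, name='np_svd_in_tf'):
        with tf.name_scope(name):
            def computeSVD(w):
                # Return only the singular values, computed in NumPy.
                S = np.linalg.svd(w, compute_uv=False)
                return S

            # Wrap the py_func call in a TensorFlow function.
            @function.Defun()
            def np_replaced_tf_svd(w):
                return tf.py_func(computeSVD, [w], tf.float32)

            return np_replaced_tf_svd(w)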

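np_svd_in_tf() computes the singular values with NumPy inside the TensorFlow graph. Because tf.py_func() does not provide a gradient on its own, tf.gradients() can only flow through this op when a gradient function, for example one derived with autograd, is attached to it.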

We will use some examples to test it.

## Create some tensors

    np_w = np.array([[[2,2,3,4,5],[6,7,2,9,0],[1,2,2,4,5],[6,2,8,9,0],[1,2,3,4,5]]], dtype = np.float32)
    w1 = tf.convert_to_tensor(np_w)

    w1 = tf.Variable(np.array([[2, 3, 5, 1, 3],[2, 3, 5, 1, 3]]), dtype = tf.float32)
    w2 = tf.Variable(np.array([[2, 2, 5],[2, 3, 5],[2, 3, 5], [2, 3, 5], [2, 3, 5]]), dtype = tf.float32)
    w3 = tf.matmul(w1, w2)
    w4 = tf.nn.softmax(w3, axis = 1)
    w4 = tf.reshape(w4,[-1, 2, 3])

Use tf.svd():

    s = tf.svd(w4, compute_uv = False)
    tf_svd_grad = tf.gradients(s, w2)

Use np_svd_in_tf():

    s_in_np = np_svd_in_tf(w4)
    svd_grad = tf.gradients(s_in_np, w2)

## Test result

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        print("tf.svd() s:\n")
        print(sess.run(s))
        print("np_replaced_tf_svd s:\n")
        print(sess.run(s_in_np))

        # Gradient of the NumPy-based singular values with respect to w2.
        print(sess.run(svd_grad)[0])

The result is:

    tf.svd() s:

    [[1.4142135 0.       ]]
    np_replaced_tf_svd s:

    [[1.4142135 0.       ]]

    [[-1.6262104e-18 -2.6467354e-13  0.0000000e+00]
     [-2.4393155e-18 -3.9701031e-13  0.0000000e+00]
     [-4.0655261e-18 -6.6168382e-13  0.0000000e+00]
     [-8.1310520e-19 -1.3233677e-13  0.0000000e+00]
     [-2.4393155e-18 -3.9701031e-13  0.0000000e+00]]
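These singular values can be cross-checked outside TensorFlow. The NumPy-only sketch below rebuilds `w4` from the values of `w1` and `w2` defined earlier; the `softmax` helper is written by hand for illustration and is an assumption, not the TensorFlow implementation. Because both rows of `w1` are identical and the softmax is saturated, both rows of `w4` are almost exactly [0, 0, 1], which gives singular values close to sqrt(2) and 0 and is consistent with the gradient with respect to `w2` being nearly zero.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax (hand-written helper for illustration).
    shifted = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / np.sum(e, axis=axis, keepdims=True)

w1 = np.array([[2, 3, 5, 1, 3], [2, 3, 5, 1, 3]], dtype=np.float32)
w2 = np.array([[2, 2, 5], [2, 3, 5], [2, 3, 5], [2, 3, 5], [2, 3, 5]],
              dtype=np.float32)

w3 = w1 @ w2                                  # shape (2, 3), both rows identical
w4 = softmax(w3, axis=1).reshape(-1, 2, 3)    # shape (1, 2, 3)

# np.linalg.svd treats the leading dimension as a batch dimension.
s = np.linalg.svd(w4, compute_uv=False)
print(s)
```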