Understand LSTM Kernel in TensorFlow for Beginners – TensorFlow Tutorial

July 19, 2020

Each LSTM contains only one kernel, and you can get the name of this kernel. See:

Get LSTM Cell Weights and Regularize LSTM in TensorFlow

Here are some important tips about the LSTM kernel:

Tip 1: All weights of an LSTM are in the kernel

Wxi, Whi, Wxf, Whf, Wxo, Who, Wxc and Whc are all contained in the LSTM kernel.
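As a rough illustration of Tip 1, here is a minimal NumPy sketch that views the eight matrices as slices of one kernel. It assumes the layout that BasicLSTMCell appears to use in the r1.8 source: the input rows come first, then the hidden-state rows, and the four column blocks are in the order input gate, candidate, forget gate, output gate. The small sizes and the `slices` dictionary are only for illustration and are not part of TensorFlow.

```python
import numpy as np

input_depth, num_units = 3, 2   # small illustrative sizes (assumption)
h_depth = num_units

# Stand-in for the single kernel variable,
# shape [input_depth + h_depth, 4 * num_units].
kernel = np.arange((input_depth + h_depth) * 4 * num_units, dtype=np.float32)
kernel = kernel.reshape(input_depth + h_depth, 4 * num_units)

# Rows: the first input_depth rows multiply x, the remaining rows multiply h.
w_x, w_h = kernel[:input_depth], kernel[input_depth:]

# Columns: four equal blocks. The order i, c (candidate), f, o is assumed.
names = ["i", "c", "f", "o"]
slices = {}
for name, wx_block, wh_block in zip(names,
                                    np.split(w_x, 4, axis=1),
                                    np.split(w_h, 4, axis=1)):
    slices["Wx" + name] = wx_block   # shape (input_depth, num_units)
    slices["Wh" + name] = wh_block   # shape (h_depth, num_units)

for key in sorted(slices):
    print(key, slices[key].shape)
```

Each Wx* slice has shape (input_depth, num_units) and each Wh* slice has shape (h_depth, num_units), so no weight lives outside the kernel.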

Tip 2: An LSTM layer has only one kernel; each time step does not have its own kernel.

When LSTMs are used on time series problems, you may see model structures drawn as a chain of LSTM cells, one per time step.

If such a picture shows three LSTM cells, does it mean there are three LSTM kernels?

The answer is no. There is only one kernel: the same LSTM cell, with the same kernel, is applied at every time step.
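To make Tip 2 concrete, here is a simplified NumPy sketch of an LSTM unrolled over three time steps. The helper `lstm_step`, the zero `bias`, the random data and the gate order i, j, f, o are assumptions made for this example; the sketch also leaves out details such as the forget bias, so it is not the TensorFlow implementation itself. Notice that `kernel` is created once and reused in every iteration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, kernel, bias):
    """One simplified LSTM step (hypothetical helper, gate order assumed)."""
    gates = np.concatenate([x, h], axis=1) @ kernel + bias
    i, j, f, o = np.split(gates, 4, axis=1)
    new_c = sigmoid(f) * c + sigmoid(i) * np.tanh(j)
    new_h = sigmoid(o) * np.tanh(new_c)
    return new_h, new_c

rng = np.random.default_rng(0)
batch, input_depth, num_units, time_steps = 2, 5, 4, 3

# One kernel for the whole layer, created a single time.
kernel = rng.normal(size=(input_depth + num_units, 4 * num_units))
bias = np.zeros(4 * num_units)

inputs = rng.normal(size=(time_steps, batch, input_depth))
h = np.zeros((batch, num_units))
c = np.zeros((batch, num_units))

# Three time steps, but the same kernel is used in each one.
for t in range(time_steps):
    h, c = lstm_step(inputs[t], h, c, kernel, bias)

print(kernel.shape, h.shape)
```

The number of time steps changes how many times the loop runs, not how many kernels exist.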

Tip 3: The shape of the LSTM kernel is [input_depth + h_depth, 4 * self._num_units]

The kernel is built by:

    self._kernel = self.add_variable(
        _WEIGHTS_VARIABLE_NAME,
        shape=[input_depth + h_depth, 4 * self._num_units])

You can find this source code here:

https://github.com/tensorflow/tensorflow/blob/r1.8/tensorflow/python/ops/rnn_cell_impl.py

input_depth and h_depth are defined as:

    input_depth = inputs_shape[1].value
    h_depth = self._num_units

For example:

If the inputs have shape 64 * 200 (a batch of 64 vectors), then input_depth = 200.

If num_units = 100, then h_depth = 100.

The kernel of the LSTM is then (300, 400).

Written separately, there would be 4 input matrices of shape (200, 100) and 4 hidden-state matrices of shape (100, 100). TensorFlow stores them together as a single matrix of shape (300, 400).
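The arithmetic in this example can be checked with a few lines of plain Python. The function `lstm_kernel_shape` is a hypothetical helper written for this tutorial, not a TensorFlow API; it only repeats the shape formula from Tip 3.

```python
def lstm_kernel_shape(input_depth, num_units):
    """Return the kernel shape from Tip 3 (hypothetical helper)."""
    h_depth = num_units
    return (input_depth + h_depth, 4 * num_units)

input_depth, num_units = 200, 100
rows, cols = lstm_kernel_shape(input_depth, num_units)

# Weights counted as separate matrices: 4 input matrices and 4 hidden matrices.
separate = 4 * (input_depth * num_units) + 4 * (num_units * num_units)

print((rows, cols))            # (300, 400)
print(rows * cols, separate)   # 120000 120000
```

Both ways of counting give the same number of weights, which is why a single (300, 400) matrix is enough.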