As to LSTM, it has three gates:

i_{t} = sigmoid(W_{xi} x_{t} + W_{hi} h_{t-1} + b_{i})

f_{t} = sigmoid(W_{xf} x_{t} + W_{hf} h_{t-1} + b_{f})

o_{t} = sigmoid(W_{xo} x_{t} + W_{ho} h_{t-1} + b_{o})

and an input transform:

c~_{t} = tanh(W_{xc} x_{t} + W_{hc} h_{t-1} + b_{c})

To regularize the LSTM, we should get the gate weights in each LSTM cell: **W_{xi}, W_{hi}, W_{xf}, W_{hf}, W_{xo}, W_{ho}, W_{xc}** and **W_{hc}**.

## How to Get These Weights?

## Step 1: Get all variables in LSTM


First, we use an LSTM in our model like this:

```python
with tf.name_scope('doc_word_encode'):
    outputs, state = tf.nn.bidirectional_dynamic_rnn(
        cell_fw=tf.nn.rnn_cell.LSTMCell(self.hidden_size, forget_bias=1.0),  # self.hidden_size = 100
        cell_bw=tf.nn.rnn_cell.LSTMCell(self.hidden_size, forget_bias=1.0),
        inputs=inputs,
        sequence_length=word_in_sen_len,
        dtype=tf.float32,
        scope='doc_word'
    )
    outputs = tf.concat(outputs, 2)  # [-1, 200]
```

We can find this LSTMCell in **doc_word_encode**; its variables are created under the **doc_word** scope.

We can use the code below to list all trainable variables.

```python
# -*- coding: utf-8 -*-
import numpy as np
import tensorflow as tf

np.set_printoptions(threshold=np.inf)

model_dataset = 'imdb/1557460934'
checkpoint_file = "../checkpoints/" + model_dataset + "/model-4100"

init = tf.global_variables_initializer()
init_local = tf.local_variables_initializer()

with tf.Session() as sess:
    sess.run([init, init_local])
    # load graph from the checkpoint meta file
    saver = tf.train.import_meta_graph("{}.meta".format(checkpoint_file))
    saver.restore(sess, checkpoint_file)
    v = [n.name for n in tf.trainable_variables()]
    for vv in v:
        print(vv)
```

For the forward LSTM in the BiLSTM, the output includes `doc_word/fw/lstm_cell/kernel:0`.

From the result, we find there is only one **kernel** weight in the LSTM, not eight separate gate weights, which means a single kernel represents all eight gate weight matrices.
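Since one kernel packs all eight gate weight matrices, they can be recovered by slicing. Below is a minimal NumPy sketch: the random `kernel` is only a stand-in for the real `doc_word/fw/lstm_cell/kernel:0` value, and it assumes TensorFlow's column order i, j, f, o, where j is the new-input (cell candidate) transform.

```python
import numpy as np

# Hypothetical kernel with the same shape as doc_word/fw/lstm_cell/kernel:0
input_depth, num_units = 200, 100
kernel = np.random.randn(input_depth + num_units, 4 * num_units)

# Rows 0..199 multiply x_t, rows 200..299 multiply h_{t-1}
W_x, W_h = kernel[:input_depth, :], kernel[input_depth:, :]

# Columns are ordered i, j, f, o; j corresponds to W_{xc} / W_{hc}
W_xi, W_xc, W_xf, W_xo = np.split(W_x, 4, axis=1)
W_hi, W_hc, W_hf, W_ho = np.split(W_h, 4, axis=1)

print(W_xi.shape, W_hi.shape)  # (200, 100) (100, 100)
```

Each of the eight slices then has the per-gate shape derived in Step 3 below.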

## Step 2: Print the shape of the LSTM kernel.


We can also use the code below:

```python
v = [n for n in tf.trainable_variables()]
for vv in v:
    if 'doc_word/fw/lstm_cell/kernel:0' in vv.name:
        print(vv)
```

We get this kernel as:

```
<tf.Variable 'doc_word/fw/lstm_cell/kernel:0' shape=(300, 400) dtype=float32_ref>
```

## Step 3: Why is the shape of the LSTM kernel (300, 400)?

For the forward LSTM in our BiLSTM, the dimension of **x_{t}** is 200 and **self.hidden_size = 100**.

It means:

**W_{xi}** is 200 × 100, **W_{hi}** is 100 × 100, **W_{xf}** is 200 × 100, **W_{hf}** is 100 × 100, **W_{xo}** is 200 × 100, **W_{ho}** is 100 × 100, **W_{xc}** is 200 × 100, and **W_{hc}** is 100 × 100.

Check the TensorFlow source code.

We will find:

## 1. All weights are named kernel.

```python
_BIAS_VARIABLE_NAME = "bias"
_WEIGHTS_VARIABLE_NAME = "kernel"
```

## 2. The kernel in BasicLSTMCell (a LayerRNNCell) is built with shape [input_depth + h_depth, 4 * self._num_units].

```python
self._kernel = self.add_variable(
    _WEIGHTS_VARIABLE_NAME,
    shape=[input_depth + h_depth, 4 * self._num_units])
self._bias = self.add_variable(
    _BIAS_VARIABLE_NAME,
    shape=[4 * self._num_units],
    initializer=init_ops.zeros_initializer(dtype=self.dtype))
```

In our model, **input_depth = 200** and **self._num_units = 100**, so the shape of the kernel is (200 + 100, 4 × 100) = (300, 400).
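The shape arithmetic can be checked directly with the values from our model:

```python
# Shape arithmetic for the LSTM kernel in our model
input_depth = 200     # dimension of x_t
h_depth = 100         # hidden state size, self._num_units

kernel_rows = input_depth + h_depth   # 300: rows for x_t plus rows for h_{t-1}
kernel_cols = 4 * h_depth             # 400: i, j, f, o stacked side by side
print((kernel_rows, kernel_cols))     # (300, 400)
```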

How to get *i, j, f, o*?

```python
# i = input_gate, j = new_input, f = forget_gate, o = output_gate
i, j, f, o = array_ops.split(
    value=gate_inputs, num_or_size_splits=4, axis=one)
```
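The same split can be illustrated with NumPy; here the random `gate_inputs` is only a stand-in for the real (batch, 4 × num_units) pre-activation tensor:

```python
import numpy as np

# Stand-in for the (batch, 4 * num_units) pre-activation tensor
batch, num_units = 2, 100
gate_inputs = np.random.randn(batch, 4 * num_units)

# Equivalent of array_ops.split(..., num_or_size_splits=4, axis=1)
i, j, f, o = np.split(gate_inputs, 4, axis=1)
print(i.shape)  # (2, 100)
```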

In the LSTM, **i, j, f, o** are calculated by concatenating **x_{t}** and **h_{t-1}** to generate a (?, 300)-dimensional variable, multiplying it by the (300, 400) kernel, and then splitting the result along the second dimension so that each of **i, j, f, o** has second dimension **self._num_units**.
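The whole step can be sketched in NumPy. This is only an illustrative sketch: random arrays stand in for the real inputs and weights, and `forget_bias = 1.0` mirrors the `LSTMCell(forget_bias=1.0)` used above.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

batch, input_depth, num_units = 2, 200, 100
x_t = np.random.randn(batch, input_depth)       # current input
h_prev = np.random.randn(batch, num_units)      # h_{t-1}
c_prev = np.zeros((batch, num_units))           # c_{t-1}

kernel = np.random.randn(input_depth + num_units, 4 * num_units) * 0.01
bias = np.zeros(4 * num_units)

# 1. concatenate x_t and h_{t-1} -> (batch, 300)
concat = np.concatenate([x_t, h_prev], axis=1)
# 2. one matmul with the (300, 400) kernel -> (batch, 400)
gate_inputs = concat @ kernel + bias
# 3. split into i, j, f, o, each (batch, 100)
i, j, f, o = np.split(gate_inputs, 4, axis=1)

forget_bias = 1.0  # matches LSTMCell(forget_bias=1.0)
c_t = sigmoid(f + forget_bias) * c_prev + sigmoid(i) * np.tanh(j)
h_t = sigmoid(o) * np.tanh(c_t)
print(h_t.shape)  # (2, 100)
```

So one (300, 400) kernel and a single matmul replace eight separate weight matrices and eight matmuls.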