Build Custom RNN By Inheriting RNNCell in TensorFlow – TensorFlow Tutorial

March 31, 2022

LSTM and GRU are both RNNs. In this tutorial, we will introduce how to build a custom RNN cell by inheriting from RNNCell, which is useful if you plan to build a new type of RNN.

We can also build a custom RNN without using RNNCell. Here are the tutorials:

Build Your Own LSTM Model Using TensorFlow: Steps to Create a Customized LSTM – TensorFlow Tutorial

Build a Custom BiLSTM Model Using TensorFlow: A Step Guide – TensorFlow Tutorial

How to inherit RNNCell?

The structure of a custom RNN cell that inherits from RNNCell is as follows:

import tensorflow as tf

class CustomCell(tf.nn.rnn_cell.RNNCell):
	def __init__(self, num_units, reuse = None, name = "custom_cell"):
		super(CustomCell, self).__init__(_reuse=reuse, name=name)

		self._num_units = num_units # the dimension of rnn cell

	@property
	def state_size(self):
		return self._num_units

	@property
	def output_size(self):
		return self._num_units

	def build(self, inputs_shape):
		# inputs_shape is the shape of a single time step: [batch_size, dim].
		# For example, if the full input is [batch_size, timestep, dim],
		# then inputs_shape = [batch_size, dim].
		# Create variables here; they can then be used in call().
		self.built = True

	def call(self, inputs, state):
		# inputs: the input at the current time step, [batch_size, dim]
		# state: the state from the previous time step
		# Placeholder body: replace it with the logic that computes the
		# new output and the new state.
		new_h = inputs
		new_c = state
		return new_h, new_c

There are two important methods to notice:

build(): its parameter is inputs_shape. This is where we should create the variables the cell needs. It runs once, before call() runs for the first time, and it should set self.built = True when it finishes.

call(): its parameters are inputs and state. inputs is the input at the current time step and state is the state from the previous time step. From them we compute the output and the new state of the current step.

call() takes two parameters and also returns two values: the output and the new state.
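To make the call() contract concrete, here is a NumPy-only sketch of how a recurrent loop drives a cell one time step at a time. It is an illustration and not TensorFlow's implementation: identity_cell and unroll are hypothetical names, and the cell simply mirrors the placeholder body of call() above.

```python
import numpy as np

def identity_cell(inputs, state):
    """Mimics the placeholder call(): output = inputs, state unchanged."""
    return inputs, state

def unroll(cell, inputs, initial_state):
    """Apply cell to every time step of inputs, shaped [batch, time, dim]."""
    state = initial_state
    outputs = []
    for t in range(inputs.shape[1]):
        # The cell sees one [batch, dim] slice plus the previous state.
        output, state = cell(inputs[:, t, :], state)
        outputs.append(output)
    # Stack the per-step outputs back into [batch, time, output_size].
    return np.stack(outputs, axis=1), state

x = np.random.randn(3, 20, 100).astype(np.float32)
h0 = np.zeros((3, 100), dtype=np.float32)
outputs, final_state = unroll(identity_cell, x, h0)
print(outputs.shape, final_state.shape)
```

Because the placeholder cell returns its inputs unchanged, the unrolled outputs here are just a copy of x. Any real cell would replace the body of identity_cell with its own computation.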

We can also evaluate this custom RNN as follows:

import numpy as np

size = 100
inputs = tf.Variable(tf.truncated_normal([3, 20, 100], stddev=0.1), name="inputs")
# Sequence lengths must be an int32 vector of shape [batch_size].
# They are not used below because every sequence has the full 20 steps.
input_lengths = tf.constant([20, 20, 20], dtype=tf.int32, name="inputs_length")

_fw_cell = CustomCell(size, name='encoder_fw')
_bw_cell = CustomCell(size, name='encoder_bw')
with tf.variable_scope("Custom_BiLSTM"):
    outputs, (fw_state, bw_state) = tf.nn.bidirectional_dynamic_rnn(
        _fw_cell,
        _bw_cell,
        inputs,
        sequence_length=None,
        dtype=tf.float32,
        swap_memory=True)

    outputs = tf.concat(outputs, axis=2)  # Concatenate forward and backward outputs

init = tf.global_variables_initializer()
init_local = tf.local_variables_initializer()
with tf.Session() as sess:
    sess.run([init, init_local])
    np.set_printoptions(precision=4, suppress=True)
    f = sess.run([inputs, outputs])
    print("f shape=", f[0].shape, f[1].shape)
    print(f)

Running this code, we will see:

f shape= (3, 20, 100) (3, 20, 200)
[array([[[ 0.1194, -0.1482, -0.0366, ...,  0.0245,  0.0814, -0.0542],
        [ 0.0706,  0.0103,  0.14  , ..., -0.1292,  0.1306,  0.0329],
        [ 0.0911,  0.0186,  0.0709, ..., -0.0629, -0.1679,  0.0624],
        ...,
        [ 0.0242,  0.1694, -0.1566, ...,  0.0322,  0.0864, -0.0159],
        [ 0.1084,  0.0702,  0.0162, ...,  0.0331,  0.0174,  0.0541],
        [-0.103 ,  0.006 , -0.0532, ...,  0.0865, -0.0875, -0.0121]]],
      dtype=float32), array([[[ 0.1194, -0.1482, -0.0366, ...,  0.0245,  0.0814, -0.0542],
        [ 0.0706,  0.0103,  0.14  , ..., -0.1292,  0.1306,  0.0329],
        [ 0.0911,  0.0186,  0.0709, ..., -0.0629, -0.1679,  0.0624],
        ...,
        [ 0.0242,  0.1694, -0.1566, ...,  0.0322,  0.0864, -0.0159],
        [ 0.1084,  0.0702,  0.0162, ...,  0.0331,  0.0174,  0.0541],
        [-0.103 ,  0.006 , -0.0532, ...,  0.0865, -0.0875, -0.0121]]],
      dtype=float32)]
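The output has 200 features because the forward and backward outputs, each of size 100, are concatenated on the last axis. The NumPy-only sketch below reproduces that shape logic with the same identity cell. It is a simplified illustration and not how tf.nn.bidirectional_dynamic_rnn is implemented; run_direction is a hypothetical helper.

```python
import numpy as np

def identity_cell(inputs, state):
    """Placeholder cell: output = inputs, state unchanged."""
    return inputs, state

def run_direction(cell, inputs, state, reverse=False):
    """Run cell over the time axis, forward or backward.

    Outputs are stored at their original time index, so the backward
    pass is already aligned with the forward pass.
    """
    num_steps = inputs.shape[1]
    steps = reversed(range(num_steps)) if reverse else range(num_steps)
    outputs = [None] * num_steps
    for t in steps:
        outputs[t], state = cell(inputs[:, t, :], state)
    return np.stack(outputs, axis=1)

x = np.random.randn(3, 20, 100).astype(np.float32)
h0 = np.zeros((3, 100), dtype=np.float32)

fw = run_direction(identity_cell, x, h0)
bw = run_direction(identity_cell, x, h0, reverse=True)
both = np.concatenate([fw, bw], axis=2)
print(x.shape, both.shape)
```

With an identity cell, both halves of the concatenated output equal the input, which is why the two printed arrays above show the same values.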

Improve Custom RNN

We can improve the custom RNN by creating variables in build(), for example:

	def build(self, inputs_shape):
		print(inputs_shape)
		print(type(inputs_shape))
		# inputs_shape is the shape of a single time step: [batch_size, dim].
		# For example, if the full input is [batch_size, timestep, dim],
		# then inputs_shape = [batch_size, dim].
		# Create variables here; they can then be used in call().
		inputs_dim = inputs_shape[-1].value
		self._w = self.add_variable(name="weight", shape = [inputs_dim, self._num_units], initializer = tf.glorot_normal_initializer(), dtype = tf.float32)
		self._b = self.add_variable(name="bias", shape=[self._num_units], initializer=tf.glorot_normal_initializer(), dtype=tf.float32)

		self.built = True
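The body of call() that uses these variables is not shown here, so the following NumPy-only sketch is one possible way to use a weight and a bias. Everything in it is an assumption: the NumpyCustomCell class name, the tanh activation, the zero-initialized bias, and the plain normal draw that stands in for the Glorot initializer.

```python
import numpy as np

class NumpyCustomCell:
    """NumPy stand-in for a cell whose build() creates a weight and a bias."""

    def __init__(self, num_units, seed=0):
        self._num_units = num_units
        self._rng = np.random.default_rng(seed)
        self.built = False

    def build(self, inputs_shape):
        inputs_dim = inputs_shape[-1]
        # Glorot-style scale: stddev = sqrt(2 / (fan_in + fan_out)).
        stddev = np.sqrt(2.0 / (inputs_dim + self._num_units))
        self._w = self._rng.normal(0.0, stddev, size=(inputs_dim, self._num_units))
        self._b = np.zeros(self._num_units)  # assumption: zero bias
        self.built = True

    def __call__(self, inputs, state):
        # build() runs once, before the first real computation.
        if not self.built:
            self.build(inputs.shape)
        # Assumed body: a dense projection of the current input.
        new_h = np.tanh(inputs @ self._w + self._b)
        # Return the output twice, as the output and as the new state.
        return new_h, new_h

cell = NumpyCustomCell(100)
step_input = np.random.randn(3, 100)
previous_state = np.zeros((3, 100))
output, new_state = cell(step_input, previous_state)
print(output.shape, new_state.shape)
```

This sketch ignores the previous state because build() creates only an input weight. A truly recurrent cell would need a second weight matrix of shape [num_units, num_units] applied to state.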

Note that we cannot use tf.Variable() in build() to create TensorFlow variables; use self.add_variable() as shown above. Otherwise, you will get a ValueError. To fix this error, see:

Fix Inheriting RNNCell build() ValueError: Initializer for variable is from inside a control-flow construct – TensorFlow Tutorial

Then, with call() updated to use these variables, we can evaluate the cell again and get a result such as:

f shape= (3, 20, 100) (3, 20, 200)
[array([[[ 0.1409,  0.107 ,  0.0258, ...,  0.0281, -0.0612,  0.0525],
        [ 0.0652, -0.0202,  0.0169, ..., -0.1956,  0.0543, -0.0334],
        [-0.0559,  0.1613,  0.0257, ...,  0.0858,  0.1105, -0.0963], 
        ...,
        [-0.0716, -0.0563, -0.0451, ...,  0.1238, -0.0111, -0.0465],
        [-0.0484,  0.0344, -0.0566, ..., -0.1707, -0.0705,  0.01  ],
        [-0.0037, -0.0209, -0.0565, ...,  0.0233,  0.0548,  0.1174]]],
      dtype=float32), array([[[-0.0311, -0.246 , -0.0412, ..., -0.0077, -0.0249,  0.0986],
        [-0.1012, -0.1396, -0.0219, ..., -0.0893,  0.128 ,  0.1197],
        [ 0.0678, -0.2054, -0.0608, ..., -0.0068, -0.0627,  0.1589],
        ...,
        [ 0.0128, -0.1721,  0.0594, ...,  0.1566,  0.1048, -0.104 ],
        [-0.0187, -0.2965,  0.0667, ..., -0.0013,  0.0132,  0.046 ],
        [ 0.0082, -0.1131,  0.0417, ...,  0.068 ,  0.2191,  0.0546]]],
      dtype=float32)]
