# Layer Normalization Explained for Beginners – Deep Learning Tutorial

By | May 24, 2021

Layer normalization was proposed in the 2016 paper “Layer Normalization”. It aims to fix two problems with batch normalization: its effect depends on the mini-batch size, and it is not obvious how to apply it to recurrent neural networks. In this tutorial, we will introduce what layer normalization is and how to use it.

## Layer Normalization

Layer Normalization is defined as:

$$y_i=\lambda(\frac{x_i-\mu}{\sqrt{\sigma^2+\epsilon}})+\beta$$
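
The formula does not spell out its symbols, so here is one way to read it. Assume a single sample has $$H$$ features $$x_1,\dots,x_H$$ (the symbol $$H$$ is introduced here only for illustration). The mean and variance are then taken over those features:

$$\mu=\frac{1}{H}\sum_{i=1}^{H}x_i,\qquad \sigma^2=\frac{1}{H}\sum_{i=1}^{H}(x_i-\mu)^2$$

In this reading, $$\lambda$$ and $$\beta$$ are a learnable scale and shift, and $$\epsilon$$ is a small constant that keeps the denominator away from zero.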

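Layer normalization is similar to batch normalization, but the statistics are computed over a different axis of the input $$x$$. Batch normalization computes the mean and variance of each feature across the samples in a mini-batch, whereas layer normalization computes them across the features of each individual sample.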
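
The difference in axis can be sketched with NumPy. This is only a minimal illustration, not the TensorFlow implementation: the helper `normalize()` and the toy array are hypothetical, and the learnable scale and shift are left out.

```python
import numpy as np

def normalize(x, axis, eps=1e-5):
    """Subtract the mean and divide by the standard deviation along `axis`."""
    mu = x.mean(axis=axis, keepdims=True)
    var = x.var(axis=axis, keepdims=True)  # population variance
    return (x - mu) / np.sqrt(var + eps)

# Hypothetical toy input: 4 samples (rows) with 3 features (columns).
x = np.array([[1.0, 2.0, 6.0],
              [0.0, 5.0, 10.0],
              [3.0, 3.0, 9.0],
              [2.0, 8.0, 5.0]])

layer_normed = normalize(x, axis=-1)  # layer-norm style: statistics per sample
batch_normed = normalize(x, axis=0)   # batch-norm style: statistics per feature

print(layer_normed.mean(axis=-1))  # every row mean is close to 0
print(batch_normed.mean(axis=0))   # every column mean is close to 0
```

Because the layer-norm statistics come from a single sample, the result for one sample does not change when the batch size changes.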

Here is an example of normalizing the output of a BiLSTM using layer normalization:

Normalize the Output of BiLSTM Using Layer Normalization

## How to implement layer normalization in TensorFlow?

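There are two ways to implement it: compute the statistics with tf.nn.moments() and pass them to tf.nn.batch_normalization(), or call tf.contrib.layers.layer_norm() directly.

We will use an example to show you how to do it.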

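# This code targets TensorFlow 1.x (tf.contrib was removed in TensorFlow 2.x).
import tensorflow as tf

# Input of shape (4, 2, 3); the last axis holds the features to normalize.
x1 = tf.convert_to_tensor(
    [[[18.369314, 2.6570225, 20.402943],
      [10.403599, 2.7813416, 20.794857]],
     [[19.0327, 2.6398268, 6.3894367],
      [3.921237, 10.761424, 2.7887821]],
     [[11.466338, 20.210938, 8.242946],
      [22.77081, 11.555874, 11.183836]],
     [[8.976935, 10.204252, 11.20231],
      [-7.356888, 6.2725096, 1.1952505]]])

# Way 1: compute the mean and variance over the last axis, then normalize.
# tf.nn.moments() returns the variance, not the standard deviation.
mean_x, var_x = tf.nn.moments(x1, axes=[2], keep_dims=True)
v1 = tf.nn.batch_normalization(x1, mean_x, var_x, None, None, 1e-12)

# Way 2: use the built-in layer normalization over the last axis.
v2 = tf.contrib.layers.layer_norm(inputs=x1, begin_norm_axis=-1, begin_params_axis=-1)

with tf.Session() as sess1:
    # layer_norm() creates scale and shift variables that must be initialized.
    sess1.run(tf.global_variables_initializer())
    print(sess1.run(v1))
    print(sess1.run(v2))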

In this code, v1 is computed by tf.nn.batch_normalization() and v2 is computed by tf.contrib.layers.layer_norm(). The two results are the same:

[[[ 0.574993   -1.4064413   0.8314482 ]
[-0.12501884 -1.1574404   1.2824591 ]]

[[ 1.3801125  -0.95738953 -0.422723  ]
[-0.5402142   1.4019756  -0.86176133]]

[[-0.36398554  1.3654773  -1.0014919 ]
[ 1.4136491  -0.67222667 -0.7414224 ]]

[[-1.2645674   0.08396816  1.1806011 ]
[-1.3146634   1.108713    0.20595042]]]
[[[ 0.574993   -1.4064413   0.8314482 ]
[-0.12501884 -1.1574404   1.2824591 ]]

[[ 1.3801125  -0.95738953 -0.422723  ]
[-0.5402142   1.4019756  -0.86176133]]

[[-0.36398554  1.3654773  -1.0014919 ]
[ 1.4136491  -0.67222667 -0.7414224 ]]

[[-1.2645674   0.08396816  1.1806011 ]
[-1.3146634   1.108713    0.20595042]]]
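
As a sanity check, the first row of the printed result can be reproduced by hand. The NumPy sketch below applies the formula to the first row of x1 only, using the same epsilon of 1e-12 and no scale or shift. The variable names are chosen here for illustration.

```python
import numpy as np

# First row of x1 from the example above.
row = np.array([18.369314, 2.6570225, 20.402943])

mu = row.mean()   # mean over the three features
var = row.var()   # population variance over the three features
normalized = (row - mu) / np.sqrt(var + 1e-12)

# Should be close to the first row of the TensorFlow output.
print(normalized)
```

The values agree with the first printed row, [0.574993, -1.4064413, 0.8314482], up to floating-point rounding.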

Looking at the source code of tf.contrib.layers.layer_norm(), we can see that it calls tf.nn.batch_normalization() to normalize a layer.
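
The tf.contrib module is not available in TensorFlow 2.x. A rough equivalent there, assuming TensorFlow 2.x with eager execution, is the Keras layer tf.keras.layers.LayerNormalization. The sketch below is not part of the original example: it uses only the first two rows of x1 and sets epsilon to 1e-12 to mirror the value used above.

```python
import numpy as np
import tensorflow as tf  # assumes TensorFlow 2.x

# First two rows of x1 from the example above.
x = tf.constant([[18.369314, 2.6570225, 20.402943],
                 [10.403599, 2.7813416, 20.794857]], dtype=tf.float32)

# Normalize over the last axis. The Keras default epsilon is larger than
# 1e-12, so it is set explicitly to mirror the TensorFlow 1.x example.
layer = tf.keras.layers.LayerNormalization(axis=-1, epsilon=1e-12)
y = layer(x).numpy()

print(y)
```

With freshly initialized weights (scale 1, shift 0), the output should match the plain normalization formula.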

