Understand the Effect of LSTM Input Gate, Forget Gate and Output Gate – LSTM Network Tutorial

July 20, 2020

An LSTM network contains three gates: an input gate, a forget gate, and an output gate. The structure of the LSTM gates is shown below:

This raises two questions: what is the effect of each gate in an LSTM, and which gate is the most important in an LSTM network?

In this tutorial, we will discuss this topic.
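Before comparing the gates, it helps to see where each one appears in the standard LSTM cell equations. Below is a minimal NumPy sketch of a single LSTM time step; the function name, gate ordering, and weight shapes are illustrative assumptions, not from any particular library:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step.

    W: (4H, D) input weights, U: (4H, H) recurrent weights, b: (4H,) bias.
    Assumed gate order in the stacked matrices: input, forget, candidate, output.
    """
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0:H])        # input gate: how much new information enters
    f = sigmoid(z[H:2*H])      # forget gate: how much old cell state is kept
    g = np.tanh(z[2*H:3*H])    # candidate cell state
    o = sigmoid(z[3*H:4*H])    # output gate: how much cell state is exposed
    c = f * c_prev + i * g     # forget old memory, add new memory
    h = o * np.tanh(c)         # output gate controls the hidden state
    return h, c
```

Note that the forget gate `f` multiplies the previous cell state `c_prev` directly, which is why it controls how long information can survive in the cell.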

In the paper:

An Empirical Exploration of Recurrent Network Architectures (Jozefowicz et al., 2015)

we can find:

Forget gate:

The forget gate turns out to be of the greatest importance. When the forget gate is removed, the LSTM exhibits drastically inferior performance on the ARITH and the XML problems, although it is relatively unimportant in language modelling, consistent with Mikolov et al. (2014)

This means the forget gate is the most important gate in an LSTM network: removing it greatly degrades the LSTM's performance on many tasks.

adding a positive bias to the forget gate greatly improves the performance of the LSTM.

This means we should initialize the LSTM forget gate with a positive bias (for example, 1.0).
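To see why a positive bias helps: at initialization the forget gate's pre-activation is close to zero, so it outputs sigmoid(0) = 0.5 and half the cell state is discarded at every step, making long-range memory hard to learn. A positive bias pushes the gate toward 1 so the cell state is mostly retained early in training. A small NumPy illustration (the bias values are just examples):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Fraction of the cell state retained per step at initialization,
# when the forget gate's pre-activation is roughly just the bias.
for bias in (0.0, 1.0, 2.0):
    f = sigmoid(bias)
    # After 10 steps, roughly f**10 of the original cell state survives.
    print(f"bias={bias}: forget gate = {f:.2f}, "
          f"retained after 10 steps = {f**10:.4f}")
```

With bias 0 the gate starts at 0.5, so almost nothing survives 10 steps; with bias 1 it starts near 0.73 and much more memory is preserved. This is why libraries often default to a unit forget bias; for example, `tf.keras.layers.LSTM` exposes a `unit_forget_bias` argument that is `True` by default.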