Advanced LSTM is a variant of LSTM proposed in the paper ADVANCED LSTM: A STUDY ABOUT BETTER TIME DEPENDENCY MODELING IN EMOTION RECOGNITION. In this tutorial, we compare it with conventional LSTM to make it easier to understand.

## Advanced LSTM

The structure of advanced LSTM is:

We can see that the outputs \(C(t+1)\) and \(h(t+1)\) at step \(t+1\) are computed from the outputs of the previous 3 steps, which is the main difference between advanced LSTM and conventional LSTM.

## Difference between advanced LSTM and conventional LSTM

The output of conventional LSTM is computed from the previous step only. Advanced LSTM, however, computes it from the previous **T** steps, for example T = 3.

The equations of advanced LSTM are as follows:

Because advanced LSTM uses the previous T steps to compute the current output, we must determine a weight for each previous step. This is done with two attention layers, which compute \(C'\) and \(h'\).
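To make the attention weighting concrete, here is a minimal NumPy sketch. It assumes the weights are a softmax over dot-product scores between each previous state and a scoring vector; that form, the function name `attention_pool`, and all variable names are illustrative assumptions, not taken from the paper's code.

```python
import numpy as np

def attention_pool(states, w):
    """Weighted sum of the previous T states.

    states: array of shape (T, d), one row per previous step.
    w:      scoring vector of shape (d,) (assumed form of the attention layer).
    Returns (weights, pooled) with shapes (T,) and (d,).
    """
    scores = states @ w                      # one scalar score per previous step
    scores = scores - scores.max()           # subtract max for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()  # softmax over the T steps
    pooled = weights @ states                # attention-weighted sum of states
    return weights, pooled

rng = np.random.default_rng(0)
T, d = 3, 4                                  # T = 3 previous steps, hidden size 4
prev_c = rng.normal(size=(T, d))             # stand-ins for C at the previous T steps
prev_h = rng.normal(size=(T, d))             # stand-ins for h at the previous T steps
w = rng.normal(size=d)                       # one scoring vector, used for both C and h

w_c, c_prime = attention_pool(prev_c, w)     # weights per step and C'
w_h, h_prime = attention_pool(prev_h, w)     # weights per step and h'
```

Here `c_prime` and `h_prime` play the role of \(C'\) and \(h'\), which then replace the single previous cell and hidden state in an otherwise conventional LSTM update.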

## Warning

If you plan to build a model with advanced LSTM, note the following:

- The weight \(W\) is shared between \(W_{h_T}\) and \(W_{C_T}\).
- Do not compute \(C'\) and \(h'\) at every step. Compute them every 3 or 4 steps instead. Computing them at every step may give worse results, because advanced LSTM then captures more duplicated context.
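The two points above can be sketched as a simple loop. This is a rough illustration under stated assumptions, not the paper's implementation: the LSTM cell is a plain NumPy cell, the attention is a softmax over dot-product scores, and the names `run_advanced_lstm`, `every`, and `w_att` are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, params):
    """One conventional LSTM step with stacked gate weights."""
    z = params["W"] @ np.concatenate([x, h]) + params["b"]
    i, f, o, g = np.split(z, 4)              # input, forget, output gates and candidate
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h_new = sigmoid(o) * np.tanh(c_new)
    return h_new, c_new

def attention_pool(states, w):
    """Softmax-weighted sum over the rows of `states` (assumed attention form)."""
    scores = states @ w
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ states

def run_advanced_lstm(xs, params, w_att, T=3, every=3):
    """Run an LSTM, replacing (h, c) with pooled (h', C') only every `every` steps."""
    d = params["b"].size // 4
    h, c = np.zeros(d), np.zeros(d)
    hs, cs, pooled_at = [], [], []
    for t, x in enumerate(xs):
        if t > 0 and t % every == 0 and len(hs) >= T:
            # The same w_att scores both h and C (shared weight W).
            h = attention_pool(np.stack(hs[-T:]), w_att)
            c = attention_pool(np.stack(cs[-T:]), w_att)
            pooled_at.append(t)
        h, c = lstm_step(x, h, c, params)
        hs.append(h)
        cs.append(c)
    return np.stack(hs), pooled_at

rng = np.random.default_rng(0)
n_in, d, steps = 5, 4, 10
params = {"W": rng.normal(scale=0.1, size=(4 * d, n_in + d)), "b": np.zeros(4 * d)}
w_att = rng.normal(size=d)
xs = rng.normal(size=(steps, n_in))

outputs, pooled_at = run_advanced_lstm(xs, params, w_att, T=3, every=3)
print(pooled_at)  # steps at which C' and h' were computed
```

With `every=3`, the pooled states are used only at steps 3, 6, and 9 of this 10-step sequence. All other steps run as a conventional LSTM.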