Highway LSTM is a variant of LSTM that adds highway networks inside an LSTM. This tutorial introduces it for LSTM beginners.

## Highway Networks

Highway LSTM integrates highway networks into an LSTM. To understand it, you should first learn what a highway network is. Here is a tutorial:

A Beginner Introduction to Highway Networks – Machine Learning Tutorial
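As a quick refresher, here is a minimal sketch of a single highway layer in plain Python. It assumes the common formulation \(y = T(x) \odot H(x) + (1 - T(x)) \odot x\) with a tanh transform and a sigmoid gate. The helper names (`affine`, `highway_layer`, `random_matrix`) and the weight shapes are illustrative choices, not taken from any library.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, b, x):
    """Return W @ x + b for a list-of-rows matrix W and vectors b, x."""
    return [sum(w * v for w, v in zip(row, x)) + b_i for row, b_i in zip(W, b)]

def highway_layer(x, W_h, b_h, W_t, b_t):
    """y = T(x) * H(x) + (1 - T(x)) * x, with H = tanh and T = sigmoid."""
    h = [math.tanh(v) for v in affine(W_h, b_h, x)]  # candidate transform H(x)
    t = [sigmoid(v) for v in affine(W_t, b_t, x)]    # transform gate T(x)
    return [t_i * h_i + (1.0 - t_i) * x_i for t_i, h_i, x_i in zip(t, h, x)]

def random_matrix(n, rng):
    """Hypothetical helper: small random n x n weight matrix."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n)] for _ in range(n)]

rng = random.Random(0)
n = 4
x = [0.5, -1.0, 2.0, 0.0]
W_h, W_t = random_matrix(n, rng), random_matrix(n, rng)
b_h = [0.0] * n

# A strongly negative gate bias closes the gate (T ~ 0), so the input is carried through.
b_t_closed = [-20.0] * n
y = highway_layer(x, W_h, b_h, W_t, b_t_closed)
```

With the gate closed, `y` stays very close to `x`. With a large positive gate bias, the layer behaves like an ordinary tanh layer instead.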

## Highway LSTM

Highway LSTM was proposed in the paper: Language Modeling with Highway LSTM

There are three kinds of highway LSTM. We will introduce them one by one.

## HW-LSTM-C

HW-LSTM-C is defined as:

It adds a highway network to the previous cell state \(c_{t-1}\) of the LSTM.

Here, tanh() denotes a feedforward network.

## HW-LSTM-H

HW-LSTM-H is defined as:

Similar to HW-LSTM-C, it adds a highway network to the output \(h_{t}\) of the LSTM.
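To make the idea concrete, here is a rough sketch of one LSTM step followed by a highway layer on the output \(h_{t}\). It is only one plausible reading of the description above, not the paper's exact equations. The gate input, the choice to feed the highway output back as the next hidden state, and every name (`lstm_step`, `highway`, `hw_lstm_h_step`, `init`, the `params` keys) are assumptions made for illustration.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, b, x):
    """Return W @ x + b for a list-of-rows matrix W and vectors b, x."""
    return [sum(w * v for w, v in zip(row, x)) + b_i for row, b_i in zip(W, b)]

def lstm_step(x, h_prev, c_prev, p):
    """One standard LSTM step; p maps a gate name to (W, b) acting on [x; h_prev]."""
    z = x + h_prev  # list concatenation of input and previous output
    i = [sigmoid(v) for v in affine(*p["i"], z)]    # input gate
    f = [sigmoid(v) for v in affine(*p["f"], z)]    # forget gate
    o = [sigmoid(v) for v in affine(*p["o"], z)]    # output gate
    g = [math.tanh(v) for v in affine(*p["g"], z)]  # candidate cell value
    c = [f_k * c_k + i_k * g_k for f_k, c_k, i_k, g_k in zip(f, c_prev, i, g)]
    h = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c)]
    return h, c

def highway(v, p_transform, p_gate):
    """Highway layer: gate * tanh(W v + b) + (1 - gate) * v."""
    cand = [math.tanh(u) for u in affine(*p_transform, v)]
    gate = [sigmoid(u) for u in affine(*p_gate, v)]
    return [t * a + (1.0 - t) * b for t, a, b in zip(gate, cand, v)]

def hw_lstm_h_step(x, h_prev, c_prev, p):
    """Assumed HW-LSTM-H-style step: a plain LSTM step, then a highway layer on h."""
    h, c = lstm_step(x, h_prev, c_prev, p)
    return highway(h, p["hw_transform"], p["hw_gate"]), c

def init(rows, cols, rng):
    """Hypothetical helper: small random weights and a zero bias."""
    W = [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return W, [0.0] * rows

rng = random.Random(0)
n_in, n_hid = 3, 4
params = {k: init(n_hid, n_in + n_hid, rng) for k in "ifog"}
params["hw_transform"] = init(n_hid, n_hid, rng)
params["hw_gate"] = init(n_hid, n_hid, rng)

# Run two time steps on toy one-hot inputs.
h, c = [0.0] * n_hid, [0.0] * n_hid
for x in ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]):
    h, c = hw_lstm_h_step(x, h, c, params)
```

Under the same assumptions, an HW-LSTM-C-style variant would apply the `highway` helper to the cell state instead of the output.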

## HW-LSTM-CH

HW-LSTM-CH combines HW-LSTM-C and HW-LSTM-H. It is defined as:

## Which highway LSTM performs best?

According to the experiments in the paper, we can find:

- HW-LSTM-C performs almost the same as the baseline LSTM

This suggests that adding a feedforward network to the cell state \(c\) of the LSTM does not help.

- HW-LSTM-H has the best performance

This suggests that adding a feedforward network to the output \(h\) of the LSTM is useful.

However, the paper does not compare against LSTMP, so we cannot be sure whether the gain of HW-LSTM-H comes from the highway network or from the tanh() projection.
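One way to see the concern is to write the two alternatives side by side. The notation is assumed for illustration (\(W_p\), \(b_p\) and the gate \(T_t\) are not taken from the paper), and the first line is a tanh layer on the output, not the exact LSTMP definition:

\[
\text{projection only:}\quad h'_t = \tanh(W_p h_t + b_p)
\]

\[
\text{highway:}\quad h'_t = T_t \odot \tanh(W_p h_t + b_p) + (1 - T_t) \odot h_t
\]

When \(T_t\) is close to 1, the highway output reduces to the projection-only form, so a model with only the extra layer would be needed to separate the two effects.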

Understand LSTMP (LSTM with Recurrent Projection Layer): Comparing with LSTM