Highway Networks were proposed in the paper: Highway Networks. The idea is based on LSTM. In this tutorial, we introduce it for machine learning beginners.

First, let us compare feedforward and recurrent networks.

For example:

In a feedforward network, as the depth of the network increases, the gradient may vanish. To fix this problem, we can use a residual network.

In an RNN, however, we can use an LSTM to address the vanishing gradient problem. An LSTM does this with gates. Based on this idea, we can also add gates to a deep feedforward network to address the vanishing gradient problem.
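To make the vanishing gradient concrete, here is a minimal sketch using a toy chain of scalar sigmoid layers. The setup (one weight `w` shared by all layers, the starting value `x`) and the function name `input_gradient` are assumptions made for illustration only, not part of the Highway Networks paper.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def input_gradient(depth, w=1.0, x=0.5):
    """Return d(output)/d(input) for `depth` stacked scalar layers a -> sigmoid(w * a).

    Hypothetical toy model: every layer shares the same weight w.
    """
    a, grad = x, 1.0
    for _ in range(depth):
        s = sigmoid(w * a)
        grad *= w * s * (1.0 - s)  # chain rule: derivative of sigmoid(w * a) w.r.t. a
        a = s
    return grad

for depth in (1, 5, 10, 20):
    print(depth, input_gradient(depth))
```

Each layer multiplies the gradient by a factor of at most 0.25 when `w` is 1, so the printed values shrink roughly geometrically with depth.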

## The structure of LSTM

The structure of LSTM is below:

The most important gate in an LSTM is the forget gate.

It is:
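For reference, the forget gate in the standard LSTM formulation is commonly written as follows. The notation (\(f_t\), \(W_f\), \(b_f\), \(h_{t-1}\), \(c_t\), \(i_t\), \(\tilde{c}_t\)) is the conventional one and is assumed here:

\(f_t = \sigma(W_f [h_{t-1}, x_t] + b_f)\)

\(c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t\)

When \(f_t\) is close to 1, the previous cell state \(c_{t-1}\) is carried over almost unchanged, which is what lets information and gradients survive across many time steps.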

Can we add a forget gate to a feedforward network to carry over the previous hidden state or output?

The answer is yes.

## Highway Networks

Highway Networks add a gate, similar to the LSTM forget gate, to a feedforward network so that the previous hidden state or output can be carried forward. It looks like:

It is defined as:

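\(g_T = \sigma(W_T x + b_T)\)

\(g_C = \sigma(W_C x + b_C)\)

\(y = x \odot g_C + \tanh(Wx + b) \odot g_T\)

or, with the carry gate tied to the transform gate as \(g_C = 1 - g_T\):

\(g_T = \sigma(W_T x + b_T)\)

\(y = x \odot (1 - g_T) + \tanh(Wx + b) \odot g_T\)

Here \(\tanh(Wx + b)\) is a feedforward layer, \(x\) is the input, \(g_T\) is the transform gate and \(g_C\) is the carry gate. Because \(x\) is combined element-wise with the layer output, \(x\) and \(y\) must have the same dimension.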
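The second (single-gate) form can be sketched in plain Python as follows. This is a minimal, dependency-free illustration rather than a reference implementation: the helper names `affine`, `highway_layer` and `random_matrix`, the vector size, the weight scale and the bias values are all assumptions chosen for the example.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, x, b):
    """Return W x + b, with W given as a list of rows (hypothetical helper)."""
    return [sum(w * xj for w, xj in zip(row, x)) + bi for row, bi in zip(W, b)]

def highway_layer(x, W, b, W_T, b_T):
    """Single-gate highway layer: y = x * (1 - g_T) + tanh(W x + b) * g_T."""
    h = [math.tanh(z) for z in affine(W, x, b)]      # feedforward part tanh(Wx + b)
    g_T = [sigmoid(z) for z in affine(W_T, x, b_T)]  # gate values, each in (0, 1)
    # Element-wise mix; x and h must have the same length, so W and W_T are square.
    return [xi * (1.0 - g) + hi * g for xi, hi, g in zip(x, h, g_T)]

def random_matrix(n, rng, scale=0.1):
    """Small random square matrix (illustrative initialisation only)."""
    return [[rng.uniform(-scale, scale) for _ in range(n)] for _ in range(n)]

rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
n = 4
x = [0.5, -1.0, 0.25, 2.0]
W, W_T = random_matrix(n, rng), random_matrix(n, rng)
b = [0.0] * n

y_carry = highway_layer(x, W, b, W_T, [-20.0] * n)     # gate nearly closed: y is close to x
y_transform = highway_layer(x, W, b, W_T, [20.0] * n)  # gate nearly open: y is close to tanh(Wx + b)
print(y_carry)
print(y_transform)
```

With a strongly negative gate bias the layer behaves almost like an identity mapping, and with a strongly positive one it behaves almost like the plain feedforward layer. In practice the gate values lie between these two extremes and are learned.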

## Notice

Highway Networks are most useful in deep networks built from layers with the same structure. Otherwise, they may perform worse than a plain feedforward network.