In deep learning, we usually place a dropout layer after a dense layer. However, a question arises: should the dropout layer be placed before or after the activation function?
Dropout vs non-linear activation function
If the activation function is a non-linearity such as sigmoid or tanh, we should place the dropout layer after the activation function. Zeroing a unit before such a function does not actually silence it, because the function does not map zero to zero (for example, sigmoid(0) = 0.5).
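The claim above can be checked numerically. The sketch below (an illustration of mine, not code from this post) applies the same inverted-dropout mask before and after a sigmoid and shows the two orderings disagree:

```python
import numpy as np

# Sketch (assumed, not from the original post): with a saturating
# non-linearity like sigmoid, dropout before vs. after the activation
# gives different results, because sigmoid(0) == 0.5, not 0.
rng = np.random.default_rng(0)
x = rng.standard_normal(8)              # pre-activation values
p = 0.5                                 # dropout probability
mask = (rng.random(8) >= p) / (1 - p)   # inverted-dropout mask: 0 or 2.0

sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))

after = mask * sigmoid(x)    # dropout after activation: dropped units are 0
before = sigmoid(mask * x)   # dropout before: "dropped" units become 0.5

print(np.allclose(after, before))  # False
```

Even the kept units differ, since 2 * sigmoid(x) is not sigmoid(2x) in general.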
Dropout vs ReLU activation function
However, the situation changes if the activation function is ReLU, which is piecewise linear, maps zero to zero, and commutes with multiplication by a non-negative scale.
From the page https://sebastianraschka.com/faq/docs/dropout-activation.html, we can find that the two orderings are compared:
(a): Fully connected, linear activation -> ReLU -> Dropout -> …
(b): Fully connected, linear activation -> Dropout -> ReLU -> …
The results are the same, which means the dropout layer can be placed either before or after the ReLU activation function.
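Why the two orderings agree can be shown directly. The sketch below (my illustration, not code from the linked page) applies the same inverted-dropout mask before and after ReLU: the mask multiplies each unit by 0 or 1/(1-p), both non-negative scales, and ReLU(c * x) == c * ReLU(x) for any c >= 0.

```python
import numpy as np

# Sketch showing that dropout and ReLU commute: the dropout mask is a
# per-unit non-negative scale (0 or 1/(1-p)), and ReLU is positively
# homogeneous, so the order of the two operations does not matter.
rng = np.random.default_rng(0)
x = rng.standard_normal(8)              # pre-activation values
p = 0.5                                 # dropout probability
mask = (rng.random(8) >= p) / (1 - p)   # inverted-dropout mask: 0 or 2.0

relu = lambda v: np.maximum(v, 0.0)

a = mask * relu(x)   # ordering (a): ReLU first, then dropout
b = relu(mask * x)   # ordering (b): dropout first, then ReLU

print(np.allclose(a, b))  # True: both orderings give the same output
```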
To implement a dropout layer, you can read:
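For reference, here is a minimal inverted-dropout sketch of mine (an assumption, not the implementation linked above): at training time each unit is kept with probability 1 - p and scaled by 1/(1 - p) so the expected activation is unchanged; at inference the layer is an identity.

```python
import numpy as np

def dropout(x, p=0.5, training=True, rng=None):
    """Minimal inverted-dropout sketch (illustrative, hypothetical helper).

    During training, each unit is zeroed with probability p and the
    survivors are scaled by 1 / (1 - p); at inference, x passes through.
    """
    if not training or p == 0.0:
        return x
    rng = rng or np.random.default_rng()
    mask = (rng.random(x.shape) >= p) / (1.0 - p)
    return x * mask
```

Because of the 1/(1 - p) scaling, no rescaling is needed at inference time.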