I'm learning about recurrent neural networks right now, and am in chapter 6 of Deep Learning with Python by Francois Chollet.
The chapter discusses using dropout in recurrent layers. I understand the logic behind randomizing the inputs the same way at each time step, since RNNs are used to learn sequence data, but I'm having a difficult time parsing some of the finer details of the dropout arguments you can pass to a recurrent layer.
Take this simple example:
keras.layers.GRU(32, dropout=0.2, recurrent_dropout=0.2)
Whenever I see snippets like this on the internet, both dropout arguments are usually set to the same value. Is this a best practice or just convention?
I'm assuming the dropout argument is the fraction of inputs that will be zeroed out coming into the recurrent layer. If that's the case, what's the difference between my example and something like this:
keras.layers.Dropout(0.2)
keras.layers.GRU(32, recurrent_dropout=0.2)
Thank you for all of your help.
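Your two snippets are close, but as far as I know not strictly identical. In both cases 20% of the inputs to the GRU are zeroed out during training. The difference is in the mask: the layer's dropout argument reuses the same mask at every time step of a sequence, while a standalone Dropout layer with its default settings draws a fresh mask for every time step.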
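To make the input-dropout part concrete, here is a small NumPy sketch of the two masking styles. It is not Keras's actual implementation, only an illustration under the assumption that a standalone Dropout layer samples one keep/drop decision per element while the recurrent layer's dropout argument samples one mask per sequence. The names mask_per_step and mask_shared are made up for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
rate = 0.2
batch, timesteps, features = 2, 5, 4
x = np.ones((batch, timesteps, features))

# Style A: standalone Dropout layer with default settings (assumed behaviour).
# An independent keep/drop decision is drawn for every element,
# so each time step gets its own mask.
mask_per_step = rng.random((batch, timesteps, features)) >= rate
x_layer_style = x * mask_per_step / (1.0 - rate)

# Style B: the `dropout` argument of a recurrent layer (assumed behaviour).
# One mask per sequence, broadcast over the time axis,
# so the same features are dropped at every time step.
mask_shared = rng.random((batch, 1, features)) >= rate
x_arg_style = x * mask_shared / (1.0 - rate)

# In style B every time step of a sequence is masked identically.
print(np.all(x_arg_style == x_arg_style[:, :1, :]))  # prints True
```

If you want a standalone Dropout layer to share its mask across time steps, the Keras documentation describes the noise_shape argument for this, for example noise_shape=(batch_size, 1, features) for inputs of shape (batch_size, timesteps, features).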
By the way, be careful with recurrent dropout. In my experience it usually makes things worse, so compare against a run with recurrent_dropout=0 before keeping it.
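As for what the two arguments touch, here is a rough sketch using a plain tanh RNN step instead of a real GRU (a simplification of mine, not the Keras code). The weights W and U and the mask names are hypothetical. The point is only that dropout masks the input at each step, while recurrent_dropout masks the previous hidden state before it passes through the recurrent weights.

```python
import numpy as np

rng = np.random.default_rng(1)
timesteps, features, units = 6, 3, 4
dropout, recurrent_dropout = 0.2, 0.2

x = rng.normal(size=(timesteps, features))    # one input sequence
W = rng.normal(size=(features, units)) * 0.1  # input weights (hypothetical)
U = rng.normal(size=(units, units)) * 0.1     # recurrent weights (hypothetical)

# Both masks are sampled once per sequence and reused at every time step.
in_mask = (rng.random(features) >= dropout) / (1.0 - dropout)
rec_mask = (rng.random(units) >= recurrent_dropout) / (1.0 - recurrent_dropout)

h = np.zeros(units)
for t in range(timesteps):
    # `dropout` masks the input x[t]; `recurrent_dropout` masks the
    # previous state h before it goes through the recurrent weights.
    h = np.tanh((x[t] * in_mask) @ W + (h * rec_mask) @ U)

print(h.shape)  # prints (4,)
```

Setting either rate to 0 turns the corresponding mask into all ones, which is an easy way to compare runs with and without recurrent dropout.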