I am trying to understand RNNs and then implement one trained with mini-batch SGD, but I am stuck on how it actually works.
An RNN runs over t time steps, and x_i is the input at time step i.
h is the hidden state.
An RNN cell is then just a linear layer with two inputs, x and the hidden state h (i.e. two weight matrices),
and if we do:
for x in sequence: h = linear(x, h)
then this loop is the recurrence: each step reuses the h produced by the previous step.
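To make the loop above concrete, here is a minimal NumPy sketch of how I currently picture it for a single sequence (no batch yet). The tanh nonlinearity, the names W_xh, W_hh, b and rnn_step, and all the sizes are my own assumptions for illustration; they are not taken from any particular library.

```python
import numpy as np

def rnn_step(x, h, W_xh, W_hh, b):
    """One recurrence step: combine the current input with the previous hidden state."""
    # tanh is an assumption; the description above only says "linear layer".
    return np.tanh(x @ W_xh + h @ W_hh + b)

rng = np.random.default_rng(0)
seq_len, embedding_size, hidden_size = 5, 3, 4  # illustrative sizes

# The "2 matrices": one for the input, one for the hidden state.
W_xh = rng.normal(scale=0.1, size=(embedding_size, hidden_size))
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
b = np.zeros(hidden_size)

sequence = rng.normal(size=(seq_len, embedding_size))  # one sequence: [seq, embedding_size]
h = np.zeros(hidden_size)                              # initial hidden state

hidden_states = []
for x in sequence:  # iterate over the time steps
    h = rnn_step(x, h, W_xh, W_hh, b)
    hidden_states.append(h)

hidden_states = np.stack(hidden_states)  # [seq, hidden_size], one row per time step
```

In this sketch the loop produces one hidden state per time step, and the last row of hidden_states is the same as the final h.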
What will be the output at the end of this RNN?
I also want to understand how batching works here. When batching, what will the input look like:
[batch, seq, embedding_size] or [seq, batch, embedding_size]?
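For the batching question, this is a sketch of what I think happens, assuming every sequence in the batch has the same length (no padding). The function name rnn_forward, the batch_first flag, the tanh nonlinearity and the sizes are my own illustrative choices. Each time step processes a [batch, embedding_size] slice, so the hidden state becomes [batch, hidden_size].

```python
import numpy as np

def rnn_forward(inputs, W_xh, W_hh, b, batch_first=False):
    """Run a simple tanh RNN over a batch of equal-length sequences.

    inputs: [seq, batch, embedding_size], or [batch, seq, embedding_size]
            when batch_first is True.
    Returns (outputs, h_last): all hidden states in the same layout as the
    input, and the hidden state after the final time step.
    """
    if batch_first:
        inputs = inputs.transpose(1, 0, 2)  # -> [seq, batch, embedding_size]
    _, batch_size, _ = inputs.shape
    h = np.zeros((batch_size, W_hh.shape[0]))  # one hidden vector per sequence
    outputs = []
    for x_t in inputs:  # x_t: [batch, embedding_size] for one time step
        h = np.tanh(x_t @ W_xh + h @ W_hh + b)
        outputs.append(h)
    outputs = np.stack(outputs)  # [seq, batch, hidden_size]
    if batch_first:
        outputs = outputs.transpose(1, 0, 2)  # back to [batch, seq, hidden_size]
    return outputs, h

rng = np.random.default_rng(0)
batch, seq, emb, hidden = 2, 5, 3, 4  # illustrative sizes
W_xh = rng.normal(scale=0.1, size=(emb, hidden))
W_hh = rng.normal(scale=0.1, size=(hidden, hidden))
b = np.zeros(hidden)

x_batch_first = rng.normal(size=(batch, seq, emb))
x_seq_first = x_batch_first.transpose(1, 0, 2)  # same data, other layout

out_bf, h_bf = rnn_forward(x_batch_first, W_xh, W_hh, b, batch_first=True)
out_sf, h_sf = rnn_forward(x_seq_first, W_xh, W_hh, b, batch_first=False)
```

If this is right, the two layouts differ only by a transpose: the loop always walks along the seq axis, and the weights are shared across both the batch and the time steps.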
The documentation of the software packages I have looked at does not answer these questions.
The tutorials I have found start at an advanced level without explaining these two points.