Understanding Recurrent Neural Networks

by mourinho   Last Updated January 14, 2018 11:19 AM

I am trying to understand RNNs so that I can implement one and train it with batch SGD, but I am stuck on how it actually works.

An RNN runs over t time steps, and x_i is the input at time step i.

h is the hidden state.

So, as I understand it, an RNN is just a linear layer with two inputs, x and the hidden state h (i.e. two weight matrices):

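and if we do:

h = initial_hidden_state
for x in sequence:
    h = linear(x, h)

then this loop is the recurrence: the same layer is applied at every time step, and h is carried forward from one step to the next.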
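To make the loop concrete, here is a minimal NumPy sketch for a single sequence. It is only an illustration under stated assumptions: the names `rnn_forward`, `W_xh`, `W_hh` and `b_h` are made up for this example, the initial hidden state is assumed to be zero, and a `tanh` is applied after the linear step (a common choice, not something stated above).

```python
import numpy as np

def rnn_forward(xs, W_xh, W_hh, b_h):
    """Run a single-layer RNN over one sequence.

    xs:   (seq_len, input_size), one row per time step
    W_xh: (input_size, hidden_size) input-to-hidden weights
    W_hh: (hidden_size, hidden_size) hidden-to-hidden weights
    b_h:  (hidden_size,) bias
    """
    h = np.zeros(W_hh.shape[0])        # assumed initial hidden state h_0 = 0
    hs = []
    for x in xs:                       # iterate over the time steps themselves
        # The "linear layer with two inputs": the same weights are reused
        # at every step; tanh is an assumed nonlinearity.
        h = np.tanh(x @ W_xh + h @ W_hh + b_h)
        hs.append(h)
    return np.stack(hs), h             # every hidden state, and the last one

rng = np.random.default_rng(0)
seq_len, input_size, hidden_size = 5, 3, 4
xs = rng.normal(size=(seq_len, input_size))
W_xh = rng.normal(size=(input_size, hidden_size)) * 0.1
W_hh = rng.normal(size=(hidden_size, hidden_size)) * 0.1
b_h = np.zeros(hidden_size)

hs, h_last = rnn_forward(xs, W_xh, W_hh, b_h)
print(hs.shape, h_last.shape)  # (5, 4) (4,)
```

In this sketch the loop produces one hidden state per time step, and the final hidden state is simply the last of them.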

  1. What is the output at the end of this RNN?

  2. How do I do batching here? With batching, what does the input look like: [batch, seq, embedding_size] or [seq, batch, embedding_size]?

  3. How do I extend this to multiple RNN layers? What do multiple layers even mean for an RNN... isn't the number of RNN layers just determined by sequence_length? So how are multi-layer RNNs implemented?
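One common reading of these questions can be sketched in NumPy as follows. This is a hedged illustration, not the API of any particular package: `rnn_layer`, `stacked_rnn` and `make_layer` are hypothetical names, the hidden state is assumed to start at zero, `tanh` is an assumed nonlinearity, and "multiple layers" is taken to mean that each layer's full output sequence is fed as the input sequence of the next layer.

```python
import numpy as np

def rnn_layer(xs, W_xh, W_hh, b_h):
    """One RNN layer over a batch; xs has shape (seq_len, batch, input_size)."""
    seq_len, batch, _ = xs.shape
    h = np.zeros((batch, W_hh.shape[0]))   # one hidden state per batch element
    outputs = []
    for t in range(seq_len):
        # xs[t] is (batch, input_size): the whole batch advances one step at a time.
        h = np.tanh(xs[t] @ W_xh + h @ W_hh + b_h)
        outputs.append(h)
    return np.stack(outputs), h            # (seq_len, batch, hidden), (batch, hidden)

def stacked_rnn(xs, layers):
    """Feed each layer's output sequence to the next layer (assumed stacking scheme)."""
    finals = []
    for W_xh, W_hh, b_h in layers:
        xs, h_last = rnn_layer(xs, W_xh, W_hh, b_h)
        finals.append(h_last)
    return xs, finals

rng = np.random.default_rng(0)
batch, seq_len, emb, hidden = 2, 5, 3, 4
x_batch_first = rng.normal(size=(batch, seq_len, emb))  # [batch, seq, embedding_size]
x_time_first = x_batch_first.transpose(1, 0, 2)         # [seq, batch, embedding_size]

def make_layer(n_in, n_hidden):
    # Small random weights and a zero bias, purely for illustration.
    return (rng.normal(size=(n_in, n_hidden)) * 0.1,
            rng.normal(size=(n_hidden, n_hidden)) * 0.1,
            np.zeros(n_hidden))

layers = [make_layer(emb, hidden), make_layer(hidden, hidden)]
out, finals = stacked_rnn(x_time_first, layers)
print(out.shape, len(finals), finals[-1].shape)  # (5, 2, 4) 2 (2, 4)
```

Under these assumptions the two layouts hold the same data and differ only by a transpose. The time-first layout is convenient here because `xs[t]` directly gives the batch at step t. The number of layers is independent of the sequence length: the sequence length sets how many times each layer is applied, while the layers are stacked on top of each other at every time step.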

None of the software packages I have looked at answer these questions, and the tutorials all start from advanced material without explaining them.
