Topic: RNN (Recurrent Neural Networks) – Part 2 of 4: Types of RNNs and Architectural Variants
---
1. Vanilla RNN – Limitations
• Standard (vanilla) RNNs suffer from vanishing gradients, which leaves them with only short-term memory.
• As sequences get longer, it becomes difficult for the model to retain long-term dependencies (a toy illustration follows below).
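A toy numerical sketch of why gradients vanish (made-up numbers, not from the original post): the gradient flowing back through T time steps is scaled roughly by the recurrent weight raised to the power T, so for weights below 1 the signal from distant steps all but disappears.

import torch

w = torch.tensor(0.5)           # stand-in for a scalar recurrent weight
print(w ** 50)                  # ~8.9e-16 -> almost no gradient from 50 steps back
print(torch.tensor(1.5) ** 50)  # ~6.4e8   -> the opposite problem: exploding gradients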
---
2. Types of RNN Architectures
• One-to-One
Example: Image Classification
A single input and a single output.
• One-to-Many
Example: Image Captioning
A single input leads to a sequence of outputs (see the sketch after this list).
• Many-to-One
Example: Sentiment Analysis
A sequence of inputs gives one output (e.g., sentiment score).
• Many-to-Many
Example: Machine Translation
A sequence of inputs maps to a sequence of outputs.
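A rough sketch of the one-to-many case, in the spirit of image captioning (all sizes and layer names here are hypothetical, not from the original post): a single feature vector seeds the hidden state, and the recurrence is unrolled for a fixed number of output steps.

import torch
import torch.nn as nn

feature_dim, hidden_size, vocab_size, steps = 256, 128, 1000, 10
to_hidden = nn.Linear(feature_dim, hidden_size)   # map the single input to the initial hidden state
cell = nn.RNNCell(hidden_size, hidden_size)       # one recurrence step at a time
to_token = nn.Linear(hidden_size, vocab_size)     # hidden state -> token scores

image_feature = torch.randn(1, feature_dim)       # dummy "image" feature
h = torch.tanh(to_hidden(image_feature))
x = torch.zeros(1, hidden_size)                   # start-of-sequence input
outputs = []
for _ in range(steps):
    h = cell(x, h)
    outputs.append(to_token(h))                   # one output per generated step
    x = h                                         # feed the state back in as the next input
# len(outputs) == 10, each of shape (1, vocab_size)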
---
3. Bidirectional RNNs (BiRNNs)
• Process the input sequence in both forward and backward directions.
• Allow the model to understand context from both past and future.
nn.RNN(input_size, hidden_size, bidirectional=True)
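A minimal sketch with made-up sizes, showing how the bidirectional flag changes the shapes:

import torch
import torch.nn as nn

birnn = nn.RNN(input_size=8, hidden_size=16, bidirectional=True, batch_first=True)
x = torch.randn(4, 10, 8)     # (batch, seq_len, features)
out, h_n = birnn(x)
print(out.shape)              # (4, 10, 32): forward and backward features concatenated
print(h_n.shape)              # (2, 4, 16): one final hidden state per direction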
---
4. Deep RNNs (Stacked RNNs)
• Multiple RNN layers stacked on top of each other.
• Capture more complex temporal patterns.
nn.RNN(input_size, hidden_size, num_layers=2)
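A minimal sketch with made-up sizes; the outputs of each layer become the inputs of the next:

import torch
import torch.nn as nn

deep_rnn = nn.RNN(input_size=8, hidden_size=16, num_layers=2, batch_first=True)
x = torch.randn(4, 10, 8)     # (batch, seq_len, features)
out, h_n = deep_rnn(x)
print(out.shape)              # (4, 10, 16): hidden states of the top layer only
print(h_n.shape)              # (2, 4, 16): final hidden state of each layer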
---
5. RNN with Different Output Strategies
• Last Hidden State Only:
Use the final output for classification/regression.
• All Hidden States:
Use the outputs at all time steps; useful in sequence-to-sequence models (a small sketch follows).
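Both strategies come from the same forward pass; a small sketch with hypothetical sizes:

import torch
import torch.nn as nn

rnn = nn.RNN(input_size=8, hidden_size=16, batch_first=True)
x = torch.randn(4, 10, 8)
out, h_n = rnn(x)

last_only = out[:, -1, :]     # (4, 16): last time step only, for classification/regression
all_steps = out               # (4, 10, 16): every time step, for sequence-to-sequence models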
---
6. Example: Many-to-One RNN in PyTorch
import torch.nn as nn

class SentimentRNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(SentimentRNN, self).__init__()
        self.rnn = nn.RNN(input_size, hidden_size, num_layers=1, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        out, _ = self.rnn(x)
        final_out = out[:, -1, :]  # get the last time-step output
        return self.fc(final_out)
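A quick usage sketch with made-up dimensions (3 reviews, 20 tokens each, 50-dimensional embeddings, 2 sentiment classes):

import torch

model = SentimentRNN(input_size=50, hidden_size=64, output_size=2)
x = torch.randn(3, 20, 50)    # (batch, seq_len, input_size), since batch_first=True
logits = model(x)             # shape (3, 2)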
---
7. Summary
• RNNs can be adapted for different tasks: one-to-many, many-to-one, etc.
• Bidirectional and stacked RNNs enhance performance by capturing richer patterns.
• It's important to choose the right architecture based on the sequence problem.
---
Exercise
• Modify the RNN model to use bidirectional layers and evaluate its performance on a text classification dataset.
---
#RNN #BidirectionalRNN #DeepLearning #TimeSeries #NLP
https://yangx.top/DataScienceM