tf.keras.layers.GRU in TensorFlow

Last Updated : 09 Feb, 2025

TensorFlow provides an easy-to-use implementation of GRU through tf.keras.layers.GRU, making it ideal for sequence-based tasks such as speech recognition, machine translation, and time-series forecasting.

Gated Recurrent Unit (GRU) is a variant of LSTM that simplifies the architecture by using only two gates:

  1. Update Gate – Determines how much past information should be carried forward.
  2. Reset Gate – Decides how much of the past information should be forgotten.

Unlike LSTMs, GRUs do not have a separate cell state and hidden state, making them computationally more efficient while still retaining the ability to handle long-term dependencies.
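Conceptually, a single GRU time step can be sketched in a few lines of NumPy. The snippet below follows the classic GRU formulation and is only an illustration: the internals of tf.keras.layers.GRU differ in small details (such as the reset_after bias handling), and the weight names used here are made up for readability.

Python
import numpy as np

def sigmoid(a):
    return 1 / (1 + np.exp(-a))

def gru_step(x_t, h_prev, W_z, U_z, b_z, W_r, U_r, b_r, W_h, U_h, b_h):
    z = sigmoid(x_t @ W_z + h_prev @ U_z + b_z)              # update gate
    r = sigmoid(x_t @ W_r + h_prev @ U_r + b_r)              # reset gate
    h_cand = np.tanh(x_t @ W_h + (r * h_prev) @ U_h + b_h)   # candidate state
    return (1 - z) * h_prev + z * h_cand                     # new hidden state

rng = np.random.default_rng(0)
features, units = 5, 4
x_t, h_prev = rng.normal(size=features), np.zeros(units)
shapes = [(features, units), (units, units), units] * 3      # W, U, b for each gate
params = [rng.normal(size=s) * 0.1 for s in shapes]
print(gru_step(x_t, h_prev, *params))                        # next hidden state, shape (4,)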

Syntax of tf.keras.layers.GRU

tf.keras.layers.GRU(
    units,
    activation='tanh',
    recurrent_activation='sigmoid',
    return_sequences=False,
    return_state=False,
    dropout=0.0,
    recurrent_dropout=0.0,
    stateful=False,
    unroll=False
)

Parameters of tf.keras.layers.GRU

  • units – Number of neurons in the GRU layer.
  • activation – Activation function for the output (default: 'tanh').
  • recurrent_activation – Activation function for recurrent connections (default: 'sigmoid').
  • return_sequences – If True, returns the output for all time steps instead of just the last one.
  • return_state – If True, returns the final hidden state along with the output.
  • dropout – Dropout rate applied to input connections.
  • recurrent_dropout – Dropout rate applied to recurrent connections.
  • stateful – If True, maintains state across batches for stateful GRUs.
  • unroll – If True, the network is unrolled, which can speed up computation on short sequences at the cost of higher memory use.
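As a quick illustration of how these parameters fit together, the sketch below builds a standalone GRU layer and applies it to a random batch. The shapes (4 sequences, 10 time steps, 8 features) and the parameter values are arbitrary choices, not part of the model built later in this article.

Python
import tensorflow as tf

gru = tf.keras.layers.GRU(
    32,                        # units: 32 neurons in the layer
    activation='tanh',
    recurrent_activation='sigmoid',
    return_sequences=True,     # emit an output vector at every time step
    dropout=0.2,               # dropout on the input connections
    recurrent_dropout=0.1      # dropout on the recurrent connections
)

x = tf.random.normal([4, 10, 8])   # random batch of sequences
print(gru(x).shape)                # (4, 10, 32) because return_sequences=True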

How to Use tf.keras.layers.GRU in TensorFlow?

1. Import Required Libraries

Python
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import GRU, Dense
import numpy as np

2. Create Dummy Sequential Data

Python
# Generating dummy sequential data
X = np.random.random((100, 10, 5))       # 100 samples, 10 time steps, 5 features per step
y = np.random.randint(2, size=(100, 1))  # binary label for each sample

3. Build a GRU Model

Python
model = Sequential([
    GRU(64, activation='tanh', return_sequences=True, input_shape=(10, 5)),  # First GRU layer; return_sequences=True so the next GRU receives the full sequence
    GRU(32, activation='tanh'),  # Second GRU layer; returns only the last time step's output
    Dense(1, activation='sigmoid')  # Output layer for binary classification
])

# Compile the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.summary()

Output:

[Model summary listing the two GRU layers, the Dense output layer, and their parameter counts]

4. Train the Model

Python
model.fit(X, y, epochs=10, batch_size=16)

Output:

Epoch 1/10
7/7 ━━━━━━━━━━━━━━━━━━━━ 6s 15ms/step - accuracy: 0.5487 - loss: 0.6960
.
.
.
Epoch 9/10
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 16ms/step - accuracy: 0.6135 - loss: 0.6825
Epoch 10/10
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 16ms/step - accuracy: 0.6224 - loss: 0.6848
<keras.src.callbacks.history.History at 0x7968ee5983d0>
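Once trained, the model can be used for inference. The sketch below continues the dummy example: it generates three new random sequences with the same shape as the training data and thresholds the sigmoid outputs at 0.5 (new_X and the 0.5 threshold are illustrative choices).

Python
new_X = np.random.random((3, 10, 5))      # 3 new sequences, 10 time steps, 5 features
probs = model.predict(new_X)              # sigmoid outputs in [0, 1], shape (3, 1)
preds = (probs > 0.5).astype(int)         # threshold to get binary class labels
print(probs.ravel(), preds.ravel())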

Understanding return_sequences and return_state

  • return_sequences=True → Returns output for each time step instead of just the last output.
  • return_state=True → Returns the final hidden state along with the output.

Example: Extracting Hidden States

Python
gru_layer = GRU(50, return_sequences=True, return_state=True)
output, hidden_state = gru_layer(tf.random.normal([5, 10, 8]))  # (batch_size=5, time_steps=10, features=8)
print(output.shape, hidden_state.shape)

Output:

(5, 10, 50) (5, 50)

This means:

  • The output contains a 50-dimensional vector for each of the 10 time steps of each of the 5 sequences in the batch.
  • The hidden state contains the final 50-dimensional hidden vector for each of the 5 sequences.
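For comparison, a GRU layer left at its defaults (return_sequences=False, return_state=False) returns only the output of the last time step. The snippet below is a small sketch reusing the imports from earlier.

Python
gru_last = GRU(50)                                    # defaults: no sequences, no state
last_output = gru_last(tf.random.normal([5, 10, 8]))
print(last_output.shape)                              # (5, 50): one 50-unit vector per sequence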

TensorFlow’s tf.keras.layers.GRU is a powerful alternative to LSTMs, offering faster training and fewer parameters while still effectively handling long-term dependencies. GRUs are widely used in NLP, finance, and speech processing tasks.

