LSTM (Long Short-Term Memory)
What is LSTM?
Long Short-Term Memory is an improved version of the recurrent neural network, designed by
Hochreiter & Schmidhuber. LSTM is well-suited for sequence prediction tasks and excels at capturing
long-term dependencies.
Its applications extend to tasks involving time series and sequences.
LSTM's strength lies in its ability to grasp the order dependence crucial for solving intricate
problems, such as machine translation and speech recognition.
Uses of LSTM?
Language translation
Speech recognition
Time series forecasting
LSTMs can also be used in combination with other neural network architectures, such as
Convolutional Neural Networks (CNNs) for image and video analysis.
The memory cell is controlled by three gates:
The input gate controls what information is added to the memory cell.
The forget gate controls what information is removed from the memory cell.
The output gate controls what information is output from the memory cell.
This allows LSTM networks to selectively retain or discard information as it flows through the
network, which allows them to learn long-term dependencies.
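The "selectively retain or discard" behaviour comes from multiplying the memory cell by a gate vector of values between 0 and 1. A minimal NumPy sketch (the cell state and gate pre-activations here are made-up illustrative numbers, not taken from the slides):

```python
import numpy as np

def sigmoid(z):
    # Squashes any real number into (0, 1), so it can act as a soft switch.
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical cell state carrying four remembered values.
cell_state = np.array([0.9, -0.5, 0.3, 0.7])

# A gate activation near 1 keeps a value; near 0 erases it.
forget_gate = sigmoid(np.array([6.0, -6.0, 6.0, -6.0]))  # roughly [1, 0, 1, 0]

# Element-wise product: the gate decides, per value, what survives.
retained = forget_gate * cell_state
print(np.round(retained, 2))  # positions gated toward 0 are discarded
```

The same mechanism, with different learned weights, implements all three gates.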
Architecture and working of LSTM?
The LSTM architecture has a chain structure that contains four neural networks and different
memory blocks called cells.
Forget gate
The equation for the forget gate is:
f_t = σ(W_f · [h_t-1, x_t] + b_f)
• W_f represents the weight matrix of the forget gate.
• [h_t-1, x_t] is the concatenation of the current input and the previous hidden state.
• b_f is the bias of the forget gate.
• σ is the sigmoid activation function.

Input gate
The equations for the input gate are:
i_t = σ(W_i · [h_t-1, x_t] + b_i)
Ĉ_t = tanh(W_C · [h_t-1, x_t] + b_C)
The cell state is then updated as:
C_t = f_t ∗ C_t-1 + i_t ∗ Ĉ_t
We multiply the previous cell state by f_t, disregarding the information we had previously chosen
to ignore. Next, we add i_t ∗ Ĉ_t. This represents the new candidate values, scaled by how much we
decided to update each state value.

Output gate
The equations for the output gate are:
o_t = σ(W_o · [h_t-1, x_t] + b_o)
h_t = o_t ∗ tanh(C_t)
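Taken together, the gate equations describe one time step of the network. A minimal NumPy sketch of a single step (the layer sizes and random weights are illustrative assumptions, not values from the slides):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step following the gate equations.

    W holds the weight matrices W_f, W_i, W_C, W_o, each acting on the
    concatenated vector [h_t-1, x_t]; b holds the matching biases.
    """
    z = np.concatenate([h_prev, x_t])       # [h_t-1, x_t]
    f_t = sigmoid(W["f"] @ z + b["f"])      # forget gate
    i_t = sigmoid(W["i"] @ z + b["i"])      # input gate
    c_hat = np.tanh(W["C"] @ z + b["C"])    # candidate values
    c_t = f_t * c_prev + i_t * c_hat        # updated cell state
    o_t = sigmoid(W["o"] @ z + b["o"])      # output gate
    h_t = o_t * np.tanh(c_t)                # new hidden state
    return h_t, c_t

# Toy dimensions: 3 input features, 2 hidden units, random weights.
rng = np.random.default_rng(0)
n_in, n_h = 3, 2
W = {k: rng.normal(size=(n_h, n_h + n_in)) for k in "fiCo"}
b = {k: np.zeros(n_h) for k in "fiCo"}

h, c = np.zeros(n_h), np.zeros(n_h)
h, c = lstm_step(rng.normal(size=n_in), h, c, W, b)
print(h.shape, c.shape)  # (2,) (2,)
```

In practice this step is applied repeatedly along the sequence, with h and c carried from one time step to the next.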
Advantages and disadvantages of LSTM?
Advantages of LSTM:
They have a memory cell that is capable of long-term information storage.
Can be trained to process sequential data in both forward and backward directions.
Avoids the vanishing gradient problem.
Disadvantages of LSTM:
More difficult to train than RNNs due to the complexity of the gates and memory unit.
It is hard to parallelize the processing of sequences.
Training LSTM networks can be more time-consuming compared to simpler models due to their
computational complexity.
LSTMs often require more data and longer training times to achieve high performance.
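Part of that training cost is visible in parameter counts: an LSTM layer learns four weight blocks (forget, input, candidate, output) where a vanilla RNN learns one. A quick back-of-the-envelope calculation (the layer sizes are illustrative, not from the slides):

```python
def rnn_params(n_in, n_h):
    # One weight matrix over the concatenated [h_t-1, x_t] plus one bias vector.
    return n_h * (n_h + n_in) + n_h

def lstm_params(n_in, n_h):
    # Four such blocks: forget gate, input gate, candidate values, output gate.
    return 4 * rnn_params(n_in, n_h)

n_in, n_h = 128, 256
print(rnn_params(n_in, n_h))   # 98560
print(lstm_params(n_in, n_h))  # 394240
```

Roughly four times the parameters per layer means more memory, more computation per step, and typically more data needed to fit well.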
Applications of LSTM?
Robot control
Time series prediction
Speech recognition
Music composition
Grammar learning
Handwriting recognition
Human action recognition
Sign language translation
Video analysis
LSTM vs RNN
Feature | LSTM (Long Short-Term Memory) | RNN (Recurrent Neural Network)
Memory | Has a special memory unit that allows it to learn long-term dependencies in sequential data | Does not have a memory unit
Directionality | Can be trained to process sequential data in both forward and backward directions | Can only be trained to process sequential data in one direction
Training | More difficult to train than RNN due to the complexity of the gates and memory unit | Easier to train than LSTM
Long-term dependency learning | Yes | Limited
Ability to learn sequential data | Yes | Yes
Applications | Machine translation, speech recognition, text summarization, natural language processing, time series forecasting | Natural language processing, machine translation, speech recognition, image processing, video processing