CS445 - Neural Networks and Deep Learning - Lecture Notes
I. Understanding Backpropagation
Today's lecture focused on the mathematics behind neural network training. The
backpropagation algorithm is fundamental to how neural networks learn from data.
Key Concepts:
1. Forward Propagation
- Each layer computes a weighted sum of its inputs plus a bias, z = w * x + b, and passes it through an activation function, a = f(z).
2. Backward Propagation
- The gradient of the loss with respect to each weight is computed with the chain rule:
  ∂L/∂w = (∂L/∂a) · (∂a/∂z) · (∂z/∂w)
Where:
- L is the loss
- a is the activation
- z is the weighted sum
- w is the weight
Gradients are computed layer by layer, from the output back toward the input (a worked example follows):
for layer in reversed(layers):
    layer.gradients = compute_gradients(layer)
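To make the chain rule concrete, here is a minimal NumPy sketch of one forward and backward pass for a single weight, assuming a sigmoid activation and a squared-error loss; the specific numbers and the learning rate are illustrative, not values from lecture.

import numpy as np

# One forward and backward pass for a single weight (all scalars for clarity).
x, w, b, y = 0.5, 0.8, 0.1, 1.0

# Forward propagation
z = w * x + b                    # weighted sum
a = 1.0 / (1.0 + np.exp(-z))     # activation (sigmoid)
L = 0.5 * (a - y) ** 2           # squared-error loss

# Backward propagation: dL/dw = dL/da * da/dz * dz/dw
dL_da = a - y                    # derivative of the loss w.r.t. the activation
da_dz = a * (1.0 - a)            # derivative of the sigmoid
dz_dw = x                        # derivative of the weighted sum w.r.t. the weight
dL_dw = dL_da * da_dz * dz_dw

# One gradient-descent step on the weight
learning_rate = 0.1
w = w - learning_rate * dL_dw
print(L, dL_dw, w)

In a real network the same three-factor product is applied at every layer, which is exactly what the per-layer loop above expresses.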
II. Activation Functions
1. Sigmoid
- f(x) = 1 / (1 + e^(-x))
- Range: (0, 1)
- Saturates for large |x|, which causes vanishing gradients
2. ReLU
- f(x) = max(0, x)
- Derivative: 1 if x > 0, 0 otherwise
- Most commonly used today
3. Tanh
- Range: [-1, 1]
- Zero-centered output, so it often works better than sigmoid
- Still has vanishing gradient issues
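For reference, a small NumPy sketch of these three activations and their derivatives, assuming the standard definitions above; the helper names such as sigmoid_prime are just illustrative.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_prime(x):
    s = sigmoid(x)
    return s * (1.0 - s)

def relu(x):
    return np.maximum(0.0, x)

def relu_prime(x):
    return np.where(x > 0, 1.0, 0.0)   # 1 where x > 0, 0 otherwise

def tanh_prime(x):
    return 1.0 - np.tanh(x) ** 2       # tanh itself is np.tanh(x)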
III. Common Training Problems
1. Vanishing Gradients
Solutions:
- ReLU-family activations
- Careful weight initialization (e.g., Xavier/He)
- Residual (skip) connections
2. Exploding Gradients
Solutions:
- Gradient clipping (see the sketch after this list)
- Batch normalization
- L2 regularization
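As a sketch of the first solution, gradient clipping by global norm can be written as follows; the function name and the threshold of 5.0 are assumptions for illustration, not values given in lecture.

import numpy as np

# If the combined norm of all gradients exceeds max_norm, rescale every
# gradient so the total norm equals max_norm.
def clip_by_global_norm(grads, max_norm=5.0):
    total_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    if total_norm > max_norm:
        scale = max_norm / total_norm
        grads = [g * scale for g in grads]
    return grads

# Example: two gradient arrays with global norm 13 get rescaled to norm 5.
grads = [np.array([3.0, 4.0]), np.array([12.0, 0.0])]
clipped = clip_by_global_norm(grads)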
Homework Assignment
Due next Tuesday:
Additional reading: "Deep Learning" by Goodfellow, Bengio, and Courville - Chapter 6.5
Note: Office hours this week are Wednesday 2-4pm and Thursday 3-5pm in Room 405.