02 - Introduction to Convolutional Neural Networks (CNNs)

Lecture 1
Introduction to Convolutional Neural Networks (CNNs)
What is a CNN?
• CNN stands for Convolutional Neural Network
• A special type of neural network designed for image processing and pattern recognition
• Used in applications like face recognition, medical imaging, and self-driving cars
Why Use CNNs?

• Traditional machine learning needs manual feature extraction

• CNNs automatically learn features from images

• Typically achieves better accuracy than traditional methods

• Can handle large-scale datasets


CNN Architecture Overview

• Convolutional Layer – Detects features like edges, colors, and textures

• Pooling Layer – Reduces the size of feature maps

• Fully Connected Layer – Makes final predictions

• Activation Functions – Add non-linearity


Convolutional Layer
• The main building block of a CNN
• Uses filters (kernels) to extract features
• Example:
• Original image (3×3)
• Kernel (3×3)
• Output (feature map)
Convolutional Layer
• Output size formula: O = (I − K + 2P) / S + 1
• O = Output size (height/width of the feature map)
• I = Input size (height/width of the original image)
• K = Filter (kernel) size
• P = Padding (number of pixels added around the input)
• S = Stride (step size of the filter movement)
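
The formula can be checked in a few lines of code. A minimal sketch (the output_size helper is illustrative, not a library function):

    def output_size(I, K, P=0, S=1):
        # O = (I - K + 2P) / S + 1
        return (I - K + 2 * P) // S + 1

    print(output_size(5, 3, P=0, S=1))  # 3 (valid padding)
    print(output_size(5, 3, P=1, S=1))  # 5 (same padding)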
How Convolution Works

• Step 1: Take a small filter (kernel)

• Step 2: Slide it over the image

• Step 3: Multiply values & sum them up

• Step 4: Store the result in the feature map
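
These four steps translate directly into code. A minimal NumPy sketch for a square grayscale image and a square kernel (like most deep-learning frameworks, it computes cross-correlation, i.e. the kernel is not flipped):

    import numpy as np

    def conv2d(image, kernel, stride=1, padding=0):
        # Slide the kernel over the image; multiply and sum at each position.
        if padding > 0:
            image = np.pad(image, padding)              # zero-pad the border
        K = kernel.shape[0]
        out = (image.shape[0] - K) // stride + 1        # O = (I - K + 2P)/S + 1
        fmap = np.zeros((out, out))
        for i in range(out):                            # Step 2: slide the filter
            for j in range(out):
                region = image[i*stride:i*stride+K, j*stride:j*stride+K]
                fmap[i, j] = np.sum(region * kernel)    # Steps 3-4: multiply, sum, store
        return fmap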


Padding in CNNs

• Sometimes, the filter shrinks the image

• Padding helps maintain image size

• Types of Padding:
• Valid Padding – No padding (smaller output)
• Same Padding – Keeps the same size
Padding in CNNs
• Valid Padding – No padding (smaller output)
• No extra pixels are added around the input.
• The filter only moves within the original image.
• The output feature map is smaller than the input.
• Example:
• Input: 5×5
• Filter: 3×3
• Stride: 1
• Padding: 0
• 🔹 Output: 3×3 (smaller than input).
Same Padding
• Same Padding – Keeps the same size
• Adds zeros around the input so that the output size = input size (if S=1).
• Ensures the filter can fully process edge pixels.
• Used when we want to maintain spatial dimensions.
• Effect: Preserves spatial dimensions → No loss of edge features.
• 💡 Example:
• Input: 5×5
• Filter: 3×3
• Stride: 1
• Padding: 1
• 🔹 Output: 5×5 (same as input).
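
Both padding examples can be reproduced with the conv2d sketch from earlier:

    img = np.arange(25, dtype=float).reshape(5, 5)    # 5x5 input
    k = np.ones((3, 3))                               # 3x3 filter
    print(conv2d(img, k, stride=1, padding=0).shape)  # (3, 3) -> valid padding
    print(conv2d(img, k, stride=1, padding=1).shape)  # (5, 5) -> same padding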


Pooling Layer
• Reduces image size to make computation faster
• Types:
• Max Pooling – Takes the highest value in a region
• Average Pooling – Takes the average of the region
Max Pooling

• Takes the maximum value from each region of the feature map.

• Helps to retain the most significant features.

• Enhances edges and sharp details.

• More commonly used in deep learning models.


Average Pooling
• Takes the average value from each region of the feature map.
• Helps to smooth out the feature map.
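
Both pooling types follow the same sliding-window pattern as convolution, but apply max or mean instead of multiply-and-sum. A minimal sketch (the pool2d helper is illustrative; assumes a square feature map):

    def pool2d(fmap, size=2, stride=2, mode="max"):
        # Max or average pooling over each region of the feature map.
        out = (fmap.shape[0] - size) // stride + 1
        op = np.max if mode == "max" else np.mean
        pooled = np.zeros((out, out))
        for i in range(out):
            for j in range(out):
                region = fmap[i*stride:i*stride+size, j*stride:j*stride+size]
                pooled[i, j] = op(region)
        return pooled

    fmap = np.array([[1., 3., 2., 4.],
                     [5., 6., 7., 8.],
                     [3., 2., 1., 0.],
                     [1., 2., 3., 4.]])
    print(pool2d(fmap, mode="max"))  # [[6. 8.], [3. 4.]] – keeps the strongest responses
    print(pool2d(fmap, mode="avg"))  # [[3.75 5.25], [2. 2.]] – smooths the map
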
Fully Connected Layer (FC Layer)

• After convolution and pooling, the final features are flattened

• The flattened output is connected to a fully connected layer

• Used for final classification
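
A sketch of the flatten-and-classify step (the shapes and the 10-class setup are made up for illustration):

    rng = np.random.default_rng(0)
    features = rng.random((4, 4, 8))                 # pooled feature maps (H, W, channels)
    x = features.reshape(-1)                         # flatten: 4*4*8 = 128 values
    W = rng.random((10, x.size)); b = np.zeros(10)   # fully connected layer for 10 classes
    logits = W @ x + b                               # one score per class
    print(logits.shape)                              # (10,)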


Activation Functions

• Adds non-linearity to the model

• Common Types:
• ReLU (Rectified Linear Unit): Most common, sets negatives to zero
• Sigmoid: Used for binary classification
• Softmax: Used for multi-class classification
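
The three activation functions in NumPy, as a quick reference sketch:

    def relu(z):
        return np.maximum(0, z)           # negatives become zero

    def sigmoid(z):
        return 1 / (1 + np.exp(-z))       # squashes values into (0, 1)

    def softmax(z):
        e = np.exp(z - np.max(z))         # subtract max for numerical stability
        return e / e.sum()                # probabilities that sum to 1

    z = np.array([-1.0, 0.0, 2.0])
    print(relu(z))     # [0. 0. 2.]
    print(softmax(z))  # approx [0.04 0.11 0.84]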
Lecture 2
"ResNet: Deep Residual Learning for Image Recognition"
Introduction to Deep Learning
1. Briefly explain what deep learning is

2. Mention CNNs (Convolutional Neural Networks)

3. Transition: "As networks get deeper, new challenges arise..."


The Problem with Deep Networks
• Vanishing/exploding gradients
• Degradation problem: accuracy gets worse with deeper networks
• Image/graph showing performance drop beyond certain layers
Deep Convolutional Neural Network
• A deep convolutional neural network is a network that consists of many hidden layers. For example:
• AlexNet: consists of 8 layers, where the first 5 are convolutional layers and the last 3 are fully connected layers
• VGGNet (VGG-16): consists of 16 weight layers (13 convolutional and 3 fully connected)
• These architectures are designed to process complex data, such as images, by leveraging the power of depth and layered feature extraction.
Deep Convolutional Neural Network
The problem with these deep neural networks is that as we increase the number of layers, we start seeing a degradation problem. To put it another way: as we increase the depth of the network, the accuracy gets saturated and then degrades rapidly.
In a deep neural network, as we perform back-propagation, the repeated multiplication of gradients through many layers makes the gradient very small, which results in degradation.
This problem is often called the vanishing/exploding gradient problem.
What is ResNet?
• Introduced by Microsoft Research in 2015
• Won ImageNet 2015
• Concept: Residual Learning
• Allows training of networks with 100+ layers
Residual Block Explained
• ResNet solves this degradation problem by using skip connections, which skip one or more layers.
• A skip connection means:
• Consider an input x. This input is passed through a stack of neural network layers to produce F(x), and this F(x) is then added to the original input x.
• So the output will be:
• Equation: F(x) + x (implemented in the sketch below)
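
A minimal PyTorch sketch of a basic residual block, assuming the input and output shapes match so the skip path needs no projection (the stride-2 downsampling blocks in the real ResNet add a 1×1 convolution on the shortcut):

    import torch
    import torch.nn as nn

    class ResidualBlock(nn.Module):
        # Basic residual block: output = ReLU(F(x) + x)
        def __init__(self, channels):
            super().__init__()
            self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
            self.bn1 = nn.BatchNorm2d(channels)
            self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
            self.bn2 = nn.BatchNorm2d(channels)
            self.relu = nn.ReLU(inplace=True)

        def forward(self, x):
            out = self.relu(self.bn1(self.conv1(x)))   # first half of F(x)
            out = self.bn2(self.conv2(out))            # second half of F(x)
            return self.relu(out + x)                  # skip connection: F(x) + x

    block = ResidualBlock(64)
    y = block(torch.randn(1, 64, 32, 32))   # shape preserved: (1, 64, 32, 32)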
ResNet Architecture
ResNet Architecture Variants
• ResNet-18, ResNet-34, ResNet-50, ResNet-101, ResNet-152
• Difference in depth and use of bottleneck blocks
• Table comparing number of layers and parameters
ResNet Architecture Table
Applications of ResNet

• Image classification

• Object detection (used in Faster R-CNN, Mask R-CNN)

• Medical imaging, facial recognition, etc.


Why ResNet Works

• Solves vanishing gradient with skip connections

• Enables extremely deep networks to converge

• Simpler training, better generalization
