Notes DL-1

The document covers key concepts of neural networks, including their structure, activation functions, and types such as perceptrons and multi-layer networks. It discusses the training process, learning algorithms like gradient descent and backpropagation, and applications in pattern recognition and speech recognition. Additionally, it addresses challenges in neural network development, the relationship between AI, ML, and DL, and model evaluation metrics such as accuracy, precision, recall, and F1 score.


Deep Learning

Lectures
Lecture # 3
Neural Networks
Neural networks are computational models that mimic the behavior of the human brain, capable of learning from data and making decisions.
• Neurons are the basic processing units of neural networks, interconnected to form layers that process information.

Activation Function

Definition: Activation functions are mathematical functions applied to the weighted sum of inputs at each neuron to determine its output.

• Purpose: Activation functions introduce non-linearity into the network, enabling it to learn complex patterns and relationships in the data.

• Types of Activation Functions:

1. Sigmoid Function: S-shaped curve that squashes input values between 0 and 1. Commonly used in the output layer for binary classification tasks.

2. ReLU (Rectified Linear Unit): Piecewise linear function that outputs the input if it is positive, and zero otherwise. Offers faster convergence and alleviates the vanishing gradient problem.

3. Tanh Function: Similar to the sigmoid function, but output values range from -1 to 1. Used in hidden layers to introduce non-linearity.
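As a quick illustration (a minimal NumPy sketch; the function names are my own, not from the lecture), the three activations can be written and evaluated like this:

```python
import numpy as np

def sigmoid(x):
    # S-shaped curve: squashes inputs into (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    # Outputs the input if positive, zero otherwise.
    return np.maximum(0.0, x)

def tanh(x):
    # Like sigmoid, but outputs range over (-1, 1).
    return np.tanh(x)

z = np.array([-2.0, 0.0, 2.0])
print(sigmoid(z))  # [0.119 0.5   0.881]
print(relu(z))     # [0. 0. 2.]
print(tanh(z))     # [-0.964  0.     0.964]
```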
Perceptrons

Definition: Perceptrons are single-layer neural networks that can make binary decisions based on a
linear combination of input features.

• Structure: A perceptron consists of:
• Input layer: Receives input features.
• Weights: Each input feature is associated with a weight that determines its importance.
• Activation function: Decides whether the perceptron should fire (output a 1) or not (output a 0) based on the weighted sum of inputs.

Perceptrons can only solve linearly separable problems where a single straight line can be drawn to
separate the classes.
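As a sketch (illustrative, not from the lecture), the decision rule fits in a few lines of Python; the weights here are chosen by hand so the perceptron computes logical AND, a classic linearly separable problem:

```python
import numpy as np

def perceptron(x, w, b):
    # Fire (output 1) if the weighted sum of inputs exceeds zero, else output 0.
    return 1 if np.dot(w, x) + b > 0 else 0

# Hand-picked weights and bias implementing logical AND.
w, b = np.array([1.0, 1.0]), -1.5
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, perceptron(np.array(x), w, b))  # only (1, 1) fires
```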

Limitations of a single perceptron

Comparison with Complex Architectures:
• Perceptrons have limited capabilities compared to more complex neural network architectures such as multi-layer perceptrons or convolutional neural networks.
• Complex architectures can learn non-linear relationships in data and solve more complex classification and regression tasks.

Single layer networks

Definition: Single-layer networks consist of one layer of neurons directly connected to the input data.
• Capabilities: Single-layer networks are suitable for simple classification tasks where the classes are linearly separable.

• Limitations:
1. Inability to Represent Non-linear Relationships: Single-layer networks cannot learn non-linear relationships in data, limiting their applicability to linearly separable problems.
2. Lack of Hidden Layers: Without hidden layers, single-layer networks cannot capture complex patterns or hierarchies in the data.

• Applications: Single-layer networks are used in basic classification tasks, such as binary classification problems with linear decision boundaries.

Multi-layer feedforward networks

Definition: Multi-layer feedforward networks consist of multiple layers of neurons, including input,
hidden, and output layers.

• Architecture: Hidden layers enable the network to learn complex patterns and relationships in the data
by introducing non-linearity through activation functions.

• Key Components:
1. Input Layer: Receives input data features.
2. Hidden Layers: Intermediate layers between the input and output layers. Each hidden layer learns progressively more abstract representations of the input data.
3. Output Layer: Produces the final output based on the learned representations.

• Advantages:
• Capturing Complex Patterns: Multi-layer networks can capture non-linear relationships in data and solve complex classification and regression tasks.
• Hierarchical Representation: Hidden layers enable the network to learn hierarchical representations of the input data, leading to better generalization.

FeedForward Process

• Forward Pass: During the feedforward process, input data is propagated forward through the network, layer by layer.
• Weighted Sum: At each neuron, the inputs are multiplied by corresponding weights, and the weighted sum is computed.
• Activation: The weighted sum is passed through an activation function to introduce non-linearity and produce the output of the neuron.
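A minimal NumPy sketch of one forward pass, assuming illustrative layer sizes (3 inputs, 4 hidden units, 1 output) and randomly initialized weights:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)  # input -> hidden
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)  # hidden -> output

x = np.array([0.5, -1.2, 0.3])   # one input example

h = np.tanh(W1 @ x + b1)         # weighted sum, then non-linear activation
y = sigmoid(W2 @ h + b2)         # output layer: probability-like value in (0, 1)
print(y)
```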

Learning algorithms
Gradient Descent:
• Definition: Optimization algorithm that minimizes the loss function by iteratively adjusting the network parameters in the direction of steepest descent.
• Behavior: Computes the gradient of the loss function with respect to the network parameters and updates the parameters accordingly.
• Suitability: Widely used for training neural networks due to its simplicity and effectiveness.
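A minimal sketch of the update rule w ← w − η · dL/dw on a toy quadratic loss (the learning rate 0.1 is an illustrative choice):

```python
# Gradient descent on L(w) = (w - 3)^2, whose gradient is dL/dw = 2(w - 3).
w, eta = 0.0, 0.1
for step in range(50):
    grad = 2 * (w - 3)   # gradient of the loss with respect to w
    w -= eta * grad      # step in the direction of steepest descent
print(w)                 # approaches the minimum at w = 3
```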

Backpropagation:
• Definition: Algorithm for efficiently computing gradients of the loss function with respect to each parameter in the network.
• Behavior: Propagates the error backwards through the network, allowing for efficient computation of gradients using the chain rule.
• Suitability: Essential for training multi-layer neural networks by efficiently propagating errors and updating weights.

• Key Points:
• Gradient descent and backpropagation are fundamental algorithms for training neural networks.
• Gradient descent optimizes the network parameters to minimize the loss function, while backpropagation efficiently computes gradients for parameter updates.
• These algorithms enable neural networks to learn from data and improve their performance over time.
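As a compact illustration (my own NumPy sketch, not the lecture's), backpropagation on a one-hidden-layer sigmoid network with squared-error loss is just the chain rule applied layer by layer, followed by gradient-descent weight updates:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(1)
W1, W2 = rng.normal(size=(4, 2)), rng.normal(size=(1, 4))
x, t = np.array([1.0, 0.0]), np.array([1.0])  # one training example and target
eta = 0.5

for _ in range(100):
    # Forward pass.
    h = sigmoid(W1 @ x)
    y = sigmoid(W2 @ h)
    # Backward pass: propagate the error using the chain rule.
    d_y = (y - t) * y * (1 - y)        # error signal at the output layer
    d_h = (W2.T @ d_y) * h * (1 - h)   # error propagated back to the hidden layer
    # Gradient-descent updates.
    W2 -= eta * np.outer(d_y, h)
    W1 -= eta * np.outer(d_h, x)

print(y)  # moves toward the target 1.0 as training proceeds
```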

Training process

Training Set:
• Dataset used to train the neural network by providing input-output pairs for learning.

Epoch:
• One complete pass through the entire training set.
• Multiple epochs may be required for the network to converge to an optimal solution.

Batch Training:
• Training the network using mini-batches of data samples rather than the entire dataset.
• Enables more efficient computation of gradients and parameter updates.

• Key Points:
• The training process involves iteratively adjusting the network parameters using gradient-based optimization algorithms such as backpropagation.
• Multiple epochs of training are typically required to optimize the network parameters and minimize prediction error.
• Batch training improves the efficiency of gradient computation and parameter updates by using mini-batches of data samples.
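A skeleton of the epoch/mini-batch loop structure (`update_step` is a hypothetical placeholder for one gradient computation and weight update):

```python
import numpy as np

def train(X, y, epochs, batch_size, update_step):
    n = len(X)
    for epoch in range(epochs):            # one epoch = one full pass over the data
        order = np.random.permutation(n)   # reshuffle the training set each epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            update_step(X[idx], y[idx])    # gradients computed from one mini-batch
```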

Applications of neural networks

Pattern Recognition:
• Handwritten digit recognition, facial recognition, object detection, and image classification are common applications of neural networks in pattern recognition tasks.

Speech Recognition:
• Transcription of spoken language into text, voice-controlled assistants, and speaker identification utilize neural networks for accurate speech recognition.

Autonomous Vehicles:
• Neural networks play a crucial role in autonomous vehicles for tasks such as object detection, lane detection, and decision-making, enabling safer and more efficient transportation systems.

Challenges in neural network development

Data Limitations:
• Insufficient or low-quality data can hinder the performance and generalization ability of neural networks, leading to suboptimal results.

Overfitting:
• Neural networks may memorize noise in the training data instead of learning meaningful patterns, resulting in poor performance on unseen data.

Computational Resources:
• Training large-scale neural networks requires significant computational resources, including high-performance hardware and efficient algorithms.
Week # 4
Relationship of AI, ML and DL
● Artificial Intelligence (AI) covers any man-made intelligence exhibited by machines.
● Machine Learning (ML) is an approach to achieve AI.
● Deep Learning (DL) is one technique to implement ML.

Types of ML Algorithms

● Supervised Learning ○ trained with labeled data; includes regression and classification problems.

● Unsupervised Learning ○ trained with unlabeled data; includes clustering and association rule learning problems.

● Reinforcement Learning ○ no training data; modeled as a stochastic Markov decision process; used in robotics and self-driving cars.

Why Deep Learning?

● Limitations of traditional machine learning algorithms ○ not good at handling high-dimensional data. ○ difficult to do feature extraction and object recognition.

● Advantages of deep learning ○ DL is computationally expensive, but it is capable of handling high-dimensional data. ○ feature extraction is done automatically.

Convolutional Neural Networks

A convolutional neural network (CNN, or ConvNet) is a class of deep, feed-forward artificial neural
networks that explicitly assumes that the inputs are images, which allows us to encode certain
properties into the architecture.
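A minimal PyTorch sketch of such a network (the layer sizes are illustrative assumptions for 28x28 grayscale images, not from the lecture):

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),  # convolution exploits the 2D image structure
    nn.ReLU(),
    nn.MaxPool2d(2),                            # downsample 28x28 -> 14x14
    nn.Flatten(),
    nn.Linear(8 * 14 * 14, 10),                 # 10-class output
)

x = torch.randn(1, 1, 28, 28)  # a batch containing one image
print(model(x).shape)          # torch.Size([1, 10])
```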
What is Machine Learning?

Machine Learning is the study of algorithms that
• improve their performance P
• at some task T
• with experience E.

ML is used when:
• Human expertise does not exist (navigating on Mars)
• Humans can't explain their expertise (speech recognition)
• Models must be customized (personalized medicine)
• Models are based on huge amounts of data (genomics)

Types of Learning

• Supervised (inductive) learning – Given: training data + desired outputs (labels)

• Unsupervised learning – Given: training data (without desired outputs)

• Semi-supervised learning – Given: training data + a few desired outputs

• Reinforcement learning – Rewards from sequence of actions

Training vs. Test Distribution

• We generally assume that the training and test examples are independently drawn from the same overall distribution of data. We call this "i.i.d.", which stands for "independent and identically distributed".
• If examples are not independent, collective classification is required.
• If the test distribution is different, transfer learning is required.

ML in a Nutshell

• Tens of thousands of machine learning algorithms exist, with hundreds of new ones every year.

• Every ML algorithm has three components:
– Representation
– Optimization
– Evaluation

Various Function Representations

• Numerical functions – Linear regression – Neural networks – Support vector machines
• Symbolic functions – Decision trees – Rules in propositional logic – Rules in first-order predicate logic
• Instance-based functions – Nearest-neighbor – Case-based
• Probabilistic Graphical Models – Naïve Bayes – Bayesian networks – Hidden Markov Models (HMMs) – Probabilistic Context-Free Grammars (PCFGs) – Markov networks

Various Search/Optimization Algorithms

• Gradient descent – Perceptron – Backpropagation
• Dynamic Programming – HMM Learning – PCFG Learning
• Divide and Conquer – Decision tree induction – Rule learning
• Evolutionary Computation – Genetic Algorithms (GAs) – Genetic Programming (GP) – Neuro-evolution
Model Evaluation Metrics: Accuracy, Precision, Recall, F1 Score
Introduction to Model Evaluation

1 Measuring Model Performance: Evaluating model performance is essential for understanding its
strengths, weaknesses, and overall effectiveness.

2 Comparing to Ground Truth: Model predictions are compared to known, correct labels or values to
determine accuracy.

3 Identifying Improvement Areas: Careful analysis of evaluation metrics can reveal opportunities to refine and optimize the model.

Accuracy: Definition and Calculation

Definition: Accuracy is the proportion of correct predictions made by the model out of all predictions.

Calculation: Accuracy = (True Positives + True Negatives) / (True Positives + True Negatives + False
Positives + False Negatives)

Interpretation: Accuracy provides a general overview of model performance but doesn't reveal the full
picture.

Limitations: Accuracy can be misleading in imbalanced datasets or when certain errors are more important than others.

Precision: Definition and Calculation

Definition: Precision measures the proportion of true positives among all the positive predictions made
by the model.

Calculation: Precision = True Positives / (True Positives + False Positives)

Interpretation: Precision is useful for evaluating the model's ability to avoid false positive errors.

Recall: Definition and Calculation

Definition: Recall measures the proportion of actual positive instances that the model correctly
identified.

Calculation: Recall = True Positives / (True Positives + False Negatives)

Interpretation: Recall is useful for evaluating the model's ability to avoid false negative errors.

F1 Score: Definition and Calculation

Definition: The F1 score is the harmonic mean of precision and recall, providing a balanced measure of
model performance.

Calculation: F1 Score = 2 * (Precision * Recall) / (Precision + Recall)


Interpretation: The F1 score ranges from 0 to 1, with 1 indicating a perfect balance between precision and recall.

Applications: The F1 score is often used in classification tasks where both precision and recall are important.
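A small sketch computing all four metrics from raw confusion-matrix counts (the counts are made up for illustration):

```python
def metrics(tp, tn, fp, fn):
    accuracy  = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall    = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return accuracy, precision, recall, f1

acc, p, r, f1 = metrics(tp=80, tn=50, fp=10, fn=20)
print(f"accuracy={acc:.3f} precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
# accuracy=0.812 precision=0.889 recall=0.800 f1=0.842
```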

Comparing Accuracy, Precision, and Recall

1 Accuracy: Provides a general overview of overall model performance.

2 Precision: Focuses on the model's ability to avoid false positive errors.

3 Recall: Focuses on the model's ability to avoid false negative errors.

Tradeoffs Between Metrics

1 High Precision -> Few false positives.

2 High Recall -> Few false negatives.

3 Balance -> Optimal F1 score.

2.2 Overfitting & Generalization


Definition Overfitting occurs when a model performs exceptionally well on the training data but fails to
generalize to new, unseen data.

What is Overfitting?

Model Complexity: Too complex for the data.

High Variance: Sensitive to minor data changes.

Causes of Overfitting

1 Insufficient Data: Model doesn't see enough examples.

2 Model Complexity: Too many parameters to fit.

3 Noisy Data: Irrelevant features or errors.

Dangers of Overfitting

Poor Generalization: Fails on unseen data.

Unreliable Predictions: Inaccurate results.

Wasted Resources: Inefficient model.

Bias-Variance Tradeoff

1 High Bias: Underfitting, simple model.

2 Optimal Balance: Good fit, low error.

3 High Variance: Overfitting, complex model.


Techniques to Avoid Overfitting

Data Augmentation: Increase data size.

Feature Selection: Remove irrelevant features.

Dimensionality Reduction: Reduce the number of features.

Regularization Methods

L1 Regularization: Lasso, feature selection.

L2 Regularization: Ridge, weight reduction.
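A scikit-learn sketch contrasting the two penalties (the alpha values and synthetic data are illustrative choices):

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
y = 3.0 * X[:, 0] + rng.normal(size=100)   # only feature 0 actually matters

# L1 (Lasso) tends to drive irrelevant weights exactly to zero -> feature selection.
print(Lasso(alpha=0.1).fit(X, y).coef_)
# L2 (Ridge) shrinks all weights toward zero but rarely to exactly zero.
print(Ridge(alpha=1.0).fit(X, y).coef_)
```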

Ensemble Methods

Bagging: Reduce variance.
Boosting: Improve accuracy.
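An illustrative scikit-learn sketch of both ideas (the hyperparameters and synthetic data are arbitrary):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, GradientBoostingClassifier

X, y = make_classification(n_samples=500, random_state=0)

# Bagging: average many models trained on bootstrap samples -> lower variance.
bag = BaggingClassifier(n_estimators=50, random_state=0).fit(X, y)
# Boosting: fit models sequentially, each correcting the last -> higher accuracy.
boost = GradientBoostingClassifier(n_estimators=50, random_state=0).fit(X, y)
print(bag.score(X, y), boost.score(X, y))
```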

Generalization: Definition and Importance

Definition: Generalization refers to a model's ability to perform well on new, unseen data, not just the
data it was trained on.

Importance: Ensuring good generalization is crucial for the real-world applicability and deployment of
machine learning models.

Strategies: Techniques like regularization, cross-validation, and ensemble methods can help improve model generalization.
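As a short sketch of one such strategy (the model choice and synthetic data are illustrative), k-fold cross-validation estimates performance on data the model never saw during fitting:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, random_state=0)

# Five held-out folds give a more honest picture of generalization
# than accuracy measured on the training data itself.
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print(scores.mean())
```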
