Lecture 1 - Introduction To NN - CET

Uploaded by

Joel Lim
Official (Closed) - Non Sensitive

Learning in Image Recognition (DL)

Lecture 1: Introduction to Neural Network

• Specialist Diploma in Applied Generative AI
• Academic Year 2024/25

What is deep learning?

• Artificial Intelligence (AI) has been a subject of intense media hype
• It is a foundation of intelligent chatbots, self-driving cars and virtual assistants
• After taking this course, you should be able to:
1. Differentiate world-changing AI developments from overhyped press releases
2. Utilize AI agents to solve real-world problems
3. Develop AI agents

Topics
1. Artificial Intelligence, Machine Learning and Deep Learning
2. Why Deep Learning?
3. What is a neural network?
4. Classifying Movie Reviews
5. Predicting House Prices

1. AI, ML and DL

• Definition of AI: the effort to automate intellectual tasks normally performed by humans
• AI is a general field that encompasses machine learning and deep learning, as well as many other approaches that don’t involve any learning.

• 1950s-1980s: Symbolic AI (classical programming)
• 1990s-now: Machine Learning (Neural Network, Decision Tree)
• 2010s-now: Deep Learning, Generative AI
• Milestones shown alongside the timeline: Personal Computer (1960s), Internet (1990s), Social Media (post 2004), AlphaGo (2016), ChatGPT (2022)

1.1 Symbolic AI (1950s-1980s)

Rules & Data -> Classic Programming -> Answer

1.2 Machine Learning (1990s-now)

Data & Answers -> Machine Learning -> Rules / Model
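The contrast between the two diagrams can be sketched in a few lines of Python. This is only an illustration: the height data, the 180 cm rule and the function names `rule_based_is_tall` and `learn_threshold` are all made up for this sketch and do not come from the lecture.

```python
# Classic programming: a human writes the rule, the program produces answers.
def rule_based_is_tall(height_cm):
    return height_cm >= 180  # hand-picked threshold (hypothetical)


# Machine learning: data and answers go in, a rule (here a threshold) comes out.
def learn_threshold(heights, labels):
    """Pick the candidate threshold that misclassifies the fewest examples."""
    best_threshold, best_errors = None, len(heights) + 1
    for candidate in sorted(heights):
        errors = sum((h >= candidate) != label for h, label in zip(heights, labels))
        if errors < best_errors:
            best_threshold, best_errors = candidate, errors
    return best_threshold


heights = [150, 160, 165, 172, 178, 185]          # data (made-up values)
labels = [False, False, False, True, True, True]  # answers
threshold = learn_threshold(heights, labels)
print(threshold)
```

The learned threshold plays the role of the "Rules / Model" box: it was not written by hand but derived from labelled examples.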

1.2 Machine Learning

• Three basic components (e.g. image tagging)
1. Inputs: input data points (e.g. image files)
2. Labels: examples of the expected output (e.g. cat or dog)
3. Loss Function: a way to measure whether the algorithm is doing a good job
• The measurement is used as a feedback signal to adjust the way the algorithm works. This adjustment step is what we call learning.

1.2 Machine Learning

• Learning representations from data
1. meaningfully transform data
2. to learn useful representations of the input data
3. get us closer to the expected output
• From Shallow to Deep
1. Shallow Learning
• one or two layers of representations of the data
2. Deep Learning
• layered representations learning or

1.3 Deep Learning (2010s-now)



2. Why Deep Learning? Why Now?


• What changed in the past two decades
1. Hardware
2. Datasets and benchmarks
3. Algorithmic advances (Optimization)
• Will Deep Learning last? Anything special about DL?
1. Simplicity
2. Scalability
3. Versatility and reusability

2.1 Real-life Application of Deep Learning: Images
• Detecting COVID-19 in X-ray images with Keras, TensorFlow, and Deep Learning

2.2 Real-life Application of Deep Learning: NLP
• Customer Review Sentiment Analysis

3. What is a neural network?

3 Intro to neural network

• Did you manage to catch some key words in the video?
• Input, hidden layers, output
• Neurons
• Weight, bias
• Activation function (threshold function)
• Forward propagation, backpropagation
• Still unclear on exactly how a neural network functions? Let’s explore more together (and do revisit the video on your own after this lecture!)

3.1 Anatomy of a neural network

• Anatomy of a neural network
• Layers: combined into a model
• Input data and corresponding targets / labels
• Optimizer: determines how learning proceeds

[Figure: Input -> Hidden layers (containing neurons/nodes) -> Predicted Output, compared against Actual Output]

3.1 Anatomy of a neural network

• A neural network is parameterized by its weights
• A loss function measures the quality of the network’s output
• The loss score is used as a feedback signal to adjust the weights

3.1 Anatomy of a neural network

• An Example (refer to NN_Computation.xlsx for details)

[Figure: Forward Propagation for a credit loan network. Inputs (Age, Salary, Education) -> Layer 1 (ReLU) -> Layer 2 (Sigmoid/logistic) -> Output (Loan approve = 1, Loan reject = 0)]

Prediction Y’ = 0.5329, True Target Y = 1.0
Loss Score = Y - Y’ = 0.4671

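3.1 Anatomy of a neural network

Credit loan example (Loan approve = 1, Loan reject = 0): Input -> Hidden (ReLU) -> Output (Sigmoid)

$$H_j = \sum_i (x_i w_{i,j}) + b_j \qquad O_k = \sum_j (H_j w_{j,k}) + b_k$$

The activation (ReLU in the hidden layer, Sigmoid at the output) is applied to each sum before it is passed on.

Inputs: Age $x_1 = 0.5$, Salary $x_2 = 0.3$, Education $x_3 = 0.2$
Weights into H1: $w_{1,1} = 0.1$, $w_{2,1} = 0.2$, $w_{3,1} = 0.3$, bias $b_1 = -0.5$
Weights into H2: $w_{1,2} = 0.6$, $w_{2,2} = 0.4$, $w_{3,2} = 0.8$, bias $b_2 = -0.1$
Weights into O1: 0.7 (from H1), 0.9 (from H2), bias -0.3

H1 = 0.5*0.1 + 0.3*0.2 + 0.2*0.3 + (-0.5) = -0.33, ReLU(-0.33) = 0.0
H2 = 0.5*0.6 + 0.3*0.4 + 0.2*0.8 + (-0.1) = 0.48, ReLU(0.48) = 0.48
O1 = 0.0*0.7 + 0.48*0.9 + (-0.3) = 0.132, Sigmoid(0.132) = 0.53295

Prediction Y’ = 0.53295, True Target Y = 1.0
Loss Score = Y - Y’ = 0.4671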

3.1 Anatomy of a neural network

• An Example (refer to NN_Computation.xlsx for details)

[Figure: Backpropagation for the credit loan network, assuming all weights are learned to increase by 0.1. Inputs (Age, Salary, Education) -> Layer 1 (ReLU) -> Layer 2 (Sigmoid/logistic) -> Output (Loan approve = 1, Loan reject = 0)]

Updated weights shown in the figure: 0.2, 0.3, 0.8, 0.4, -0.4, 0.7, 1.0, 0.5, -0.2, 0.9, 0.0
Prediction Y’ = 0.6177, True Target Y = 1.0
Loss Score = Y - Y’ = 0.3823
Lower loss means the increase in weight is effective
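The two credit loan slides can be re-computed with a short standard-library Python sketch. It is not the content of NN_Computation.xlsx; the function names are invented here, and the bias of the second hidden unit is taken as -0.1, which is an assumption: it is the value that reproduces the intermediate result 0.48 and the bias of 0.0 shown after the +0.1 update.

```python
import math


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward pass: 3 inputs -> 2 ReLU hidden units -> 1 sigmoid output."""
    hidden = [
        relu(sum(xi * wi for xi, wi in zip(x, weights)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return sigmoid(sum(h * w for h, w in zip(hidden, output_weights)) + output_bias)


def bump(values, amount=0.1):
    """Increase every value by the same amount, as the second slide assumes."""
    return [v + amount for v in values]


x = [0.5, 0.3, 0.2]                                   # Age, Salary, Education
hidden_weights = [[0.1, 0.2, 0.3], [0.6, 0.4, 0.8]]   # one row per hidden unit
hidden_biases = [-0.5, -0.1]                          # second bias is an assumption
output_weights = [0.7, 0.9]
output_bias = -0.3
y_true = 1.0

y_before = forward(x, hidden_weights, hidden_biases, output_weights, output_bias)
y_after = forward(
    x,
    [bump(row) for row in hidden_weights],
    bump(hidden_biases),
    bump(output_weights),
    output_bias + 0.1,
)

print("before:", y_before, "loss:", y_true - y_before)
print("after: ", y_after, "loss:", y_true - y_after)
```

The loss after the update is smaller than before, which is the point the second slide makes. A real training run would compute each weight's own adjustment from the gradient instead of adding 0.1 everywhere.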

3.1 Anatomy of a neural network

• Stochastic Gradient Descent (SGD)
• Draw a batch of training samples X and targets Y
• Run the network on X to obtain predictions Y’
• Compute the loss L(Y’, Y)
• Compute the gradient of L(Y’, Y) with respect to the weights (W)
• Move the weights a little in the direction opposite to the gradient: W = W - step * gradient, where gradient = $\partial L / \partial W$
• Backpropagation
• Computation of the gradient values of a neural network
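The SGD loop above can be sketched for the simplest possible model, a single weight w with prediction w * x. The data, the learning rate and the batch size are made up for illustration, and the gradient is written out by hand for a mean squared error loss instead of being obtained by backpropagation.

```python
import random

random.seed(0)

# Toy data (made up for illustration): targets follow y = 2 * x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [2.0 * x for x in xs]

w = 0.0          # the single trainable weight
step = 0.01      # learning rate
batch_size = 4

for _ in range(200):
    batch = random.sample(range(len(xs)), batch_size)   # draw a batch
    preds = [w * xs[i] for i in batch]                   # run the model on the batch
    # Loss is the mean squared error; its gradient with respect to w is
    # the mean of 2 * (prediction - target) * x over the batch.
    gradient = sum(2 * (p - ys[i]) * xs[i] for p, i in zip(preds, batch)) / batch_size
    w = w - step * gradient                              # move the weight a little

print(w)
```

With these settings w approaches 2, the slope used to generate the data. A step that is too large would make the updates overshoot, which is why the weights are moved only "a little".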

3.2 Introduction to Keras

• Key Features
• Keras is a deep-learning framework for Python (subsumed into TensorFlow)
• Provides a convenient way to define and train almost any kind of deep-learning model
• Allows the same code to run seamlessly on CPU or GPU
• User-friendly API makes it easy to quickly prototype deep-learning models
• Supports arbitrary network architectures
• Multi-input, multi-output models, layer sharing, model sharing

3.2 Introduction to Keras


• Developing with Keras
• Define your training data
• input tensors and target tensors
• Define a network of layers (or model)
• maps your inputs to your targets
• Configure the learning process
• loss function, optimizer and metrics to monitor
• Iterate on your training data
• calling the fit() method of your model

3.2 Introduction to Keras


• Two ways to define a model
• Sequential
• Only for linear stacks of layers
• Most common network architecture

• Functional API
• Can build arbitrary architectures

3.2 Introduction to Keras

• Weight (W) and Bias (b):
1. Tensors / Attributes of the layer
2. Weights / Trainable parameters of the layer
3. Weights (weight and bias) contain the information learned by the model from exposure to training data
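As a rough illustration of "trainable parameters", the sketch below counts them for a fully connected layer: one weight per input-unit pair plus one bias per unit. The helper `dense_param_count` is a name invented for this sketch, not a Keras function, and the layer sizes are those of the credit loan example (3 inputs, 2 hidden units, 1 output).

```python
def dense_param_count(n_inputs, n_units):
    """Trainable parameters of a fully connected layer: weights plus biases."""
    return n_inputs * n_units + n_units


hidden = dense_param_count(n_inputs=3, n_units=2)   # 3*2 weights + 2 biases
output = dense_param_count(n_inputs=2, n_units=1)   # 2*1 weights + 1 bias
total = hidden + output
print(hidden, output, total)
```

The total matches the number of values listed on the backpropagation slide of the credit loan example.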

3.2 Introduction to Keras

• Compilation Step: configure learning process. What are these?
• Optimizer
• SGD, RMSprop, Adam, etc.
• Loss Function
• mean_squared_error, mean_absolute_error (“regression”)
• categorical_crossentropy, binary_crossentropy (“classification”)
• Metrics
• mae (“regression”)
• accuracy (“classification”)
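To make the loss names above less abstract, here is a plain-Python sketch of what three of them compute. These are simplified stand-ins written for this note, not the Keras implementations (which operate on tensors); the `eps` clipping value is an assumption chosen to avoid log(0).

```python
import math


def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Average negative log-likelihood for 0/1 labels and predicted probabilities."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)   # keep p away from exactly 0 or 1
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)


# Regression: errors of 1 and 2 give MSE (1 + 4) / 2 and MAE (1 + 2) / 2.
mse = mean_squared_error([3.0, 5.0], [2.0, 7.0])
mae = mean_absolute_error([3.0, 5.0], [2.0, 7.0])

# Classification: confident correct predictions give a lower loss than unsure ones.
confident = binary_crossentropy([1, 0], [0.9, 0.1])
unsure = binary_crossentropy([1, 0], [0.5, 0.5])
print(mse, mae, confident, unsure)
```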

3.2 Introduction to Keras

• Fit Step: learning through training data
• Training data: NumPy arrays
• input_tensor & target_tensor
• Epochs
• Number of passes over the entire training data
• Batch Size
• Number of samples processed per weight update
• In each epoch, fit() iterates over all the samples batch by batch.
• In a run of each batch:
• Forward Pass -> Loss -> Backward Pass (compute gradients) -> Gradient Descent step -> Update Weights
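How epochs and batch size interact can be sketched without any framework. The loop below only counts weight updates; the sample count, batch size and number of epochs are illustrative values, and `iterate_batches` is a helper invented for this sketch rather than anything fit() exposes.

```python
import math


def iterate_batches(n_samples, batch_size):
    """Yield (start, end) index pairs that cover all samples exactly once."""
    for start in range(0, n_samples, batch_size):
        yield start, min(start + batch_size, n_samples)


n_samples, batch_size, epochs = 1000, 128, 3   # illustrative values
updates = 0
for epoch in range(epochs):
    for start, end in iterate_batches(n_samples, batch_size):
        # The forward pass, loss, backward pass and weight update
        # for samples[start:end] would happen here.
        updates += 1

batches_per_epoch = math.ceil(n_samples / batch_size)
print(batches_per_epoch, updates)
```

The last batch of each epoch is smaller than the others whenever the batch size does not divide the number of samples evenly.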

Let’s install TensorFlow on our laptops!

(Demo_0_Hello_World.ipynb)

• Let’s try out your very first neural network!
• MNIST (“Hello World” of deep learning)
1. Classify grayscale images of handwritten digits (28x28 pixels) into 10 categories (0 through 9)
2. Dataset: 60,000 training images + 10,000 test images
3. MNIST sample digits
4. Demo in Jupyter Notebook

Let’s try our hand at training a neural network!

(Demo_1_MNIST.ipynb)

Using the MNIST dataset of handwritten digits, you will train a neural network model to predict the number shown in the image.

• Input Data
• Build the model (“network”)
• Compile
• Training
• Evaluate

4. Classifying Movie Reviews



4.1 The IMDB dataset


• 50,000 movie reviews
• Training: 25,000 reviews (half positive & half negative)
• Testing: 25,000 reviews (half positive & half negative)

• Input Data: movie reviews


• A list of words
• Convert each word to a number using a dictionary (key -> value)
• One-hot encode to turn a list of numbers into a vector of 0s and 1s

• Label
• 0: negative
• 1: positive
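
The encoding steps above can be sketched as follows. The four-word dictionary and the two "reviews" are hypothetical, and `vectorize_sequences` is a name chosen for this sketch. Because a review contains many words, each vector ends up with several 1s, so this form is sometimes called multi-hot encoding.

```python
def vectorize_sequences(sequences, dimension):
    """Turn each list of word indices into a vector of 0s and 1s."""
    vectors = []
    for sequence in sequences:
        vector = [0.0] * dimension
        for index in sequence:
            vector[index] = 1.0   # mark that this word occurs in the review
        vectors.append(vector)
    return vectors


# Hypothetical dictionary and reviews, for illustration only.
word_index = {"great": 0, "movie": 1, "boring": 2, "plot": 3}
reviews = [["great", "movie"], ["boring", "plot", "boring"]]

# Step 1: convert each word to a number using the dictionary.
sequences = [[word_index[word] for word in review] for review in reviews]
# Step 2: turn each list of numbers into a fixed-length vector of 0s and 1s.
encoded = vectorize_sequences(sequences, dimension=len(word_index))
print(encoded)
```

Word order and repetition are lost in this encoding: the second review mentions "boring" twice but its vector records only that the word is present.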

4.2 Building the Model


• Dense Layer
• Fully connected
• Activations: Sigmoid, ReLU
• Units: number of hidden units

Practical 1a Demo

Let’s try our hand at training a neural network!

(Practical 1a)

Using movie review data from the IMDB dataset, you will train a neural network model to predict whether a new movie review is positive or negative.

4.3 The full process of Practical 1a


• Import training and testing data
• Prepare the data to be ready as input
• Build the model
• Train the model using training data
• Validate the model using validation data
• Plot the training and validation loss and accuracy
• Analyze the results: overfitting or underfitting
• Use model to make predictions for testing data
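
The "analyze the results" step can be sketched as a check on the recorded loss curves. The numbers below are hypothetical, invented for illustration, and are not results from Practical 1a; `best_epoch` is a helper name made up for this sketch.

```python
def best_epoch(val_losses):
    """Return the 1-based epoch with the lowest validation loss."""
    return min(range(len(val_losses)), key=lambda i: val_losses[i]) + 1


# Hypothetical loss curves (one value per epoch), for illustration only.
train_loss = [0.60, 0.40, 0.30, 0.22, 0.16, 0.11]
val_loss = [0.62, 0.45, 0.38, 0.36, 0.39, 0.44]

epoch = best_epoch(val_loss)
# Overfitting: validation loss rises after its minimum while training loss keeps falling.
val_rises = any(v > val_loss[epoch - 1] for v in val_loss[epoch:])
train_still_falls = train_loss[-1] < train_loss[epoch - 1]
overfitting = val_rises and train_still_falls
print(epoch, overfitting)
```

In this made-up run the model would be best stopped around the epoch with the lowest validation loss; training longer only improves the fit to the training data.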

4.4 Wrapping Up
• Raw data must be preprocessed into tensors before being fed into a neural network
• Stacks of Dense layers with ReLU activations can solve a wide range of problems
• In a binary classification problem, the model should end with a Sigmoid unit to output a probability
• The RMSprop optimizer is generally a good enough choice
• Neural networks eventually start overfitting on the training data, so always monitor performance on data outside the training set.

5. Predicting House Prices



5.1 Dataset
• A total of 506 samples (404 training & 102 testing)
• Each sample has:
• Input Data (13 features): crime rate, residential proportion, distances to centers, pupil-teacher ratio, accessibility to highways, etc.
• Target: the median price of homes in the area
• Each feature in the input data has a different scale
• This makes learning more difficult for the neural network
• Feature-wise normalization is required (for each feature, subtract its mean from each value, then divide by its standard deviation)
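Feature-wise normalization can be sketched with the standard library alone. The two-feature rows below are made-up values, not taken from the housing dataset, and the function names are invented for this sketch. One design choice is an assumption worth stating: the mean and standard deviation are computed on the training rows only and then reused for the test rows.

```python
import statistics


def fit_normalizer(train_rows):
    """Compute the per-feature mean and standard deviation from training data."""
    columns = list(zip(*train_rows))
    means = [statistics.fmean(column) for column in columns]
    stds = [statistics.pstdev(column) for column in columns]
    return means, stds


def normalize(rows, means, stds):
    """Subtract each feature's mean and divide by its standard deviation."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


# Made-up rows with two features on very different scales.
train = [[0.1, 300.0], [0.3, 400.0], [0.5, 500.0]]
test = [[0.2, 450.0]]

means, stds = fit_normalizer(train)
train_norm = normalize(train, means, stds)
test_norm = normalize(test, means, stds)   # reuse the training statistics
print(train_norm)
print(test_norm)
```

After normalization every training feature is centred on 0 with unit standard deviation, so no single feature dominates just because of its units.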

5.2 Building the model



Let’s try our hand at training another neural network!

(Practical 1b)

Using a dataset on housing prices, you will train a neural network model to predict the median price of homes in a given Boston suburb in the mid-1970s.

5.3 Wrapping Up
• Regression uses different loss functions from classification, e.g. Mean Squared Error (MSE)
• Mean Absolute Error (MAE) is a commonly used evaluation metric for regression tasks
• When input features have values in different ranges, we should normalize the data

This week wrap up

• AI is a general field that encompasses machine learning and deep learning
• ML (including DL) has three basic components: Inputs, Labels and Loss Function.
• DL learns useful representations of the input data through multiple layered networks

(Optional) Let’s try our hand at Tensor Operations!

(Practical 0)

Self-practice to understand how tensor operations work.

Q&A
