Lecture 02 - Introduction To Neural Networks (Optional)
Recognize these emotions
2
Brain vs Machine
3
Man vs Machine?
4
Brain vs Machine
▪ We invented machines …
– To solve problems that we “know how to solve”
– To repeat tasks a large number of times in the same way, without getting bored
– To implement what is sometimes referred to as “low-level functions”
▪ Consequently, machines
– Follow our instructions with high accuracy
– Are tremendously faster at solving well-defined problems
▪ e.g. multiplying / inverting matrices, solving differential equations, etc.
5
Brain vs Machine
▪ However, they are less capable of
– Solving ill-posed problems, or problems whose solutions have no explicit “logic” associated with them
▪ E.g. recognizing a face or a character in a TV show
▪ Why?
– It is difficult for us to teach (or program) a computer to do such tasks!
– And in most cases the primary design of machines does not support such tasks
6
Brain vs Computers
[Comparison table: Brain | Computers]
7
Brain vs Computers
[Comparison table: Brain | Computers]
8
Neural Networks Information
9
Neural Networks
▪ Initial concepts date back to the early 1940s.
▪ Became more popular in the 1980s, mostly due to the availability of the required computing power.
▪ Most traditional NNs are models of biological neural networks. But newer
algorithms are more mathematically inspired.
▪ Most NNs have some sort of training rule. In other words, NNs learn from
examples and exhibit some capability for generalization beyond the
training data.
▪ Neural computing should not be considered a competitor to conventional computing. Rather, it should be seen as complementary, as the most successful neural solutions have been those that operate in conjunction with existing, traditional techniques.
10
Neural Network Techniques
▪ Computers have to be explicitly programmed
– We analyze the problem to be solved.
– We write the code in a programming language.
11
Characteristics of Neural Networks
▪ Learning from experience: Suited to complex, difficult-to-solve problems for which plenty of data describing the problem is available.
▪ Generalizing from examples: Can interpolate from previous learning and give the correct response to unseen data.
▪ Rapid application development: NNs are generic machines and quite independent of domain knowledge.
▪ Adaptability: Adapts to a changing environment, if properly designed.
▪ Computational efficiency: Although training a neural network demands a lot of computing power, a trained network demands almost nothing in recall mode.
▪ Non-linearity: Not based on linear assumptions about the real world.
12
What can you do with an NN and what not?
▪ In principle, NNs can compute any computable function, i.e., they can do
everything a normal digital computer can do.
– Almost any mapping between vector spaces can be approximated to arbitrary precision
by feedforward NNs
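As an illustrative sketch of this claim (a toy NumPy example; all sizes and values here are chosen for illustration, not taken from the lecture), a single hidden layer of tanh units can already approximate a smooth 1-D mapping closely. The input-layer weights are left random and only the output weights are fitted by least squares:

```python
import numpy as np

# Toy illustration of the approximation claim: one hidden layer of
# tanh units, random input weights, output weights fitted by least
# squares, approximating y = sin(x) on [-pi, pi].
rng = np.random.default_rng(0)
x = np.linspace(-np.pi, np.pi, 200).reshape(-1, 1)   # inputs
y = np.sin(x)                                        # target mapping

H = 50                                   # number of hidden units
W1 = rng.normal(size=(1, H))             # random input-to-hidden weights
b1 = rng.normal(size=H)                  # random hidden biases
hidden = np.tanh(x @ W1 + b1)            # hidden-layer activations

# Solve for output weights w2 so that hidden @ w2 is close to y
w2, *_ = np.linalg.lstsq(hidden, y, rcond=None)
y_hat = hidden @ w2

print("max abs error:", np.max(np.abs(y_hat - y)))   # small: the fit is close
```

With more hidden units the approximation can be made tighter, which is the intuition behind the statement above.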
13
Applications of NNs
▪ Classification
– In marketing: consumer spending pattern classification
– In defense: radar and sonar image classification
– In agriculture & fishing: fruit and catch grading
– In medicine: ultrasound and electrocardiogram image
classification, EEGs, medical diagnosis
▪ Recognition and identification
– In general computing and telecommunications: speech, vision and
handwriting recognition
– In finance: signature verification and bank note verification
14
Applications of NNs
▪ Assessment
– In engineering: product inspection monitoring and control
– In defense: target tracking
– In security: motion detection, surveillance image analysis and
fingerprint matching
▪ Forecasting and prediction
– In finance: foreign exchange rate and stock market forecasting
– In agriculture: crop yield forecasting
– In marketing: sales forecasting
– In meteorology: weather prediction
15
More Types of Activation Functions
16
Types of activation functions
▪ Threshold function
– $\varphi(v) = \begin{cases} 1 & \text{if } v \ge 0 \\ 0 & \text{if } v < 0 \end{cases}$
– Commonly known as the Heaviside function
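A direct NumPy implementation of this threshold activation (an illustrative sketch):

```python
import numpy as np

def threshold(v):
    """Heaviside / threshold activation: 1 where v >= 0, else 0."""
    return np.where(v >= 0, 1.0, 0.0)

print(threshold(np.array([-0.5, 0.0, 2.0])))  # [0. 1. 1.]
```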
17
Types of activation functions
▪ Satlin Transfer Function
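Assuming the standard saturating-linear (satlin) definition, the function clips a linear ramp to the range [0, 1]:

$$\varphi(v) = \begin{cases} 0 & v < 0 \\ v & 0 \le v \le 1 \\ 1 & v > 1 \end{cases}$$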
18
Types of activation functions
▪ Gauss Function
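A commonly used form of the Gaussian activation, assuming a width parameter $\sigma$ (the slide's exact parameterization may differ), is

$$\varphi(v) = \exp\!\left(-\frac{v^2}{2\sigma^2}\right)$$

It responds most strongly near $v = 0$ and decays smoothly to 0 for large $|v|$.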
19
Rules of knowledge representation
20
Knowledge representation: Rules
▪ Rule 1: Similar inputs from similar classes should usually produce similar
representations inside the network, and should be classified into the same
class.
– Problem: how do you define similarity?
Input vector: $\mathbf{x}_i = [x_{i1}, x_{i2}, x_{i3}, \dots, x_{im}]^T$
– Euclidean distance:
$d(\mathbf{x}_i, \mathbf{x}_j) = \|\mathbf{x}_i - \mathbf{x}_j\| = \left( \sum_{k=1}^{m} (x_{ik} - x_{jk})^2 \right)^{1/2}$
21
Knowledge representation: Rules
▪ Rule 1: Similar inputs from similar classes should usually produce similar
representations inside the network, and should be classified into the same
class.
– Problem: how do you define similarity?
Input vector: $\mathbf{x}_i = [x_{i1}, x_{i2}, x_{i3}, \dots, x_{im}]^T$
– Dot product:
$(\mathbf{x}_i, \mathbf{x}_j) = \mathbf{x}_i^T \mathbf{x}_j = \sum_{k=1}^{m} x_{ik} x_{jk}$
– Usually we normalize the vectors to have unit length: $\|\mathbf{x}_i\| = \|\mathbf{x}_j\| = 1$
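A minimal NumPy sketch of the two similarity measures above (the example vectors are made up for illustration):

```python
import numpy as np

# Two example input vectors with m = 4 components each
x_i = np.array([1.0, 2.0, 3.0, 4.0])
x_j = np.array([1.0, 2.5, 2.5, 4.5])

# Euclidean distance: a small value means the inputs are "similar"
d = np.linalg.norm(x_i - x_j)

# Dot product of unit-length vectors: a value close to 1 means "similar"
x_i_unit = x_i / np.linalg.norm(x_i)
x_j_unit = x_j / np.linalg.norm(x_j)
s = x_i_unit @ x_j_unit

print(f"Euclidean distance: {d:.3f}, normalized dot product: {s:.3f}")
```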
22
Knowledge representation: Rules
▪ As $\mathbf{x}_i$ approaches $\mathbf{x}_j$
– $\|\mathbf{x}_i - \mathbf{x}_j\| \to 0$
– $\mathbf{x}_i^T \mathbf{x}_j \to 1$
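For unit-length vectors the two criteria are equivalent, since

$$\|\mathbf{x}_i - \mathbf{x}_j\|^2 = \|\mathbf{x}_i\|^2 + \|\mathbf{x}_j\|^2 - 2\,\mathbf{x}_i^T\mathbf{x}_j = 2\left(1 - \mathbf{x}_i^T\mathbf{x}_j\right),$$

so the Euclidean distance tends to 0 exactly when the dot product tends to 1.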
23
Knowledge representation: Rules
24
Knowledge representation: Rules
25
Knowledge representation: Rules
▪ Rule 4: Prior information and invariances should be built into the design of
the neural network whenever possible.
– This simplifies the design of the NN, since that information does not have to be learned from the data.
▪ Fewer free parameters to learn
▪ Information transmission is faster
▪ Cost is reduced
26
Knowledge representation: Rules
▪ Rule 4: Prior information and invariances should be built into the design of
the neural network whenever possible.
– How do we build prior information into an NN?
– Unfortunately, there are no well-defined rules to do this.
– Some rules of thumb (see the sketch after this list):
▪ Restrict the network architecture – usually to local connections called receptive fields
▪ Constrain the choice of synaptic weights – usually achieved through weight sharing
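A minimal NumPy sketch of these two rules of thumb (an illustration with made-up sizes, not the lecture's notation): each output unit is connected only to a local receptive field of the input, and all units reuse the same weight vector (weight sharing), which amounts to a 1-D convolution:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=20)    # input signal with 20 components
w = rng.normal(size=5)     # ONE shared weight vector (5 weights)
b = 0.1                    # ONE shared bias

# Each output unit sees only a local receptive field of 5 consecutive
# inputs and applies the same shared weights w and bias b.
outputs = np.array([
    np.tanh(x[i:i + 5] @ w + b)      # receptive field x[i], ..., x[i+4]
    for i in range(len(x) - 5 + 1)
])

print(outputs.shape)   # (16,): 16 units, yet only 5 + 1 free parameters
```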
27
Knowledge representation: Invariances
▪ The network should be invariant to trivial transformations of the inputs.
– E.g. rotation of a picture
▪ Techniques:
– Invariance by structure
▪ Pick a structure that isn’t sensitive to the meaningless transformations of the input
– Invariance by training
▪ Let the classifier learn the invariances from the training data (see the sketch after this list)
– Invariance by feature space
▪ Pick a feature set that is invariant to the transformations
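A minimal NumPy sketch of the “invariance by training” idea (an illustration; the function name and image sizes are made up): the training set is augmented with rotated copies of each image, so the classifier is exposed to the transformation during learning:

```python
import numpy as np

def augment_with_rotations(images, labels):
    """Add 90/180/270-degree rotated copies of each image (invariance by training)."""
    aug_images, aug_labels = [], []
    for img, lbl in zip(images, labels):
        for k in range(4):                      # rotations by 0, 90, 180, 270 degrees
            aug_images.append(np.rot90(img, k))
            aug_labels.append(lbl)              # the label is unchanged by rotation
    return np.stack(aug_images), np.array(aug_labels)

# Toy example: two 8x8 "images" with class labels 0 and 1
images = np.random.default_rng(1).normal(size=(2, 8, 8))
labels = np.array([0, 1])
X, y = augment_with_rotations(images, labels)
print(X.shape, y.shape)   # (8, 8, 8) (8,)
```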
28