
Deep learning

1) The two initial waves of Deep Learning are _________________ and _________________.

Ans) The two initial waves of Deep Learning are Neural Networks (Connectionism) in the 1980s-1990s and the Deep Learning resurgence in the 2000s-2010s.

2) Why should one not view deep learning as an attempt to simulate the brain?

While deep learning (DL) is inspired by the structure and function of biological neural networks, it
should not be mistaken for an actual brain simulation. Here’s why:

1. Fundamental Differences in Architecture

• Artificial Neural Networks (ANNs) are highly simplified models:

o Biological neurons are complex electrochemical systems, while artificial neurons use simple mathematical functions (e.g., weighted sums and activation functions).

o The brain operates with billions of interconnected neurons with dynamic synapses, whereas deep learning models have static and predefined connections.

2. Learning Mechanisms Are Different

• The Brain Learns Continuously; ANNs Learn in Batches:

o The human brain learns from a few examples (few-shot learning), while deep learning often requires millions of labeled data points.

o The brain adapts in real time, whereas deep learning models rely on stochastic gradient descent (SGD) and batch processing.

• Plasticity vs. Fixed Weights:

o Human brains rewire and adapt dynamically based on experience.

o Deep learning models have fixed layers and weight structures once trained.

3. Energy Efficiency and Computation Differences

• Brains are far more energy-efficient:

o A human brain operates on about 20 watts of power, while training deep networks requires massive GPUs consuming kilowatts of energy.

• Parallel vs. Serial Computation:

o The brain processes information in massively parallel ways, while ANNs rely on sequential matrix computations optimized for GPUs.

4. Lack of General Intelligence (AGI)

• Deep learning models are task-specific (Narrow AI):

o They excel at specific tasks (e.g., image classification, NLP) but lack common-sense reasoning and cross-domain adaptability.

o A trained deep learning model cannot generalize beyond its dataset, while humans can apply knowledge flexibly across different domains.

5. Biological Mechanisms Not Accounted For

• The brain uses spiking neurons, neuromodulation, and chemical signaling, which are absent in artificial networks.

• Concepts like emotion, consciousness, and intuition are not represented in deep learning models.

3) Suppose that ŷ = 0.9 and y = 1. What is the value of the logistic loss?
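Ans) Assuming the standard logistic (binary cross-entropy) loss, the value works out as follows:

$$\mathcal{L}(\hat{y}, y) = -\left[\, y \log \hat{y} + (1 - y) \log(1 - \hat{y}) \,\right] = -\log(0.9) \approx 0.105$$

Since $y = 1$, the second term vanishes and the loss reduces to $-\log \hat{y}$.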

4) 'a' is a (4,3) matrix and 'b' is a (1,3) matrix. If we perform c = a * b, what will be the dimensions of matrix 'c'?

In matrix multiplication, the dimensions must conform to the following rule: a product $A_{m \times n} \times B_{p \times q}$ is only valid if $n = p$, and the resulting matrix will have dimensions $m \times q$.

Given:

• Matrix $a$ has dimensions (4,3).

• Matrix $b$ has dimensions (1,3).

Checking Compatibility:

• For $c = a * b$, the number of columns in $a$ (which is 3) must match the number of rows in $b$ (which is 1).

• Since 3 ≠ 1, this multiplication is not valid under standard matrix multiplication rules.

Alternative Interpretations:

• If $*$ represents element-wise multiplication, $b$ must be broadcast across the rows of $a$ (i.e., $b$ is implicitly expanded to shape (4,3)).

• In that case, the output matrix $c$ would have dimensions (4,3), as the sketch below demonstrates.
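A minimal NumPy sketch of both interpretations (the names `a`, `b`, `c` follow the question; NumPy's `@` and `*` operators stand in for matrix and element-wise multiplication):

```python
import numpy as np

a = np.ones((4, 3))   # matrix 'a' with shape (4, 3)
b = np.ones((1, 3))   # matrix 'b' with shape (1, 3)

# Standard matrix multiplication is invalid here: a has 3 columns but
# b has only 1 row, so the next line would raise a ValueError.
# c = a @ b

# Element-wise multiplication: b is broadcast across the 4 rows of a,
# so the product is defined and keeps a's shape.
c = a * b
print(c.shape)   # (4, 3)
```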

5) a) What will happen if we do not use any activation function(s) in a neural network?

If a neural network does not use any activation functions, it essentially reduces to a simple linear
model. Here's why:

1. Loss of Non-Linearity

• Without activation functions, every layer in the network performs only linear transformations (i.e., matrix multiplication and addition).

• Stacking multiple linear layers still results in a linear function, meaning the network cannot learn complex patterns.

• Example (verified numerically in the sketch below):

o Suppose we have two layers: $Z_1 = W_1 X + b_1$ and $Z_2 = W_2 Z_1 + b_2$.

o Substituting $Z_1$ into $Z_2$, this can be rewritten as $Z_2 = (W_2 W_1) X + (W_2 b_1 + b_2)$.

o This is still just a single linear transformation, meaning a deep network would be no more powerful than a single-layer perceptron.
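A minimal NumPy sketch verifying the collapse numerically (the layer shapes here are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(seed=0)
X = rng.normal(size=(3, 5))                  # 5 input vectors of dimension 3
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=(4, 1))
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=(2, 1))

# Two stacked layers with no activation in between
Z1 = W1 @ X + b1
Z2 = W2 @ Z1 + b2

# The single equivalent linear layer: Z2 = (W2 W1) X + (W2 b1 + b2)
W, b = W2 @ W1, W2 @ b1 + b2

print(np.allclose(Z2, W @ X + b))            # True
```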

2. Inability to Learn Complex Patterns

• Most real-world problems (e.g., image recognition, language processing) require non-linear decision boundaries.

• Without activation functions, the model can only learn linear mappings, making it ineffective for tasks like image classification or speech recognition (see the XOR sketch after this list).
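XOR is the classic concrete case: no line separates its two classes, so a purely linear model fails on it. A minimal sketch, using an ordinary least-squares fit (the best any linear map can do here):

```python
import numpy as np

# XOR truth table: the label is 1 exactly when the two inputs differ.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0], dtype=float)

# Best least-squares linear fit, with a bias column appended to X
A = np.hstack([X, np.ones((4, 1))])
w, *_ = np.linalg.lstsq(A, y, rcond=None)

# Every prediction collapses to 0.5: no choice of weights lets a
# linear model separate the two classes.
print(A @ w)   # [0.5 0.5 0.5 0.5]
```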

3. No Universal Approximation

• The Universal Approximation Theorem states that a neural network with at least one hidden layer of non-linear activations can approximate any continuous function on a compact domain to arbitrary accuracy (a formal statement follows below).

• Without activation functions, the network cannot approximate complex functions.
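One common formal statement, in the one-hidden-layer form associated with Cybenko and Hornik (the exact conditions on $\sigma$ vary across versions of the theorem): for any continuous $f$ on a compact set $K \subset \mathbb{R}^n$ and any $\varepsilon > 0$, there exist $N$ and parameters $v_i, w_i, b_i$ such that

$$\sup_{x \in K} \left| f(x) - \sum_{i=1}^{N} v_i \, \sigma\!\left(w_i^{\top} x + b_i\right) \right| < \varepsilon,$$

where $\sigma$ is a fixed non-polynomial activation (e.g., the sigmoid). If $\sigma$ is linear, the sum collapses to a single affine map and the guarantee fails.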

4. Gradient Flow Issues

• Many activation functions help control the gradient during backpropagation.

• Without them, the network's gradient with respect to its input is constant (the same for every input), so training has no non-linear structure to exploit and the network can never learn anything beyond a linear fit.

Conclusion

If we remove activation functions:


✅ The network can only model linear relationships.
❌ It loses the power to learn complex patterns.
❌ It becomes equivalent to a simple linear regression model, regardless of the number of layers.

Bottom Line: Activation functions introduce non-linearity, which is essential for deep learning.
Without them, a neural network is just a linear model.
