
Introduction to Neural Networks

A neural network is a computational system inspired by the human brain, designed to recognize patterns and relationships in data. It is a foundational technology in machine learning and deep learning, frequently used in image recognition, natural language processing, and predictive analytics.

Key Components of a Neural Network

1. Input Layer:

o Receives input features for processing.

2. Hidden Layers:

o Intermediate layers that transform inputs into something the output layer can
use.

3. Output Layer:

o Produces the final prediction or classification.

4. Weights and Biases:

o Weights: Influence the strength of the connection between neurons.

o Bias: Allows shifting the activation function to better fit the data.

5. Activation Functions:

o Introduce non-linearity to the model, enabling it to learn complex patterns.

o Common functions: Sigmoid, ReLU, Tanh, Softmax (see the single-neuron sketch below).
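
To make these components concrete, here is a minimal sketch of a single neuron's forward pass in NumPy. The input values, weights, and bias are illustrative assumptions, not taken from the example that follows.

python

import numpy as np

def sigmoid(z):
    # Sigmoid activation: squashes any real number into (0, 1)
    return 1 / (1 + np.exp(-z))

# Illustrative values (assumed for this sketch)
x = np.array([0.5, 0.8])   # input features (input layer)
w = np.array([0.2, 0.4])   # weights on the connections
b = 0.1                    # bias term

# Weighted sum followed by a non-linear activation
z = np.dot(w, x) + b       # z = w1*x1 + w2*x2 + b
a = sigmoid(z)             # neuron output passed to the next layer
print(f"Weighted sum z = {z:.3f}, activation a = {a:.3f}")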

Working Example: Neural Network to Classify Handwritten Digits

We'll build a simple neural network using the popular Keras library to classify handwritten
digits from the MNIST dataset.

Step-by-Step Process

1. Load and preprocess data.

2. Define the neural network architecture.

3. Compile the model.

4. Train the model.


5. Evaluate and visualize performance.

Sample Code:

python

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical
import matplotlib.pyplot as plt

# Load and preprocess the data
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Normalize pixel values to the range [0, 1]
x_train = x_train / 255.0
x_test = x_test / 255.0

# One-hot encode target labels
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)

# Define the neural network architecture
model = Sequential([
    Flatten(input_shape=(28, 28)),   # Flatten the 2D images into 1D vectors
    Dense(128, activation='relu'),   # Hidden layer with 128 neurons
    Dense(10, activation='softmax')  # Output layer for 10 classes (digits 0-9)
])

# Compile the model
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Train the model
history = model.fit(x_train, y_train, epochs=5, batch_size=32, validation_split=0.2)

# Evaluate the model
test_loss, test_accuracy = model.evaluate(x_test, y_test)
print(f"Test accuracy: {test_accuracy:.4f}")

# Plot training accuracy and loss
plt.figure(figsize=(12, 4))
plt.subplot(1, 2, 1)
plt.plot(history.history['accuracy'], label='Training Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.legend()
plt.title('Training and Validation Accuracy')
plt.subplot(1, 2, 2)
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.legend()
plt.title('Training and Validation Loss')
plt.show()

Model Architecture Diagram

The diagram for this neural network structure:

1. Input Layer: Accepts 28x28 grayscale images.

2. Hidden Layer: 128 neurons with ReLU activation.

3. Output Layer: 10 neurons with softmax activation.

Expected Output

• A plot showing Training Accuracy/Loss and Validation Accuracy/Loss over epochs.

• Printed test accuracy value.

• Visual representation of the architecture and activation progression.

Generating the Architecture Diagram

The following code draws a schematic diagram of this architecture using Matplotlib:

python
import matplotlib.pyplot as plt
import numpy as np

# Generate architecture diagram for the neural network example
fig, ax = plt.subplots(figsize=(10, 6))

# Define layer sizes, labels, and colors
layer_sizes = [784, 128, 10]  # Input (784 nodes), Hidden (128 nodes), Output (10 nodes)
layer_labels = ["Input Layer\n(Flattened 28x28)", "Hidden Layer\n128 Neurons (ReLU)",
                "Output Layer\n10 Classes (Softmax)"]
layer_colors = ["lightblue", "lightgreen", "salmon"]
spacing = 3

# Plot nodes for each layer
for i, (size, label, color) in enumerate(zip(layer_sizes, layer_labels, layer_colors)):
    x_pos = spacing * i
    y_positions = np.linspace(-1, 1, size)
    ax.scatter([x_pos] * size, y_positions, color=color, label=label, alpha=0.6,
               edgecolors='black')
    ax.text(x_pos, 1.2, label, ha='center', fontsize=10,
            bbox=dict(facecolor='white', alpha=0.6, edgecolor='black'))

# Example arrow connections between layers
ax.annotate('', xy=(spacing, 0.9), xytext=(0, 0.9), arrowprops=dict(arrowstyle="->", lw=1.5))
ax.annotate('', xy=(2 * spacing, 0.8), xytext=(spacing, 0.8), arrowprops=dict(arrowstyle="->", lw=1.5))

# Annotation explaining what the connections represent
ax.text(spacing, 0.5, "Weights & Bias Adjustments", fontsize=9, ha='center',
        bbox=dict(facecolor='white', alpha=0.7))

# Aesthetic adjustments
ax.set_xlim(-1, 2 * spacing + 1)
ax.set_ylim(-1.5, 1.5)
ax.axis('off')

plt.title("Neural Network Architecture for MNIST Classification", fontsize=12)
plt.show()


The resulting diagram is a graphical representation of the simple neural network architecture used for classifying the MNIST dataset:
• Input Layer: Takes in 784 features (flattened 28x28 pixel grid for grayscale images).

• Hidden Layer: 128 neurons with ReLU activation function.

• Output Layer: 10 neurons representing digits (0-9) with Softmax activation function.

Weights and biases are adjusted during training to minimize prediction error. Arrows
represent connections with trainable weights.

The following step-by-step example demonstrates the complete cycle of a neural network built using TensorFlow and Keras to classify handwritten digits (MNIST dataset).

Neural Network Example: Classifying Handwritten Digits

1. Import Libraries

python

import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical
import matplotlib.pyplot as plt

2. Load and Preprocess Data

python

# Load data
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# Normalize pixel values to be between 0 and 1
X_train = X_train / 255.0
X_test = X_test / 255.0

# One-hot encode target labels
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)

3. Define the Neural Network Model

python

# Build the neural network
model = Sequential([
    Flatten(input_shape=(28, 28)),   # Flatten input image to 1D vector
    Dense(128, activation='relu'),   # Hidden layer with 128 neurons and ReLU activation
    Dense(10, activation='softmax')  # Output layer for 10 classes (digits 0-9)
])

# Compile the model
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

4. Train the Model

python

# Train the model on training data
history = model.fit(X_train, y_train, epochs=5, batch_size=32, validation_split=0.2)

5. Evaluate the Model


python

# Evaluate the model on test data
test_loss, test_accuracy = model.evaluate(X_test, y_test)
print(f"Test Accuracy: {test_accuracy:.2f}")

6. Make Predictions

python

# Make predictions on a sample test image
sample_image = X_test[0]
predicted_label = np.argmax(model.predict(sample_image.reshape(1, 28, 28)), axis=1)
actual_label = np.argmax(y_test[0])
print(f"Predicted Label: {predicted_label[0]}")
print(f"Actual Label: {actual_label}")

7. Visualize Sample Prediction

python

plt.imshow(sample_image, cmap='gray')
plt.title(f"Predicted: {predicted_label[0]}, Actual: {actual_label}")
plt.axis('off')
plt.show()

Output Explanation

• Input Layer: Processes 28x28 images flattened into 784 features.

• Hidden Layer: Detects patterns in pixel values using 128 neurons.


• Output Layer: Provides probabilities for each of the 10 possible digits.

Example Diagram Representation

The network works by gradually adjusting weights and biases through backpropagation to
predict correct labels for unseen data.

Introduction to Neural Networks

A neural network is a computational model inspired by the way biological neural networks in
the human brain function. It consists of layers of interconnected "neurons" (nodes) that
process input data and adjust internal parameters (weights and biases) to learn patterns.

Key Components of a Neural Network

1. Input Layer:

o Accepts input features from data (e.g., pixel values in images).

o Each neuron represents one input feature.

2. Hidden Layers:

o Perform computations to learn patterns in the data.

o Can have multiple hidden layers, with each layer learning increasingly
complex features.

3. Output Layer:

o Provides the final predictions or classifications.

o Number of neurons in this layer matches the number of possible output classes.

4. Weights:

o Parameters that connect neurons between layers.

o Adjusted during training to minimize error.

5. Biases:

o Constants added to the weighted sum of inputs before applying the activation
function.

o Help the model shift the activation function and better fit the data.

6. Activation Functions:
o Introduce non-linearity into the network, allowing it to learn complex
patterns.

o Common activation functions include:

Types of Activation Functions

Activation Function | Mathematical Form | Usage | Graph Characteristics
Sigmoid | f(x) = 1 / (1 + e^{-x}) | Binary classification | Smooth curve from 0 to 1
Tanh (Hyperbolic Tangent) | f(x) = (e^x - e^{-x}) / (e^x + e^{-x}) | Outputs between -1 and 1 | Centered at 0, smoother transitions
ReLU (Rectified Linear Unit) | f(x) = max(0, x) | Hidden layers in deep learning | Linear for positive values, zero for negative
Leaky ReLU | f(x) = x if x > 0; otherwise f(x) = αx | Addresses the vanishing gradient | Allows a small negative slope
Softmax | f(x_i) = e^{x_i} / Σ_j e^{x_j} | Multi-class classification | Outputs probabilities for each class
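
To complement the table, the short sketch below defines these activation functions in NumPy and plots four of them over a small range. It is an illustrative sketch, not part of the original MNIST example.

python

import numpy as np
import matplotlib.pyplot as plt

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def tanh(x):
    return np.tanh(x)

def relu(x):
    return np.maximum(0, x)

def leaky_relu(x, alpha=0.01):
    # Small negative slope (alpha) instead of a hard zero
    return np.where(x > 0, x, alpha * x)

def softmax(x):
    # Subtract the max for numerical stability; outputs sum to 1
    e = np.exp(x - np.max(x))
    return e / e.sum()

x = np.linspace(-5, 5, 200)
for name, f in [("Sigmoid", sigmoid), ("Tanh", tanh), ("ReLU", relu), ("Leaky ReLU", leaky_relu)]:
    plt.plot(x, f(x), label=name)
plt.legend()
plt.title("Common Activation Functions")
plt.show()

print("Softmax of [1, 2, 3]:", softmax(np.array([1.0, 2.0, 3.0])))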

Neural Network Example: Types of Functions

1. Input Layer:

o Inputs: Age, Income, Student Status, Credit Rating

o Example Inputs: [25, "High", "Yes", "Fair"]

2. Hidden Layer 1:

o 128 neurons with ReLU activation function.

o Weights determine how much each input contributes to the output.

3. Hidden Layer 2:

o 64 neurons with Tanh activation function.


o Biases shift the activation to better capture patterns in complex data.

4. Output Layer:

o 10 neurons for digit classification with Softmax activation (a sketch of this layer stack follows).
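
A minimal Keras sketch of the layer stack described above is shown below. It assumes the four inputs (Age, Income, Student Status, Credit Rating) have already been encoded as numeric features; the encoding step itself is not shown.

python

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Sketch of the layer stack described above (inputs assumed to be numerically encoded)
model = Sequential([
    Dense(128, activation='relu', input_dim=4),  # Hidden Layer 1: 128 neurons, ReLU
    Dense(64, activation='tanh'),                # Hidden Layer 2: 64 neurons, Tanh
    Dense(10, activation='softmax')              # Output Layer: 10 neurons, Softmax
])
model.summary()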

Training Process

1. Forward Propagation:

o Inputs pass through each layer, activating neurons and generating outputs.

2. Loss Calculation:

o Compares predicted outputs with actual labels.

3. Backward Propagation:

o Adjusts weights and biases to reduce error using gradients.

4. Optimization:

o Techniques like Stochastic Gradient Descent (SGD) minimize the loss (a single training step is sketched below).
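
To tie these four stages together, here is a minimal NumPy sketch of one training step for a single sigmoid neuron. The input, label, initial parameters, and learning rate are illustrative assumptions.

python

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Illustrative values (assumed)
x = np.array([0.6, 0.4])   # input features
y = 1.0                    # true label
w = np.array([0.1, -0.2])  # initial weights
b = 0.0                    # initial bias
lr = 0.1                   # learning rate

# 1. Forward propagation
z = np.dot(w, x) + b
a = sigmoid(z)             # prediction

# 2. Loss calculation (squared error for simplicity)
loss = (a - y) ** 2

# 3. Backward propagation (chain rule: dLoss/dw = dLoss/da * da/dz * dz/dw)
dloss_da = 2 * (a - y)
da_dz = a * (1 - a)
dw = dloss_da * da_dz * x
db = dloss_da * da_dz

# 4. Optimization (gradient descent update)
w -= lr * dw
b -= lr * db
print(f"loss={loss:.4f}, updated w={w}, updated b={b:.4f}")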

Types of Neural Networks

1. Feedforward Neural Networks (FNNs):

o Data flows in one direction (no loops).

2. Convolutional Neural Networks (CNNs):

o Best suited for image processing.

3. Recurrent Neural Networks (RNNs):

o Captures temporal dependencies in sequence data.

4. Generative Adversarial Networks (GANs):

o Generates new data samples similar to the input dataset (minimal Keras layer stacks for CNNs and RNNs are sketched below).
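
For illustration, the sketch below shows minimal Keras layer stacks for a CNN and an RNN. The layer sizes and input shapes are assumptions chosen only to show the general shape of each architecture, not tuned models.

python

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, SimpleRNN

# Minimal CNN: convolution + pooling extract spatial features from 28x28 grayscale images
cnn = Sequential([
    Conv2D(16, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(10, activation='softmax')
])

# Minimal RNN: a recurrent layer captures temporal dependencies in sequences
# (assumed input: sequences of 20 time steps with 1 feature each)
rnn = Sequential([
    SimpleRNN(32, input_shape=(20, 1)),
    Dense(1, activation='sigmoid')
])

cnn.summary()
rnn.summary()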

Visualization Concepts for Neural Networks

To understand how a neural network processes and learns from data, visual representations
are essential. These visuals help explain the layers, activations, weights, and learning
process.

1. Neural Network Architecture Diagram


• Input Layer: Circles represent input features (e.g., numerical data, image pixels).

• Hidden Layers: Multiple layers of circles representing neurons connected by lines showing weights.

• Output Layer: Final layer predicting the result (classification or regression).

Example Diagram Components

• Color-coded nodes indicate activated neurons.

• Edge thickness represents the magnitude of weights.

• Labels showing activation functions used in each layer.

2. Forward Propagation Flow

• Visualizing how input values propagate through the network:

o Inputs are multiplied by weights and added to biases.

o Activation functions transform the results.

o Outputs flow to the next layer until the final prediction.

3. Activation Functions Plots

Function | Graph Description
Sigmoid | Smooth curve between 0 and 1, resembling an "S." Useful for binary classification.
Tanh | Curve oscillating between -1 and 1, centered around 0.
ReLU | Straight line for positive inputs and flat at zero for negatives. Efficient for deep learning models.
Softmax | A graph showing probabilities assigned to multiple classes, summing to 1.

4. Loss Function Graphs

• Loss vs. Iteration Plot: Shows how the loss reduces as the network trains over
epochs.

• Types of Loss Functions:

o Mean Squared Error (MSE) for regression.


o Cross-Entropy Loss for classification (both are sketched in code below).
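
A short NumPy sketch of these two loss functions follows; the example arrays are illustrative assumptions.

python

import numpy as np

def mse(y_true, y_pred):
    # Mean Squared Error: average squared difference (regression)
    return np.mean((y_true - y_pred) ** 2)

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    # Cross-entropy for binary classification; eps avoids log(0)
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return np.mean(-y_true * np.log(y_pred) - (1 - y_true) * np.log(1 - y_pred))

# Illustrative values (assumed)
y_true = np.array([0, 0, 1, 1])
y_pred = np.array([0.1, 0.3, 0.8, 0.9])
print("MSE:", mse(y_true, y_pred))
print("Binary cross-entropy:", binary_cross_entropy(y_true, y_pred))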

5. Weight Updates (Backward Propagation) Visualization

• Diagrams depicting how gradients are calculated and how weights are adjusted.

• Heatmaps to show weight magnitude changes during training.

6. Cluster Visualization for Classification Problems

• 2D or 3D plots:

o Visualize how decision boundaries separate classes after training.

o Points represent data samples, color-coded by class.

7. Convolutional Neural Network (CNN) Visualization

• Feature Maps: Show what each convolutional layer learns.

• Filters highlight edges, textures, or specific features from the input image.

8. Recurrent Neural Network (RNN) Visualization

• Graphs show how hidden states capture temporal dependencies:

o Arrows loop back to previous layers.

o Time-series plots demonstrate sequence modeling.

9. Generative Adversarial Networks (GANs)

• Visualizing how the generator creates fake samples and how the discriminator
distinguishes real from fake.

• Progress over epochs can be shown in image generation tasks (e.g., face generation).

Below is an approach to visually representing neural networks using Python and relevant plotting techniques. This example walks through key concepts and provides visual outputs to help illustrate components like layers, weights, activations, and outputs.

Example Neural Network Visualization

We’ll build a simple neural network and visualize key components.

Neural Network Structure

• Input Layer: Two input features.

• Hidden Layer: Two neurons.

• Output Layer: One output neuron.

Python Code for Visualization

python

import numpy as np
import matplotlib.pyplot as plt

# Define activation function (Sigmoid for example)
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Visualization function
def visualize_neural_network(input_values, weights_hidden, weights_output,
                             biases_hidden, biases_output):
    fig, ax = plt.subplots(figsize=(8, 6))

    # Positions of neurons in each layer
    input_layer_pos = [(0, 1), (0, 3)]
    hidden_layer_pos = [(2, 0.5), (2, 3.5)]
    output_layer_pos = [(4, 2)]

    # Draw neurons
    for pos in input_layer_pos:
        ax.add_patch(plt.Circle(pos, 0.2, color='lightblue', ec='black'))
    for pos in hidden_layer_pos:
        ax.add_patch(plt.Circle(pos, 0.2, color='lightgreen', ec='black'))
    for pos in output_layer_pos:
        ax.add_patch(plt.Circle(pos, 0.2, color='salmon', ec='black'))

    # Draw connections (dashed lines represent weighted connections)
    for in_pos in input_layer_pos:
        for h_pos in hidden_layer_pos:
            ax.plot([in_pos[0], h_pos[0]], [in_pos[1], h_pos[1]], 'gray', linestyle='dashed')
    for h_pos in hidden_layer_pos:
        for out_pos in output_layer_pos:
            ax.plot([h_pos[0], out_pos[0]], [h_pos[1], out_pos[1]], 'gray', linestyle='dashed')

    # Layer labels and appearance
    plt.text(0, -0.5, "Input Layer", ha='center', fontsize=12, weight='bold')
    plt.text(2, -0.5, "Hidden Layer", ha='center', fontsize=12, weight='bold')
    plt.text(4, -0.5, "Output Layer", ha='center', fontsize=12, weight='bold')
    ax.axis('off')
    plt.title("Simple Neural Network Visualization")
    plt.show()

# Simulate network weights and biases
input_values = np.array([0.5, 0.8])
weights_hidden = np.array([[0.2, 0.4], [0.3, 0.9]])
weights_output = np.array([0.6, 0.8])
biases_hidden = np.array([0.1, 0.2])
biases_output = np.array([0.05])

# Call visualization function
visualize_neural_network(input_values, weights_hidden, weights_output, biases_hidden,
                         biases_output)

What This Visualization Shows:

• Input Layer (Blue): Represents the input data features.

• Hidden Layer (Green): Processes information and applies weights, biases, and
activations.

• Output Layer (Red): Produces the final prediction.

• Connections: The gray dashed lines represent weights between neurons in adjacent
layers.


The visualization represents a simple neural network architecture:

• Input Layer: Contains two input neurons (representing the input features).

• Hidden Layer: Includes two neurons in the hidden layer, connected to the input layer
by dashed lines.

• Output Layer: Contains a single neuron representing the network's output.


This visualization helps illustrate how neurons across layers connect to form the neural network structure. The weights and biases, though not explicitly shown here, affect how each neuron processes the input to generate output predictions in practice.

Mathematics Behind Artificial Neural Networks (ANN)

To understand the mathematical operations within an ANN, key components like cost
functions, derivatives, chain rule, and gradient descent play a vital role.

1. Cost Function

The cost function quantifies the error between the predicted output and the actual output.
For regression problems, a common choice is the Mean Squared Error (MSE):

J(W, b) = \frac{1}{m} \sum_{i=1}^{m} \left( h_W(x^{(i)}) - y^{(i)} \right)^2

Where:

• J(W, b): Cost function.

• h_W(x^{(i)}): Predicted value for input x^{(i)} using weights W and biases b.

• y^{(i)}: Actual output for input x^{(i)}.

• m: Number of training samples.
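
As a small worked example (with assumed numbers, not from the original text): suppose m = 2, the predictions are h_W(x^{(1)}) = 0.9 and h_W(x^{(2)}) = 0.2, and the actual outputs are y^{(1)} = 1 and y^{(2)} = 0. Then

J(W, b) = \frac{1}{2}\left[ (0.9 - 1)^2 + (0.2 - 0)^2 \right] = \frac{1}{2}(0.01 + 0.04) = 0.025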

2. Activation Functions and Their Derivatives

Activation functions introduce non-linearity into the network. Common activation functions
include:

Sigmoid Activation Function

\sigma(z) = \frac{1}{1 + e^{-z}}

• Derivative: \sigma'(z) = \sigma(z) \cdot (1 - \sigma(z))

ReLU (Rectified Linear Unit)

f(z) = \max(0, z)

• Derivative: f'(z) = 1 if z > 0; otherwise f'(z) = 0

3. Backpropagation Using Chain Rule

Backpropagation computes gradients for the weights and biases by applying the chain rule
of calculus.

For a simple network:

1. Forward Pass: Compute predictions using weights and biases.

2. Backward Pass: Calculate gradients of the cost function with respect to weights and
biases.

If the output a^{[L]} is a function of the weights W^{[L]} and input x, then by the chain rule:

\frac{\partial J}{\partial W^{[L]}} = \frac{\partial J}{\partial a^{[L]}} \cdot \frac{\partial a^{[L]}}{\partial z^{[L]}} \cdot \frac{\partial z^{[L]}}{\partial W^{[L]}}

4. Gradient Descent Algorithm

Gradient descent updates weights and biases to minimize the cost function:

W := W - \alpha \frac{\partial J}{\partial W}

b := b - \alpha \frac{\partial J}{\partial b}

Where:

• \alpha is the learning rate.

• \frac{\partial J}{\partial W} and \frac{\partial J}{\partial b} are the gradients of the cost with respect to the weights and biases.
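
The update rule can be illustrated on a toy one-dimensional problem. The cost function f(w) = (w - 3)^2, the starting point, and the learning rate below are assumptions chosen only to show the mechanics of the update.

python

# Toy cost function and its gradient (assumed for illustration)
def cost(w):
    return (w - 3) ** 2

def gradient(w):
    return 2 * (w - 3)

w = 0.0        # initial parameter
alpha = 0.1    # learning rate
for step in range(25):
    w -= alpha * gradient(w)   # W := W - alpha * dJ/dW

print(f"w after 25 steps: {w:.4f} (the cost is minimized at w = 3)")

Each step moves w a fraction of the way toward the minimum, which is exactly what the weight and bias updates do in a neural network, only in many dimensions at once.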

Example Dataset: Binary Classification

Feature 1 | Feature 2 | Label
0.2 | 0.8 | 0
0.4 | 0.6 | 0
0.6 | 0.4 | 1
0.8 | 0.2 | 1

Problem Example

1. Input: Features x_1 and x_2

2. Target Output: y (either 0 or 1)

3. Initialization: Random weights W and biases b

Step-by-Step Process

1. Forward Pass: Compute the weighted sum:

z = W_1 \cdot x_1 + W_2 \cdot x_2 + b

Apply the activation function (e.g., Sigmoid):

a = \sigma(z)

2. Compute Cost Function: Calculate the error using MSE or binary cross-entropy.

3. Backward Pass: Compute gradients using the chain rule.

4. Update Parameters: Apply gradient descent to update weights and biases.

This process continues iteratively until the cost function converges to a minimum, allowing
the network to learn patterns in the data and make accurate predictions.

The following is a detailed Python implementation and step-by-step walkthrough of training a simple neural network for binary classification, using a dataset similar to the one provided above.

Problem Statement

We will use a simple dataset where each point has two features and a binary target (either 0
or 1). The goal is to train a neural network to classify these points correctly using
backpropagation and gradient descent.

Step-by-Step Python Example

1. Dataset Preparation

python

import numpy as np
import matplotlib.pyplot as plt

# Simple dataset: 4 points with two features and binary labels
X = np.array([[0.2, 0.8],
              [0.4, 0.6],
              [0.6, 0.4],
              [0.8, 0.2]])
y = np.array([[0], [0], [1], [1]])

# Plot dataset
plt.scatter(X[:, 0], X[:, 1], c=y.flatten(), cmap='viridis')
plt.title("Training Data")
plt.xlabel("Feature 1")
plt.ylabel("Feature 2")
plt.show()

2. Neural Network Functions

1. Activation Function (Sigmoid) and Its Derivative

python

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def sigmoid_derivative(a):
    # Note: expects the sigmoid output a = sigmoid(z), since sigma'(z) = a * (1 - a)
    return a * (1 - a)

3. Training the Neural Network


python

# Initialize weights and biases randomly
np.random.seed(1)
weights = np.random.randn(2, 1)
bias = np.random.randn(1)
learning_rate = 0.1

# Number of iterations for training
epochs = 10000

# Training loop
for epoch in range(epochs):
    # Forward propagation
    z = np.dot(X, weights) + bias
    predictions = sigmoid(z)

    # Compute the cost (binary cross-entropy loss)
    cost = np.mean(-y * np.log(predictions) - (1 - y) * np.log(1 - predictions))

    # Backpropagation
    error = predictions - y
    gradients = np.dot(X.T, error * sigmoid_derivative(predictions))
    bias_gradient = np.sum(error * sigmoid_derivative(predictions))

    # Update weights and bias
    weights -= learning_rate * gradients
    bias -= learning_rate * bias_gradient

    # Print cost every 1000 iterations
    if epoch % 1000 == 0:
        print(f"Epoch {epoch}, Cost: {cost}")

4. Visualizing Decision Boundary

python

# Generate a grid for visualization
xx, yy = np.meshgrid(np.linspace(0, 1, 100), np.linspace(0, 1, 100))
grid_points = np.c_[xx.ravel(), yy.ravel()]
grid_predictions = sigmoid(np.dot(grid_points, weights) + bias)

# Plot decision boundary
plt.contourf(xx, yy, grid_predictions.reshape(xx.shape), alpha=0.8, cmap='viridis')
plt.scatter(X[:, 0], X[:, 1], c=y.flatten(), edgecolors='k')
plt.title("Decision Boundary")
plt.xlabel("Feature 1")
plt.ylabel("Feature 2")
plt.show()

Final Output

• Cost: The cost decreases over epochs, showing convergence.

• Decision Boundary: A visualization will clearly separate points with labels 0 and 1.

Diagram Explanation

The decision boundary plot shows how the neural network separates the data points based
on learned weights and biases.
Extending the Neural Network with More Layers (Deep Learning Concepts)

To build a more powerful neural network, we introduce hidden layers, nonlinear activation
functions, and deeper architectures. Let's break it down step by step using modern libraries
for scalability and efficiency.

Key Concepts for Extending the Model

1. Hidden Layers: Intermediate layers between input and output to capture complex
patterns.

2. Activation Functions: Nonlinear functions applied at each layer to model complex data relationships. Examples:

o ReLU (Rectified Linear Unit): f(x) = \max(0, x)

o Sigmoid: Useful for binary classification.

o Softmax: For multi-class classification tasks.

3. Cost Function: Cross-entropy for classification problems.

4. Gradient Descent Variants: Optimizers like Adam and RMSprop improve training efficiency.

Building a Multi-Layer Neural Network using TensorFlow

We will use TensorFlow to create a deep neural network for the same classification problem.

1. Import Necessary Libraries

python

import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
import matplotlib.pyplot as plt

2. Prepare the Dataset


python

# Same dataset as before
X = np.array([[0.2, 0.8],
              [0.4, 0.6],
              [0.6, 0.4],
              [0.8, 0.2]])
y = np.array([[0], [0], [1], [1]])  # Binary labels

3. Define the Neural Network Architecture

python

# Define a sequential model
model = Sequential([
    Dense(4, input_dim=2, activation='relu'),  # Hidden layer with 4 neurons
    Dense(2, activation='relu'),               # Additional hidden layer
    Dense(1, activation='sigmoid')             # Output layer for binary classification
])

4. Compile and Train the Model

python

# Compile the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Train the model
history = model.fit(X, y, epochs=200, verbose=0)

# Plot the training loss over epochs
plt.plot(history.history['loss'])
plt.title("Model Loss During Training")
plt.xlabel("Epochs")
plt.ylabel("Loss")
plt.show()

5. Visualize the Decision Boundary

python

# Generate grid for visualization
xx, yy = np.meshgrid(np.linspace(0, 1, 100), np.linspace(0, 1, 100))
grid_points = np.c_[xx.ravel(), yy.ravel()]
grid_predictions = model.predict(grid_points)

# Plot decision boundary
plt.contourf(xx, yy, grid_predictions.reshape(xx.shape), alpha=0.8, cmap='viridis')
plt.scatter(X[:, 0], X[:, 1], c=y.flatten(), edgecolors='k')
plt.title("Decision Boundary (Deep Neural Network)")
plt.xlabel("Feature 1")
plt.ylabel("Feature 2")
plt.show()

Understanding the Extensions

1. Hidden Layers:

o Capture more complex relationships in data.

o Learn multiple levels of abstraction.


2. Activation Functions:

o ReLU helps avoid vanishing gradients.

o Sigmoid provides probabilistic outputs in binary classification.

3. Optimizers:

o Adam dynamically adjusts learning rates for efficient convergence.

4. Model Loss:

o A decreasing loss indicates a better fit to the data (a short prediction check on the trained model is sketched below).
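
As a closing check, the short sketch below (an assumed inspection step, not part of the original walkthrough) prints the trained deep network's predictions for the four training points defined earlier.

python

# Predicted probabilities for the four training points (X, y, and model come from the code above)
probs = model.predict(X)
for features, prob, label in zip(X, probs, y):
    print(f"input={features}, predicted probability={prob[0]:.3f}, actual label={label[0]}")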
