UNIT 4

Uploaded by Pandu snigdha

Feature Engineering

Feature Engineering is the process of creating new features or transforming existing features
to improve the performance of a machine-learning model. It involves selecting relevant
information from raw data and transforming it into a format that can be easily understood by
a model. The goal is to improve model accuracy by providing more meaningful and relevant
information.

What is Feature Engineering?

Feature engineering is the process of transforming raw data into features that are suitable
for machine learning models. In other words, it is the process of selecting, extracting, and
transforming the most relevant features from the available data to build more accurate and
efficient machine learning models.

The success of machine learning models heavily depends on the quality of the features used
to train them. Feature engineering involves a set of techniques that enable us to create new
features by combining or transforming the existing ones. These techniques help to highlight
the most important patterns and relationships in the data, which in turn helps the machine
learning model to learn from the data more effectively.

What is a Feature?

In the context of machine learning, a feature (also known as a variable or attribute) is an
individual measurable property or characteristic of a data point that is used as input for a
machine learning algorithm. Features can be numerical, categorical, or text-based, and they
represent different aspects of the data that are relevant to the problem at hand.

● For example, in a dataset of housing prices, features could include the number of
bedrooms, the square footage, the location, and the age of the property. In a
dataset of customer demographics, features could include age, gender, income
level, and occupation.
● The choice and quality of features are critical in machine learning, as they can
greatly impact the accuracy and performance of the model.

Need for Feature Engineering in Machine Learning

We engineer features for various reasons, and some of the main reasons include:

● Improve Model Performance: Well-chosen features expose the patterns that matter
for the prediction task, which can improve accuracy even when the learning
algorithm stays the same.
● Make Raw Data Usable: Most algorithms expect numerical input of a fixed size, so
categorical values, text, dates, and images must first be converted into numerical
features.
● Incorporate Domain Knowledge: Features such as ratios, aggregates, or
business-rule flags encode expert knowledge that a model might not discover on
its own.
● Reduce Overfitting and Cost: Removing irrelevant or redundant features lowers
the dimensionality of the data, which helps the model generalize and reduces
training time and storage.
● Improve Interpretability: Meaningful, well-named features make it easier to
explain what the model has learned.

Processes Involved in Feature Engineering

Feature engineering in machine learning consists mainly of five processes: Feature Creation,
Feature Transformation, Feature Extraction, Feature Selection, and Feature Scaling. It is an
iterative process that requires experimentation and testing to find the best combination of
features for a given problem. The success of a machine learning model largely depends on the
quality of the features used in the model.

1. Feature Creation

Feature Creation is the process of generating new features based on domain knowledge or by
observing patterns in the data. It is a form of feature engineering that can significantly
improve the performance of a machine-learning model.

Types of Feature Creation:

1. Domain-Specific: Creating new features based on domain knowledge, such as
creating features based on business rules or industry standards.
2. Data-Driven: Creating new features by observing patterns in the data, such as
calculating aggregations or creating interaction features.
3. Synthetic: Generating new features by combining existing features or
synthesizing new data points.

Why Feature Creation?

1. Improves Model Performance: By providing additional and more relevant
information to the model, feature creation can increase the accuracy and precision
of the model.
2. Increases Model Robustness: Well-designed features, such as aggregates or
ratios, can make the model less sensitive to outliers and other anomalies.
3. Improves Model Interpretability: Features that correspond to meaningful
quantities make it easier to understand the model’s predictions.
4. Increases Model Flexibility: By adding new features, the model can be made
more flexible to handle different types of data.
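
The three types of feature creation described above can be sketched with pandas as follows. This is only an illustration: the housing table, its column names, and the reference year are assumptions made for the example and do not come from a real dataset.

```python
import pandas as pd

# Hypothetical housing data; column names and values are illustrative only.
houses = pd.DataFrame({
    "sqft": [1500, 2000, 1000, 2600],
    "bedrooms": [3, 4, 2, 4],
    "year_built": [1990, 2005, 1975, 2015],
    "city": ["A", "B", "A", "B"],
})

# Domain-specific: property age relative to an assumed reference year.
REFERENCE_YEAR = 2024
houses["age"] = REFERENCE_YEAR - houses["year_built"]

# Synthetic (combination of existing features): living area per bedroom.
houses["sqft_per_bedroom"] = houses["sqft"] / houses["bedrooms"]

# Data-driven (aggregation): mean square footage of the city, repeated per row.
houses["city_mean_sqft"] = houses.groupby("city")["sqft"].transform("mean")

print(houses)
```

The aggregation deliberately uses an input column rather than the prediction target, since aggregating the target itself would leak label information into the features.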

2. Feature Transformation

Feature Transformation is the process of transforming the features into a more suitable
representation for the machine learning model. This is done to ensure that the model can
effectively learn from the data.

Types of Feature Transformation:

1. Normalization: Rescaling the features to a common range, such as between 0
and 1, to prevent features with large values from dominating others.
2. Scaling: Rescaling numerical features to a similar scale, such as a standard
deviation of 1, so that they can be compared more easily and no feature dominates
simply because of its units.
3. Encoding: Transforming categorical features into a numerical representation.
Examples are one-hot encoding and label encoding.
4. Transformation: Transforming the features using mathematical operations to
change the distribution or scale of the features. Examples are logarithmic, square
root, and reciprocal transformations.
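
As a rough illustration of the mathematical transformations in item 4, the sketch below applies logarithmic, square root, and reciprocal transformations to a small, made-up income array with NumPy. The values and the skewness helper are assumptions added for the example.

```python
import numpy as np

# Made-up, right-skewed values: three typical incomes and one very large one.
income = np.array([20_000.0, 35_000.0, 50_000.0, 1_000_000.0])

log_income = np.log1p(income)        # log(1 + x); also defined at x = 0
sqrt_income = np.sqrt(income)        # requires non-negative values
reciprocal_income = 1.0 / income     # requires non-zero values

def skewness(values):
    """Population skewness: third central moment over variance ** 1.5."""
    centered = values - values.mean()
    return (centered ** 3).mean() / (centered ** 2).mean() ** 1.5

print("skewness before:", skewness(income))
print("skewness after log:", skewness(log_income))
```

The log transformation compresses the large value much more than the small ones, which is why it is a common choice for right-skewed features.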

Why Feature Transformation?

1. Improves Model Performance: By transforming the features into a more suitable
representation, the model can learn more meaningful patterns in the data.
2. Increases Model Robustness: Transforming the features can make the model
more robust to outliers and other anomalies.
3. Improves Computational Efficiency: The transformed features often require
fewer computational resources.
4. Improves Model Interpretability: By transforming the features, it can be easier
to understand the model’s predictions.

3. Feature Extraction

Feature Extraction is the process of creating new features from existing ones to provide more
relevant information to the machine learning model. This is done by transforming,
combining, or aggregating existing features.

Types of Feature Extraction:

1. Dimensionality Reduction: Reducing the number of features by transforming the
data into a lower-dimensional space while retaining important information.
Examples are PCA and t-SNE.
2. Feature Combination: Combining two or more existing features to create a new
one. For example, the interaction between two features.
3. Feature Aggregation: Aggregating features to create a new one. For example,
calculating the mean, sum, or count of a set of features.
4. Feature Transformation: Transforming existing features into a new
representation. For example, log transformation of a feature with a skewed
distribution.
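
To make dimensionality reduction more concrete, here is a minimal sketch of PCA computed with a singular value decomposition in NumPy. The function name pca and the synthetic data are assumptions made for the example; in practice a library implementation would normally be used.

```python
import numpy as np

def pca(X, n_components):
    """Project X (samples x features) onto its top principal components."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, singular_values, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    explained_variance = singular_values ** 2 / (X.shape[0] - 1)
    ratio = explained_variance[:n_components] / explained_variance.sum()
    return X_centered @ components.T, components, ratio

# Synthetic data: the second column is almost a multiple of the first,
# and the third column is low-variance noise.
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([
    t,
    2 * t + 0.01 * rng.normal(size=200),
    0.1 * rng.normal(size=200),
])

Z, components, ratio = pca(X, n_components=1)
print("reduced shape:", Z.shape)
print("explained variance ratio:", ratio)
```

Because two of the three columns are nearly collinear, a single component retains almost all of the variance here.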

Why Feature Extraction?

1. Improves Model Performance: By creating new and more relevant features, the
model can learn more meaningful patterns in the data.
2. Reduces Overfitting: By reducing the dimensionality of the data, the model is
less likely to overfit the training data.
3. Improves Computational Efficiency: The transformed features often require
fewer computational resources.
4. Improves Model Interpretability: By creating new features, it can be easier to
understand the model’s predictions.

4. Feature Selection

Feature Selection is the process of selecting a subset of relevant features from the dataset to
be used in a machine-learning model. It is an important step in the feature engineering
process as it can have a significant impact on the model’s performance.

Types of Feature Selection:

1. Filter Method: Scores each feature with a statistical measure of its relationship
to the target variable, such as correlation, independently of any model. The
highest-scoring features are selected.
2. Wrapper Method: Evaluates candidate feature subsets by training a specific
machine learning algorithm on each. The feature subset that results in the best
performance is selected.
3. Embedded Method: Performs feature selection as part of the training process
of the machine learning algorithm, for example through L1 (lasso) regularization.
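
A filter method can be sketched in a few lines of pandas. The example below ranks features by the absolute value of their Pearson correlation with the target; the helper name select_by_correlation, the column names, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

def select_by_correlation(features, target, k):
    """Return the k features with the highest absolute correlation to target."""
    scores = features.corrwith(target).abs().sort_values(ascending=False)
    return scores.index[:k].tolist(), scores

# Synthetic data: the target depends on two columns and ignores the third.
rng = np.random.default_rng(42)
n = 500
data = pd.DataFrame({
    "signal_strong": rng.normal(size=n),
    "signal_weak": rng.normal(size=n),
    "noise": rng.normal(size=n),
})
target = (3.0 * data["signal_strong"]
          + 1.5 * data["signal_weak"]
          + rng.normal(scale=0.5, size=n))

selected, scores = select_by_correlation(data, target, k=2)
print(scores)
print("selected:", selected)
```

Correlation only captures linear relationships, so a filter based on it can miss features that matter in a non-linear way.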

Why Feature Selection?

1. Reduces Overfitting: By using only the most relevant features, the model can
generalize better to new data.
2. Improves Model Performance: Selecting the right features can improve the
accuracy, precision, and recall of the model.
3. Decreases Computational Costs: A smaller number of features requires less
computation and storage resources.
4. Improves Interpretability: By reducing the number of features, it is easier to
understand and interpret the results of the model.

5. Feature Scaling

Feature Scaling is the process of transforming the features so that they have a similar scale.
This is important in machine learning because the scale of the features can affect the
performance of the model.

Types of Feature Scaling:

1. Min-Max Scaling: Rescaling the features to a specific range, such as between 0
and 1, by subtracting the minimum value and dividing by the range.
2. Standard Scaling: Rescaling the features to have a mean of 0 and a standard
deviation of 1 by subtracting the mean and dividing by the standard deviation.
3. Robust Scaling: Rescaling the features to be robust to outliers by subtracting
the median and dividing by the interquartile range.

Why Feature Scaling?

1. Improves Model Performance: By transforming the features to have a similar
scale, the model can learn from all features equally and avoid being dominated by
a few large features.
2. Increases Model Robustness: By transforming the features to be robust to
outliers, the model can become more robust to anomalies.
3. Helps Scale-Sensitive Algorithms: Distance-based algorithms, such as k-nearest
neighbors, are sensitive to the scale of the features and perform better with scaled
features. Gradient-based training also tends to converge faster on scaled features.
4. Improves Model Interpretability: By transforming the features to have a similar
scale, it can be easier to understand the model’s predictions.
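
The three scaling methods listed above can be written directly from their definitions. The NumPy sketch below is illustrative only; the function names and the sample values (which include one deliberate outlier) are assumptions.

```python
import numpy as np

def min_max_scale(x):
    """Map values to [0, 1]: subtract the minimum, divide by the range."""
    return (x - x.min()) / (x.max() - x.min())

def standard_scale(x):
    """Give values zero mean and unit standard deviation."""
    return (x - x.mean()) / x.std()

def robust_scale(x):
    """Subtract the median and divide by the interquartile range."""
    q1, median, q3 = np.percentile(x, [25, 50, 75])
    return (x - median) / (q3 - q1)

values = np.array([1.0, 2.0, 3.0, 4.0, 100.0])  # 100.0 is an outlier

print(min_max_scale(values))
print(standard_scale(values))
print(robust_scale(values))
```

With min-max scaling the outlier squeezes the other four values into a narrow band near 0, whereas robust scaling keeps them evenly spread because the median and interquartile range ignore the extreme value.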

Overall, the goal of feature engineering is to create a set of informative and relevant features
that can be used to train a machine learning model and improve its accuracy and
performance. The specific steps involved in the process may vary depending on the type of
data and the specific machine-learning problem at hand.

Techniques Used in Feature Engineering

Feature engineering is the process of transforming raw data into features that are suitable for
machine learning models. There are various techniques that can be used in feature
engineering to create new features by combining or transforming the existing ones. The
following are some of the commonly used feature engineering techniques:

One-Hot Encoding

One-hot encoding is a technique used to transform categorical variables into numerical values
that can be used by machine learning models. In this technique, each category is transformed
into a binary value indicating its presence or absence. For example, consider a categorical
variable “Colour” with three categories: Red, Green, and Blue. One-hot encoding would
transform this variable into three binary variables: Colour_Red, Colour_Green, and
Colour_Blue, where the value of each variable would be 1 if the corresponding category is
present and 0 otherwise.
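
The Colour example above can be reproduced with pandas. This is a minimal sketch; the four sample rows are made up, and pd.get_dummies is only one of several ways to one-hot encode a column.

```python
import pandas as pd

df = pd.DataFrame({"Colour": ["Red", "Green", "Blue", "Green"]})

# One binary column per category; dtype=int gives 0/1 instead of booleans.
encoded = pd.get_dummies(df, columns=["Colour"], dtype=int)

print(encoded)
```

Each row contains exactly one 1, in the column that corresponds to its original category.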

Binning

Binning is a technique used to transform continuous variables into categorical variables. In
this technique, the range of values of the continuous variable is divided into several bins, and
each bin is assigned a categorical value. For example, consider a continuous variable “Age”
with values ranging from 18 to 80. Binning would divide this variable into several age groups
such as 18-25, 26-35, 36-50, and 51-80, and assign a categorical value to each age group.
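
The age groups from the example above can be produced with pandas. The sketch assumes integer ages and uses pd.cut with right-inclusive bin edges; the sample ages are made up.

```python
import pandas as pd

ages = pd.Series([18, 25, 26, 35, 36, 50, 51, 80])

# Edges are right-inclusive by default: (17, 25], (25, 35], (35, 50], (50, 80].
bins = [17, 25, 35, 50, 80]
labels = ["18-25", "26-35", "36-50", "51-80"]

age_group = pd.cut(ages, bins=bins, labels=labels)

print(pd.DataFrame({"age": ages, "age_group": age_group}))
```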

Scaling

The most common scaling techniques are standardization and normalization. Standardization
scales the variable so that it has zero mean and unit variance. Normalization scales the
variable so that it has a range of values between 0 and 1.

Feature Split

Feature splitting divides a single feature into multiple sub-features based on specific
criteria, for example splitting a full name into first and last name, or a timestamp into year,
month, and hour. Exposing these components separately can help the model capture
relationships and patterns that are hidden in the combined value.
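
A short pandas sketch of feature splitting follows. The two-row table, its column names, and the choice of which components to extract are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "full_name": ["Ada Lovelace", "Alan Turing"],
    "timestamp": pd.to_datetime(["2023-01-15 08:30:00", "2023-07-04 21:45:00"]),
})

# Split a text feature on the first space into two new columns.
df[["first_name", "last_name"]] = df["full_name"].str.split(" ", n=1, expand=True)

# Split a timestamp into calendar and clock components.
df["year"] = df["timestamp"].dt.year
df["month"] = df["timestamp"].dt.month
df["hour"] = df["timestamp"].dt.hour

print(df)
```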

Text Data Preprocessing

Text data requires special preprocessing techniques before it can be used by machine learning
models. Text preprocessing involves removing stop words, stemming, lemmatization, and
vectorization. Stop words are common words that do not add much meaning to the text, such
as “the” and “and”. Stemming reduces words to a root form by stripping suffixes with simple
rules, such as converting “running” to “run”; the result is not always a real word (for example,
“studies” becomes “studi”). Lemmatization also reduces words to a base form, but it uses a
vocabulary and the word's part of speech to return a dictionary form, such as converting
“better” to “good”. Vectorization involves transforming text data into numerical vectors that
can be used by machine learning models.
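
The stop-word removal and vectorization steps can be sketched in plain Python with a bag-of-words representation. The stop-word list here is a tiny illustrative assumption, and stemming and lemmatization are left out because they normally rely on a dedicated NLP library.

```python
import re
from collections import Counter

# A tiny, illustrative stop-word list; real lists are much longer.
STOP_WORDS = {"the", "and", "a", "is", "on", "of"}

def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [tok for tok in tokens if tok not in STOP_WORDS]

def bag_of_words(documents):
    """Return a sorted vocabulary and one count vector per document."""
    tokenized = [tokenize(doc) for doc in documents]
    vocabulary = sorted({tok for doc in tokenized for tok in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocabulary])
    return vocabulary, vectors

docs = ["The cat sat on the mat", "The dog and the cat"]
vocabulary, vectors = bag_of_words(docs)

print(vocabulary)
print(vectors)
```

Each document becomes a vector with one position per vocabulary term, so documents of different lengths end up with the same number of features.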

Moving Window Functions in Time Series Analysis

Definition

A moving window function computes a statistic over a fixed-size sliding window of data in a
time series. It is useful for smoothing, trend detection, and calculating rolling statistics. The
method and parameter names used below are those of the pandas library.

Key Concepts

Window Size

● Smaller Window: Captures short-term fluctuations but is sensitive to noise.
● Larger Window: Provides smoother results but may miss short-term patterns.

Centering

● Default: Trailing window (anchored to the right), so each result uses only the current
and earlier observations.
● Centered Window: Pass center=True to compute statistics over a window centered on
each observation.

Functions Supported

● Rolling Mean: Smooths the data.
● Other Stats: Minimum, maximum, standard deviation, etc.
● Custom Functions: Use .apply() to define your own function, e.g., interquartile range.

Handling NA Values

● Use min_periods to specify the minimum number of non-NA values needed for calculation.
● Results are NaN for windows with insufficient valid data.

Multiple Window Sizes

● Analyze both short- and long-term trends by calling .rolling() separately with different
window sizes and comparing the results.

Techniques and Applications

Trend Estimation

● Compute rolling averages to identify long-term trends and remove short-term noise.

Smoothing

● Smooth noisy data for clearer patterns using rolling statistics.

Seasonality Detection

● Detect repeating patterns by setting a window size matching the seasonal cycle.

Outlier Detection

● Identify anomalies with robust statistics like the rolling median.

Forecasting

● Use rolling stats (mean, variance) as inputs for predictive models.

Advanced Methods

Expanding Window Mean

● Includes all observations up to the current time point.
● Use .expanding().mean() to compute cumulative statistics.

Exponentially Weighted Moving Average (EWMA)

● Weights decline exponentially with time, emphasizing recent data.
● Control smoothness with alpha (0 < alpha ≤ 1). Higher alpha gives more weight to recent
values.
● Formula: $S_t = \alpha X_t + (1 - \alpha) S_{t-1}$

Correlation Analysis

● Use .rolling().corr() to compute rolling correlations between two time series.

User-Defined Functions

● Interquartile Range (IQR): Measures spread using custom functions with .rolling().apply().
● Percentile Rank: Compute the percentile rank of a value using scipy.stats.percentileofscore.

Code Examples:

# Assumes ts is a numeric pandas Series indexed by time
# and scores is a sequence of numbers.
from scipy.stats import percentileofscore

# Rolling Mean
rolling_mean = ts.rolling(window=3).mean()

# Expanding Mean
exp_mean = ts.expanding().mean()

# Exponentially Weighted Mean
# adjust=False applies the recursive formula given above.
ewma = ts.ewm(alpha=0.3, adjust=False).mean()

# Custom Rolling Function (IQR)
def rolling_iqr(window):
    return window.quantile(0.75) - window.quantile(0.25)

iqr = ts.rolling(window=20).apply(rolling_iqr)

# Percentile Rank of the value 9 within scores
rank = percentileofscore(scores, 9)

These techniques provide powerful tools for time series analysis, enabling the extraction of
meaningful insights and patterns.
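
As a worked illustration of outlier detection with a rolling median, the sketch below builds a synthetic sine-wave series with one injected spike and flags points that lie far from the centered rolling median. The window size of 7 and the threshold of 1.0 are arbitrary assumptions chosen for this series.

```python
import numpy as np
import pandas as pd

# Synthetic series: a smooth sine wave with one injected spike at position 30.
values = np.sin(np.linspace(0, 4 * np.pi, 60))
values[30] += 5.0
ts = pd.Series(values)

# Centered rolling median; min_periods=1 avoids NaN at the two ends.
rolling_median = ts.rolling(window=7, center=True, min_periods=1).median()

# Flag points whose distance from the local median exceeds the threshold.
THRESHOLD = 1.0
residual = (ts - rolling_median).abs()
outlier_index = ts.index[residual > THRESHOLD]

print("outliers at positions:", outlier_index.tolist())
```

The median is used instead of the mean because a single extreme value barely moves it, so the spike does not hide itself by shifting the local baseline.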

Fourier Decomposition

Fourier decomposition is very mathematical and not at all obvious. Figure 5-16 shows an
example of the technique. Any N point signal can be decomposed into N + 2 signals, half of
them sine waves and half of them cosine waves. The lowest frequency cosine wave (called
xC0[n] in this illustration), makes zero complete cycles over the N samples, i.e., it is a DC
signal. The next cosine components: xC1[n], xC2[n], and xC3[n], make 1, 2, and 3 complete
cycles over the N samples, respectively. This pattern holds for the remainder of the cosine
waves, as well as for the sine wave components. Since the frequency of each component is
fixed, the only thing that changes for different signals being decomposed is the amplitude of
each of the sine and cosine waves.

Fourier decomposition is important for three reasons. First, a wide variety of signals are
inherently created from superimposed sinusoids. Audio signals are a good example of this.
Fourier decomposition provides a direct analysis of the information contained in these types of
signals. Second, linear systems respond to sinusoids in a unique way: a sinusoidal input always
results in a sinusoidal output. In this approach, systems are characterized by how they change
the amplitude and phase of sinusoids passing through them. Since an input signal can be
decomposed into sinusoids, knowing how a system will react to sinusoids allows the output of
the system to be found. Third, the Fourier decomposition is the basis for a broad and powerful
area of mathematics called Fourier analysis, and the even more advanced Laplace and
z-transforms. Most cutting-edge DSP algorithms are based on some aspect of these techniques.

Why is it even possible to decompose an arbitrary signal into sine and cosine waves? How are
the amplitudes of these sinusoids determined for a particular signal? What kinds of systems
can be designed with this technique? These are the questions to be answered in later chapters.
The details of the Fourier decomposition are too involved to be presented in this brief
overview. For now, the important idea to understand is that when all of the component
sinusoids are added together, the original signal is exactly reconstructed.
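
The decomposition and exact reconstruction described above can be checked numerically. The sketch below uses NumPy's real FFT to obtain the amplitudes of the N/2 + 1 cosine waves and N/2 + 1 sine waves of a 16-point signal. The function name fourier_decompose, the random test signal, and the restriction to even N are assumptions made for this example.

```python
import numpy as np

def fourier_decompose(x):
    """Split an even-length real signal into N/2 + 1 cosine and N/2 + 1 sine waves."""
    x = np.asarray(x, dtype=float)
    N = x.size
    if N % 2 != 0:
        raise ValueError("this sketch assumes an even number of samples")
    spectrum = np.fft.rfft(x)                 # N/2 + 1 complex coefficients
    k = np.arange(N // 2 + 1)[:, None]        # cycles completed over N samples
    n = np.arange(N)[None, :]                 # sample index
    # Convert FFT coefficients to cosine and sine amplitudes.
    cos_amp = spectrum.real / (N / 2)
    sin_amp = -spectrum.imag / (N / 2)
    cos_amp[0] /= 2                           # DC component
    cos_amp[-1] /= 2                          # highest-frequency component
    cos_waves = cos_amp[:, None] * np.cos(2 * np.pi * k * n / N)
    sin_waves = sin_amp[:, None] * np.sin(2 * np.pi * k * n / N)
    return cos_waves, sin_waves

rng = np.random.default_rng(0)
signal = rng.normal(size=16)
cos_waves, sin_waves = fourier_decompose(signal)

# Adding all component sinusoids gives back the original signal.
reconstructed = cos_waves.sum(axis=0) + sin_waves.sum(axis=0)
print("number of component signals:", len(cos_waves) + len(sin_waves))
print("max reconstruction error:", np.abs(reconstructed - signal).max())
```

For N = 16 this yields 9 cosine waves and 9 sine waves, that is N + 2 = 18 component signals. The first cosine wave is constant (the DC signal), and the first and last sine waves are identically zero.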

Long Short-Term Memory (LSTM):

Long Short-Term Memory (LSTM) networks are a type of recurrent neural network (RNN)
designed to better capture long-range dependencies in sequences of data. In deep learning,
LSTMs are particularly useful for tasks where the order of inputs is crucial, such as time
series forecasting, speech recognition, language modeling, and many other sequence-related
tasks.

Key Features of LSTM in Deep Learning:

1. Memory Cell:

○ LSTM introduces a memory cell that allows the network to maintain
information over long periods. This helps address the vanishing gradient
problem often encountered in traditional RNNs, making it better suited for
learning long-term dependencies.
2. Gates:

○ LSTMs utilize gates to control the flow of information. These gates decide
what information should be updated, forgotten, or passed forward. The three
main gates in LSTM are:
■ Forget Gate: Decides which information should be discarded from the
memory cell.
■ Input Gate: Determines what new information should be added to the
memory cell.
■ Output Gate: Controls the output of the memory cell, deciding what
information to pass to the next layer or timestep.
3. Cell State:

○ The cell state is the internal memory of the LSTM unit. It carries information
across time steps and is updated by the gates. It can carry important features
that were learned earlier, allowing the network to maintain long-term context.
4. Capturing Long-Range Dependencies:

○ Unlike vanilla RNNs, LSTMs are able to capture long-range dependencies in
sequential data, making them well-suited for tasks like natural language
processing (NLP) and time series analysis.
5. Gradient Flow:

○ LSTM mitigates the vanishing gradient problem because the cell state is
updated additively and scaled only by the forget gate, so gradients can flow
across many time steps. Exploding gradients can still occur and are usually
controlled with gradient clipping.
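
To show how the gates and the cell state interact, here is a minimal NumPy sketch of one forward step of a standard LSTM cell. The function name lstm_step, the way the four weight blocks are stacked in one matrix, and the random weights are assumptions for illustration; no training is performed.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step.

    W has shape (4 * hidden, hidden + inputs); its row blocks are stacked in
    the order forget gate, input gate, candidate values, output gate.
    """
    hidden = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x]) + b
    f = sigmoid(z[0:hidden])                  # forget gate: what to discard
    i = sigmoid(z[hidden:2 * hidden])         # input gate: what to add
    g = np.tanh(z[2 * hidden:3 * hidden])     # candidate values to add
    o = sigmoid(z[3 * hidden:])               # output gate: what to expose
    c = f * c_prev + i * g                    # additive cell-state update
    h = o * np.tanh(c)                        # new hidden state
    return h, c

rng = np.random.default_rng(0)
n_inputs, n_hidden = 3, 4
W = rng.normal(scale=0.1, size=(4 * n_hidden, n_hidden + n_inputs))
b = np.zeros(4 * n_hidden)

h = np.zeros(n_hidden)
c = np.zeros(n_hidden)
for x in rng.normal(size=(5, n_inputs)):      # a sequence of 5 time steps
    h, c = lstm_step(x, h, c, W, b)

print("final hidden state:", h)
print("final cell state:", c)
```

When the forget gate is close to 1 and the input gate is close to 0, the cell state is copied forward almost unchanged, which is how the cell can retain information over many steps.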

Applications of LSTM in Deep Learning:

1. Natural Language Processing (NLP):

○ Machine Translation: LSTMs can be used in sequence-to-sequence (seq2seq)
models to translate sentences from one language to another.
○ Text Generation: By learning patterns in text, LSTMs can generate
human-like text sequences, which is used in applications like chatbots and
language models.
○ Speech Recognition: LSTMs are used in speech-to-text systems to capture
temporal dependencies in spoken language.
2. Time Series Forecasting:

○ LSTMs are highly effective for predicting future values in time series data
(e.g., stock prices, weather forecasting, energy demand) because of their
ability to learn long-term temporal dependencies.
3. Anomaly Detection:

○ LSTMs can be applied to detect anomalies in sequential data. For example, in
network traffic or sensor data, unusual patterns or outliers can be detected by
examining deviations from learned sequences.
4. Video Analysis:

○ LSTMs can be used for action recognition in videos, where they model
temporal dependencies between frames to identify actions or events over time.
5. Healthcare and Bioinformatics:

○ LSTMs are used for analyzing medical time series data, such as heart rate
signals, EEG data, or patient health records, where sequential patterns are
important.

Advantages of LSTM:

● Handling Long-Term Dependencies: LSTMs overcome the limitations of traditional
RNNs in capturing long-term dependencies by preserving information for long
periods.
● Flexibility: They are versatile and can be applied to many different types of
sequential data, from text to time series.
● Improved Training: LSTM networks mitigate the vanishing gradient problem, making
it easier to train networks on long sequences.

Challenges:

● Complexity: LSTMs are computationally more complex than traditional RNNs,
which can make training slower and require more resources.
● Overfitting: Like many deep learning models, LSTMs are prone to overfitting,
especially when the dataset is small.
● Memory and Computation: Due to the gating mechanisms and additional
parameters, LSTMs can be more resource-intensive than simpler models.

Advanced Variants and Related Techniques:

1. Bidirectional LSTM: Processes the sequence in both forward and backward
directions to capture dependencies from both past and future contexts.
2. Attention Mechanism: Not an LSTM variant itself, but often combined with LSTM
to focus on the most relevant parts of the input sequence, improving performance
in tasks like machine translation.
3. GRU (Gated Recurrent Unit): A related, simpler gated architecture that uses fewer
gates than LSTM and can perform similarly in many tasks, making it computationally
more efficient.

Image Histogram:

An image histogram is a graphical representation of the distribution of pixel intensities (or
color values) in an image. It is a very useful tool in image processing for analyzing the tonal
distribution and contrast in an image. A histogram provides insights into the brightness,
contrast, and overall quality of the image.

Key Concepts of an Image Histogram:

1. Pixel Intensity:

○ Each pixel has an intensity value (0-255 for grayscale, or separate values for
Red, Green, and Blue in color images).
2. X-axis: Intensity Levels:

○ The x-axis represents the range of intensity values (0 to 255 for grayscale or
separate channels for RGB).
3. Y-axis: Frequency:

○ The y-axis shows the number of pixels at each intensity level, representing the
distribution of pixel values.

Types of Histograms:

1. Grayscale Histogram:

○ For a grayscale image, the histogram consists of 256 bins (one for each
possible intensity level from 0 to 255). A grayscale histogram indicates how
many pixels in the image have a particular gray level intensity.
2. Color Histogram:

○ For a color image, there are three histograms: one for each color channel (Red,
Green, Blue). Each channel's histogram will also have 256 bins, indicating the
distribution of that particular color intensity.

Applications of Image Histograms:

1. Image Enhancement:

○ Histograms are useful for various image enhancement techniques like
contrast adjustment and histogram equalization. By manipulating the
histogram, the contrast of the image can be improved.
○ Histogram Equalization: This technique adjusts the pixel intensity
distribution to make the image's histogram more uniform. It can help in
improving the visibility of features in low-contrast images.
2. Thresholding:

○ Histograms help in setting thresholds for image segmentation. By analyzing
the distribution of pixel values, a threshold can be set to separate objects from
the background.
3. Image Classification:

○ Histograms can be used as features in machine learning algorithms for image
classification tasks. The distribution of pixel values can provide meaningful
information about the content of the image.
4. Identifying Exposure Issues:

○ A histogram can quickly reveal whether an image is overexposed (too many
pixels near 255, indicating too much white) or underexposed (too many pixels
near 0, indicating too much black).
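
Histogram equalization, mentioned under Image Enhancement above, can be sketched with NumPy alone by mapping each intensity through the normalized cumulative histogram. The function name equalize_histogram and the synthetic low-contrast image are assumptions for illustration; library routines such as OpenCV's equalizeHist would normally be used instead.

```python
import numpy as np

def equalize_histogram(image):
    """Equalize a 2-D uint8 grayscale image using its cumulative histogram."""
    hist = np.bincount(image.ravel(), minlength=256)   # pixels per intensity
    cdf = hist.cumsum()                                # cumulative pixel counts
    cdf_min = cdf[cdf > 0][0]                          # count at darkest level present
    denom = image.size - cdf_min
    if denom == 0:                                     # constant image: nothing to spread
        return image.copy()
    # Lookup table that stretches the cumulative distribution over 0..255.
    lut = np.round((cdf - cdf_min) / denom * 255).clip(0, 255).astype(np.uint8)
    return lut[image]

# Synthetic low-contrast image: all intensities lie between 100 and 140.
rng = np.random.default_rng(0)
low_contrast = rng.integers(100, 141, size=(64, 64), dtype=np.uint8)
equalized = equalize_histogram(low_contrast)

print("before:", low_contrast.min(), "to", low_contrast.max())
print("after: ", equalized.min(), "to", equalized.max())
```

After equalization the darkest level present maps to 0 and the brightest to 255, so the image uses the full intensity range while keeping the relative order of pixel values.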

Example of Histogram Analysis:

● Underexposed Image: The histogram might be concentrated towards the left (dark
regions), with few pixels in the higher intensity range.
● Overexposed Image: The histogram might be skewed towards the right (bright
regions), with few pixels in the lower intensity range.
● Balanced Image: A well-balanced histogram will have a distribution of pixel
intensities across the full range, with no significant concentration at one end.

Example in Color Image (RGB):

For a color image, you will usually see three separate histograms:

● Red Histogram: Shows the distribution of the red channel intensity values.
● Green Histogram: Shows the distribution of the green channel intensity values.
● Blue Histogram: Shows the distribution of the blue channel intensity values.

Each of these histograms can help in adjusting the color balance of the image. If one channel
(say, red) dominates, the image may appear too red, and adjustments can be made to correct
the balance.

Tools to Generate and Analyze Histograms:

● Most image processing software, such as Photoshop, GIMP, or programming
libraries like OpenCV or MATLAB, provide tools to generate and manipulate
histograms.
● In Python, Matplotlib and OpenCV are commonly used for visualizing and
processing image histograms.

Example Code in Python (OpenCV and Matplotlib):

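import cv2
import matplotlib.pyplot as plt

# Read the image in grayscale ('image.jpg' is a placeholder path)
image = cv2.imread('image.jpg', cv2.IMREAD_GRAYSCALE)
if image is None:
    raise FileNotFoundError('image.jpg could not be read')

# Calculate the histogram: channel 0, no mask, 256 bins over the range [0, 256)
histogram = cv2.calcHist([image], [0], None, [256], [0, 256])

# Plot the histogram
plt.plot(histogram)
plt.title('Image Histogram')
plt.xlabel('Pixel Intensity')
plt.ylabel('Frequency')
plt.show()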

Scale-Invariant Feature Transform (SIFT):

Scale-Invariant Feature Transform (SIFT) is a robust algorithm in computer vision for
detecting and describing local features in images, particularly useful for tasks like object
recognition, image stitching, and 3D reconstruction. It was developed by David Lowe in 1999
and has since become one of the most popular methods for feature extraction. SIFT is
invariant to changes in scale, rotation, and partially invariant to changes in illumination and
affine transformations.

Key Points of SIFT:

1. Scale-Invariance:

○ SIFT detects features that are invariant to scale, meaning it can find the same features
in images that are resized.
2. Rotation-Invariance:

○ SIFT can identify the same features even if the image is rotated.
3. Keypoint Detection:

○ SIFT finds keypoints as local extrema of a difference-of-Gaussians function computed
across multiple scales. These are typically blob-like or corner-like structures;
low-contrast points and points lying along edges are discarded.
4. Orientation Assignment:

○ Each keypoint is assigned a consistent orientation based on the local image gradient.
This ensures rotation-invariance.
5. Feature Description:

○ SIFT generates a descriptor for each keypoint using the gradient orientations around
the keypoint, creating a robust feature vector for matching keypoints across images.
6. Robust to Noise and Illumination:

○ The algorithm is resistant to changes in lighting, noise, and slight distortions.
7. Matching Features:

○ Once keypoints are detected and described, SIFT can be used to match corresponding
keypoints across different images, useful in tasks like image stitching, 3D
reconstruction, and object recognition.
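
The matching step can be illustrated without any images by treating descriptors as plain vectors. The sketch below performs nearest-neighbor matching with a ratio test, a filtering rule commonly used with SIFT that is not described in the text above. The descriptors are random 128-dimensional vectors rather than real SIFT output, and the function name and the 0.75 ratio are assumptions.

```python
import numpy as np

def match_descriptors(desc_a, desc_b, ratio=0.75):
    """Match each row of desc_a to its nearest row in desc_b.

    A match is kept only if the nearest neighbor is clearly closer than the
    second nearest (ratio test). desc_b must contain at least two rows.
    """
    # Pairwise Euclidean distances, shape (len(desc_a), len(desc_b)).
    dists = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    order = np.argsort(dists, axis=1)
    matches = []
    for i in range(desc_a.shape[0]):
        best, second = order[i, 0], order[i, 1]
        if dists[i, best] < ratio * dists[i, second]:
            matches.append((i, int(best)))
    return matches

# Stand-in descriptors: the second set is a shuffled, slightly noisy copy.
rng = np.random.default_rng(0)
desc_a = rng.random((5, 128)).astype(np.float32)
perm = np.array([3, 0, 4, 1, 2])
noise = rng.normal(scale=0.01, size=(5, 128)).astype(np.float32)
desc_b = desc_a[perm] + noise

matches = match_descriptors(desc_a, desc_b)
print(matches)
```

With real images, desc_a and desc_b would be the descriptor arrays returned by a SIFT detector for the two images, and each pair in matches would link a keypoint in one image to a keypoint in the other.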

Applications of SIFT:

● Object Recognition: SIFT features can be used to recognize objects in images by
matching the keypoints between different images of the same object.
● Image Stitching: SIFT can align images for panorama creation by matching keypoints
between overlapping images.
● 3D Reconstruction: SIFT can be used in stereo vision to reconstruct 3D scenes by
matching keypoints from different views.
● Motion Tracking: In video sequences, SIFT can track objects by matching keypoints
across frames.
● Robot Navigation: SIFT helps robots navigate by identifying landmarks through
feature matching.

Limitations:

● Computationally Expensive: SIFT can be slow, especially on large images, as it
involves processing at multiple scales and computing gradients.
● Patent Issues: SIFT was patented, and commercial use of the algorithm required
licensing. This led to alternatives like ORB (Oriented FAST and Rotated BRIEF)
being developed. The patent expired in 2020, and SIFT is now part of the main
OpenCV module.

Sample Code:

import cv2

# Load the image in grayscale ('image.jpg' is a placeholder path)
image = cv2.imread('image.jpg', cv2.IMREAD_GRAYSCALE)
if image is None:
    raise FileNotFoundError('image.jpg could not be read')

# Initialize SIFT detector
sift = cv2.SIFT_create()

# Detect keypoints and compute descriptors (None means no mask)
keypoints, descriptors = sift.detectAndCompute(image, None)

# Draw keypoints on the image
image_with_keypoints = cv2.drawKeypoints(image, keypoints, None)

# Show the image with keypoints until a key is pressed
cv2.imshow('SIFT Keypoints', image_with_keypoints)
cv2.waitKey(0)
cv2.destroyAllWindows()

Convolutional Neural Networks (CNN):

Convolutional Neural Networks (CNNs) are a class of deep learning algorithms specifically
designed for processing structured grid data, such as images. They have revolutionized the
field of computer vision and are now widely used for tasks like image classification, object
detection, segmentation, and more. CNNs leverage several key features that make them
particularly effective for visual data processing.

Key Features of CNNs in Deep Learning:

Convolutional Layers:

● Core Operation: Slide a filter over the image to detect features (edges, textures).
● Multiple Filters: Different filters capture different features (e.g., horizontal/vertical
edges).

Pooling Layers:

● Max Pooling: Downsamples by selecting the max value in a region (e.g., 2x2).
● Average Pooling: Takes the average value in a region.
● Purpose: Reduces spatial dimensions, lessens computation, and improves translation
invariance.
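
As a rough sketch of both pooling types, assuming non-overlapping 2x2 regions (the helper name pool2d and the sample feature map are made up for illustration):

```python
import numpy as np

def pool2d(feature_map, size=2, mode="max"):
    """Non-overlapping pooling over size x size regions (stride equals size)."""
    h, w = feature_map.shape
    # Drop trailing rows/columns that do not fill a complete region
    trimmed = feature_map[:h - h % size, :w - w % size]
    # Split into blocks: axes 1 and 3 index positions inside each region
    blocks = trimmed.reshape(h // size, size, w // size, size)
    if mode == "max":
        return blocks.max(axis=(1, 3))
    if mode == "average":
        return blocks.mean(axis=(1, 3))
    raise ValueError("mode must be 'max' or 'average'")

fmap = np.array([[1, 3, 2, 0],
                 [4, 6, 1, 1],
                 [0, 2, 9, 5],
                 [1, 1, 7, 3]], dtype=float)

max_pooled = pool2d(fmap, 2, "max")        # [[6, 2], [2, 9]]
avg_pooled = pool2d(fmap, 2, "average")    # [[3.5, 1.0], [1.0, 6.0]]
print(max_pooled)
print(avg_pooled)
```

The 4x4 map shrinks to 2x2 in both cases, so the next layer has a quarter as many values to process.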

Fully Connected Layers:

● After convolution and pooling, the feature maps are flattened into a 1D vector, and
these layers combine those features to make the final prediction.
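
A minimal sketch of the flatten-then-predict step, assuming a stack of 8 feature maps of size 4x4 and 3 output classes. The shapes and the random weights are placeholders for illustration; in a trained network the weights would be learned.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed output of the last pooling layer: 8 feature maps, each 4x4
feature_maps = rng.standard_normal((8, 4, 4))

# Flatten all maps into a single 1D vector of 8 * 4 * 4 = 128 values
flat = feature_maps.reshape(-1)

# Fully connected layer: every output is connected to every flattened value
num_classes = 3
W = rng.standard_normal((num_classes, flat.size)) * 0.01   # placeholder weights
b = np.zeros(num_classes)

scores = W @ flat + b    # one raw score per class
print(flat.shape, scores.shape)
```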

Activation Functions:

● ReLU: Introduces non-linearity, helping the model learn complex patterns.

● Others: Sigmoid (binary outputs) and Softmax (multi-class probabilities), typically used
at the output layer for classification tasks.
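
These activation functions are short enough to write out directly. The sketch below uses NumPy, and the sample scores are made up for illustration:

```python
import numpy as np

def relu(x):
    """Keep positive values, set negative values to zero."""
    return np.maximum(0, x)

def sigmoid(x):
    """Squash each value into (0, 1); used for binary outputs."""
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    """Turn a vector of scores into probabilities that sum to 1."""
    shifted = x - np.max(x)      # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

scores = np.array([-2.0, 0.0, 3.0])

print(relu(scores))       # negatives are clipped to 0
print(sigmoid(scores))    # each value mapped independently into (0, 1)
print(softmax(scores))    # the largest score gets the largest probability
```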

Stride:

● Controls how much the filter moves during convolution. Larger strides reduce feature
map size and computation.

Padding:

● Adds extra pixels (usually zeros) around the input to preserve spatial dimensions.
Types: "valid" (no padding), "same" (output size matches the input when the stride is 1).
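
The combined effect of stride and padding on feature map size follows the standard formula output = floor((n + 2p - k) / s) + 1, where n is the input size, k the filter size, p the padding per side, and s the stride. A small sketch (the function names are chosen for this example, and same_padding assumes an odd filter size and stride 1):

```python
def conv_output_size(n, k, stride=1, padding=0):
    """Output size along one dimension: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * padding - k) // stride + 1

def same_padding(k):
    """Padding per side that keeps the size unchanged (stride 1, odd k assumed)."""
    return (k - 1) // 2

# 32-pixel input, 3x3 filter
print(conv_output_size(32, 3))                          # "valid": 30
print(conv_output_size(32, 3, padding=same_padding(3))) # "same": 32
print(conv_output_size(32, 3, stride=2, padding=1))     # larger stride: 16
```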

Local Receptive Fields:

● Each neuron connects to a small image region, focusing on local patterns and
reducing parameters.

Weight Sharing:

● The same filter is applied across different regions of the image, reducing parameters
and overfitting.
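
To see how much local receptive fields and weight sharing save, one can count parameters. The sketch below assumes a 32x32 RGB input, 16 filters of size 3x3, and, for comparison, a fully connected layer with 16 units; all of these sizes are illustrative choices. The two layers do not produce the same number of outputs, so the comparison only shows that the convolutional count does not grow with image size.

```python
def conv_params(kernel_size, in_channels, num_filters):
    """Each filter has kernel_size * kernel_size * in_channels weights plus one bias."""
    return (kernel_size * kernel_size * in_channels + 1) * num_filters

def dense_params(num_inputs, num_outputs):
    """Each output unit has one weight per input plus one bias."""
    return (num_inputs + 1) * num_outputs

height, width, channels = 32, 32, 3

conv_count = conv_params(3, channels, 16)
dense_count = dense_params(height * width * channels, 16)

print(conv_count)    # 448, independent of the image size
print(dense_count)   # 49168, grows with every extra pixel
```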

Hierarchical Feature Learning:

● CNNs learn from low-level features (edges, textures) to high-level features (objects,
shapes).

Transfer Learning:

● Pre-trained models (like VGG, ResNet) can be reused and fine-tuned for specific
tasks, saving data and training time.

Applications of CNNs:

1. Image Classification:
○ CNNs are widely used to classify images into categories. For example,
recognizing whether an image contains a dog, cat, or other objects.
2. Object Detection:
○ CNNs can identify the locations of objects within an image and classify them.
This is commonly done using architectures like YOLO (You Only Look
Once) or Faster R-CNN.
3. Image Segmentation:
○ CNNs are used for pixel-wise classification, where each pixel is labeled as
part of a particular object or background. Techniques like U-Net or FCN
(Fully Convolutional Network) are used for segmentation tasks.
4. Face Recognition:
○ CNNs are commonly used in facial recognition systems, where they learn
distinctive features of faces and match them across different images.

5. Medical Image Analysis:
○ CNNs have been successfully applied in the medical field to analyze MRI
scans, X-rays, and other medical images for tasks such as tumor detection and
organ segmentation.

Advantages of CNNs:

● Automatic Feature Extraction: CNNs learn features directly from the data, reducing
the need for manual feature engineering.
● Parameter Efficiency: Due to weight sharing and local receptive fields, CNNs are
highly efficient in terms of the number of parameters.
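● Robustness to Variations: Pooling and weight sharing make CNNs tolerant of small
shifts in the input. Robustness to scaling, rotation, and partial occlusion is more limited
and usually depends on data augmentation during training.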
● Hierarchical Learning: CNNs can learn increasingly complex features as they go
deeper, allowing for rich and high-level representations of data.

Challenges:

● Computational Complexity: CNNs, especially deep ones, require substantial
computational resources (e.g., GPUs) and large datasets to train effectively.
● Overfitting: While CNNs are less prone to overfitting compared to fully connected
networks, they can still overfit if the model is too complex or the dataset is too small.
