UNIT 6: MACHINE LEARNING ALGORITHMS

MACHINE LEARNING

 Machine Learning (ML) is a part of artificial intelligence (AI) that focuses on teaching
computers to learn from data and make decisions without being explicitly programmed.
 ML algorithms learn from various types of data, including images, text, sensor readings,
and historical records.
 Some common ML algorithms include decision trees, neural networks, and support vector
machines.
 However, ML also presents challenges. Overfitting, where models become too specialized
on training data, can lead to poor performance on new data.
 ML applications power self-driving cars, fraud detection systems, personalized shopping experiences,
and virtual assistants like Siri and Alexa.
TYPES OF MACHINE LEARNING
 Supervised learning involves the model learning from
labeled data, where the input data is accompanied
by the correct output.
 Examples include linear regression, logistic
regression, decision trees, support vector machines,
and neural networks.
 Unsupervised learning, on the other hand, deals with
unlabelled data, where the algorithm tries to find
hidden patterns or structure.
 Examples include k-means clustering, hierarchical
clustering, principal component analysis
 Finally, reinforcement learning involves an agent
learning to make decisions by interacting with an
environment to maximize cumulative rewards.
A. SUPERVISED LEARNING ALGORITHMS 1) REGRESSION

 If a change in one variable appears to be accompanied by a change in the other variable,
the two variables are said to be correlated, and this interdependence is called correlation.

Types of Correlation:
1. Positive Correlation:
In a positive correlation, both variables move in the same
direction.
As one variable increases, the other also increases.
2. Negative Correlation:
Conversely, in a negative correlation, variables move in opposite
directions.
An increase in one variable is associated with a decrease in the
other.
3. Zero Correlation:
When there is no apparent relationship between two variables,
they are said to have zero correlation.
PEARSON'S CORRELATION COEFFICIENT

 A value of 0 indicates that there is no association between the two variables.


 A value greater than 0 indicates a positive association; that is, as the value of one
variable increases, so does the value of the other variable.
 A value less than 0 indicates a negative association; that is, as the value of one variable
increases, the value of the other variable decreases.
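 A minimal sketch (not part of the original slides; NumPy and the example numbers are assumptions) of computing Pearson's correlation coefficient in Python:

import numpy as np

# Hypothetical example data: hours studied (x) and exam score (y)
x = np.array([1, 2, 3, 4, 5])
y = np.array([52, 58, 65, 70, 81])

# Pearson's r = cov(x, y) / (std(x) * std(y)); np.corrcoef returns the 2x2 correlation matrix
r = np.corrcoef(x, y)[0, 1]
print(round(r, 3))  # close to +1, i.e. a strong positive association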
REGRESSION EQUATION

 y = a + bx + e
 y –Dependent Variable. It is the variable we want to predict
 x – Independent Variable
 a represents the intercept of the regression line
 b represents the slope of the regression line
 e represents the error or residual
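 As an illustrative sketch (not from the slides; NumPy and the sample data are assumptions), the intercept a and slope b of y = a + bx can be estimated by least squares:

import numpy as np

# Hypothetical data: years of experience (x) and salary in thousands (y)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([30.0, 35.0, 42.0, 48.0, 55.0])

# np.polyfit with degree 1 returns [slope b, intercept a] for the least-squares line
b, a = np.polyfit(x, y, 1)

# Predict the dependent variable for a new value of x
x_new = 6.0
y_pred = a + b * x_new
print(f"a = {a:.2f}, b = {b:.2f}, prediction at x = 6: {y_pred:.2f}")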
LINEAR REGRESSION IS FURTHER DIVIDED INTO TWO TYPES:

 a) Simple Linear Regression: The dependent variable's value is predicted using a single
independent variable in simple linear regression.
 b) Multiple Linear Regression: In multiple linear regression, more than one
independent variable is used to predict the value of the dependent variable.
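 A hedged sketch of multiple linear regression (scikit-learn and the toy data are assumptions, not part of the slides), using two independent variables to predict one dependent variable:

from sklearn.linear_model import LinearRegression

# Hypothetical features: [advertising spend, unit price]; target: sales
X = [[10, 5.0], [15, 4.5], [20, 4.0], [25, 3.5], [30, 3.0]]
y = [100, 130, 165, 195, 230]

model = LinearRegression()
model.fit(X, y)

print(model.intercept_, model.coef_)  # intercept a and the two slope coefficients
print(model.predict([[22, 3.8]]))     # predicted sales for a new observation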

 Applications of Linear Regression:


 Market Analysis: Linear regression helps quantify how factors such as pricing,
sales quantity, and advertising spend influence overall sales or revenue.
 Sales Forecasting: It predicts future sales by analyzing past sales data.
 Predicting Salary Based on Experience: It estimates an employee's salary from their years of experience.
ALGORITHM 2 CLASSIFICATION

 Classification involves categorizing data into predefined classes or categories.


 How Classification Works
 Classes or Categories: Data is divided into different classes or categories, each
representing a specific outcome or group. For example, in a binary classification scenario,
there are two classes: positive and negative
 Features or Attributes: Each data instance is described by its features or attributes
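 A minimal sketch of binary classification (scikit-learn and the toy spam data are assumptions; the slides do not prescribe a library):

from sklearn.linear_model import LogisticRegression

# Hypothetical features for each email: [word count, contains a link (1/0)]
X = [[120, 1], [300, 0], [15, 1], [250, 0], [40, 1], [500, 0]]
# Class labels: 1 = spam, 0 = not spam
y = [1, 0, 1, 0, 1, 0]

clf = LogisticRegression()
clf.fit(X, y)

# Predict the class of a new email from its features
print(clf.predict([[35, 1]]))  # expected [1], i.e. spam, for this toy data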
TYPES OF CLASSIFICATION

 Binary Classification: classification tasks with two class labels.
Eg. Email spam detection (spam or not), medical test (cancer detected or not)
 Multi-Class Classification: classification tasks with more than two class labels.
Eg. Face classification, plant species classification
 Multi-Label Classification: classification tasks where each example may belong to multiple class labels.
Eg. Photo classification - objects present in the photo (bicycle, apple, person, etc.)
 Imbalanced Classification: classification tasks with unequally distributed class labels.
Eg. Fraud detection, medical diagnostic tests
K- NEAREST NEIGHBOUR ALGORITHM (KNN)

 It provides a simple yet effective method for identifying the category or class of a new data
point based on its similarity to existing data points.
STEPS INVOLVED IN K-NN

 Select the number K of neighbors.


 Calculate the distance (typically Euclidean) between the new data point and the existing data points.
 Take the K nearest neighbors as per the calculated distances.
 Among these K neighbors, count the number of data points in each category.
 Assign the new data point to the category for which the neighbor count is maximum.
 Our model is ready (see the sketch below).
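 A minimal sketch of these steps (scikit-learn and the sample points are assumptions; its KNeighborsClassifier performs the distance calculation and majority vote internally):

from sklearn.neighbors import KNeighborsClassifier

# Hypothetical 2-D data points with two categories (0 and 1)
X = [[1, 2], [2, 3], [3, 3], [6, 7], [7, 8], [8, 8]]
y = [0, 0, 0, 1, 1, 1]

# Select K = 3; Euclidean distance is the default metric
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X, y)

# The classifier finds the 3 nearest neighbours of the new point,
# counts the categories among them, and assigns the majority category
print(knn.predict([[2, 2]]))  # expected [0]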
Applications of KNN:
• Image recognition and classification
• Recommendation systems
• Healthcare diagnostics
• Text mining and sentiment analysis
• Anomaly detection

Advantages of KNN:
• Easy to implement and understand.
• Fairly robust to noisy training data when K is chosen large enough.

Limitations of KNN:
• Computationally expensive, especially for large datasets
B. UNSUPERVISED LEARNING 1) CLUSTERING

 Clustering, or cluster analysis, is a machine learning technique used to group an unlabeled
dataset into clusters or groups based on similarity.
 It does this by finding similar patterns in the unlabeled dataset, such as shape, size,
color, or behavior.
 Eg. Imagine you are visiting a shopping center where items are grouped together based on
their similarities
 In a similar way, clustering algorithms group similar data points together based on
common characteristics or features.
HOW CLUSTERING WORKS:

 1) Prepare the Data: Select the right features for clustering


 2) Create Similarity Metrics: Define how similar data points are by comparing their
features.
 3) Run the Clustering Algorithm: Apply a clustering algorithm to group the data.
 4) Interpret the Results: Analyze the clusters to understand what they represent.
TYPES OF CLUSTERING METHODS

 1. Partitioning Clustering: The cluster centers are chosen so that the distance between the data points
and the centroid of their own cluster is minimal compared to their distance to other cluster centroids.
 Eg. K-Means Clustering algorithm
2. DENSITY-BASED CLUSTERING

 The density-based clustering method connects the highly-dense areas into clusters.
 The dense areas in data space are separated from each other by sparser areas.
3. DISTRIBUTION MODEL-BASED CLUSTERING

 In the distribution model-based clustering method, the data is divided based on the
probability of how a dataset belongs to a particular distribution.
4. HIERARCHICAL CLUSTERING

 In this technique, the dataset is divided into clusters to create a tree-like structure, which is
also called a dendrogram
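 A brief sketch of hierarchical clustering (SciPy and the toy points are assumptions, not part of the slides); the linkage matrix encodes the dendrogram:

from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical 2-D points forming two visible groups
X = [[1, 1], [1, 2], [2, 1], [8, 8], [8, 9], [9, 8]]

# Build the hierarchy bottom-up using Ward linkage
Z = linkage(X, method='ward')

# Cut the dendrogram into 2 flat clusters
labels = fcluster(Z, t=2, criterion='maxclust')
print(labels)  # e.g. [1 1 1 2 2 2]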
K- MEANS CLUSTERING

 It classifies the dataset by dividing the samples into different clusters of equal variances.

Steps involved in K-Means Clustering:


 Select the number K to decide the number of clusters.
 Select K random points as the initial centroids (they need not be points from the input dataset).
 Assign each data point to its closest centroid, forming the K clusters.
 Calculate the variance and place a new centroid for each cluster.
 Repeat the third step, i.e. reassign each data point to the new closest centroid.
 If any reassignment occurred, go back to step 4; otherwise, finish.
 The model is ready (see the sketch below).
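 A minimal sketch of these steps (scikit-learn and the toy points are assumptions; KMeans handles centroid initialisation and reassignment internally):

from sklearn.cluster import KMeans

# Hypothetical 2-D points forming two visible groups
X = [[1, 1], [1, 2], [2, 1], [8, 8], [8, 9], [9, 8]]

# K = 2 clusters; n_init random centroid initialisations are tried and the best is kept
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
kmeans.fit(X)

print(kmeans.labels_)           # cluster assignment of each point
print(kmeans.cluster_centers_)  # final centroid of each cluster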
APPLICATIONS OF K-MEANS CLUSTERING:


 Market Segmentation: group customers based on similar purchasing behaviours or
demographics for tailored marketing strategies.
 Image Segmentation: partition images into regions of similar colours to aid in tasks like object
detection and compression.
 Document Clustering: categorize documents based on content similarity

Advantages of K-Means Clustering:


 Easy to implement, making it suitable for users of all levels.

Limitations of K-Means Clustering:


 Results can vary based on initial centroid placement.
 Number of clusters must be known beforehand.
SUMMARY OF THE UNIT
 Machine Learning Algorithms covered in this unit:
 Supervised learning: Regression (y = a + bx + e) and Classification
 Unsupervised learning: Clustering