Machine Learning

The document provides an overview of artificial intelligence (AI), machine learning (ML), and deep learning (DL), explaining their definitions, differences, and applications. It details types of supervised and unsupervised learning, including various algorithms such as linear regression, logistic regression, K-Nearest Neighbour, decision trees, and random forests, along with clustering and association rules in unsupervised learning. Additionally, it highlights the importance of data science in analyzing data for meaningful insights.

Uploaded by

Satyam Sangwan
Copyright
© All Rights Reserved

Machine Learning

Mastering Supervised Machine Learning:
Techniques, Algorithms, and Applications
What is Artificial Intelligence?
Artificial intelligence, or AI, is a technology that enables computers and
machines to simulate human intelligence and problem-solving capabilities.

In simple words, artificial intelligence is the science of making machines
that can think like humans.

It can do things that are considered "smart."

Examples: spam filters, online banking, media recommendations, etc.
What is Machine Learning?
Machine learning (ML) is a branch of artificial intelligence (AI) and computer
science that focuses on using data and algorithms to imitate the way that
humans learn, gradually improving in accuracy.

In simple words, machine learning is a type of technology that allows
computers to learn from experience and data rather than being explicitly
programmed.
What is Deep Learning?
Deep learning is a subset of machine learning that uses multilayered neural
networks, called deep neural networks, to simulate the complex
decision-making power of the human brain.

Its goal is to mimic the way the human brain processes information.

Applications: image recognition, which identifies objects and features in
images, such as people, animals, and places.

Examples: convolutional neural networks (CNNs), recurrent neural networks
(RNNs), and generative adversarial networks (GANs).

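AI vs ML vs DL
Deep learning is a subset of machine learning, and machine learning is in
turn a branch of artificial intelligence.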
What is Data Science?
Data science is the study of data to extract meaningful insights for business.

It is a multidisciplinary approach that combines principles and practices
from the fields of mathematics, statistics, artificial intelligence, and
computer engineering to analyze large amounts of data.

Example: aggregating a customer's email address, social media handles, and
purchase identifications in order to identify trends in their behaviour.
Types of Machine Learning
Supervised Machine Learning
In supervised learning, the algorithm is trained on a labelled dataset to
make predictions or decisions based on the data inputs.

A labelled dataset is one that has both input and output parameters.

The algorithm tries to learn the relationship between the input and output
data so that it can make accurate predictions on new, unseen data.
Types of Supervised Learning
Regression
Regression is used to predict continuous numerical values based on input
features.

It models the functional relationship between independent variables and a
dependent variable, such as predicting house prices based on features like
size, number of bedrooms, and location.

The objective is to determine the most suitable function that characterizes
the connection between these variables.
Classification
Classification is a type of supervised learning that categorises input
data into predefined labels.

It involves training a model on labelled examples to learn patterns
between input features and output classes.

The target variable is a categorical value. For example, classifying
emails as spam or not spam.
Linear Regression
Linear regression is an algorithm that computes the linear relationship
between the dependent variable and one or more independent features by
fitting a linear equation to observed data.

Simple Linear Regression: This is the simplest form of linear regression, and
it involves only one independent variable and one dependent variable. The
equation for simple linear regression is:
$y = \beta_0 + \beta_1 x$
where $y$ is the dependent variable, $x$ is the independent variable,
$\beta_0$ is the intercept, and $\beta_1$ is the slope.

Multiple Linear Regression: This involves more than one independent
variable and one dependent variable. The equation for multiple linear
regression is:
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n$

Best-Fit Line: Our primary objective while using linear regression is
to locate the best-fit line, which means that the error between the
predicted and actual values should be kept to a minimum.

The best-fit line equation provides a straight line that represents
the relationship between the dependent and independent variables.
Problem
Given a table of home areas and prices, build a machine learning model that
can predict home prices based on area in square feet (use linear regression).
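One possible way to approach this problem is sketched below. The data table from the original slide is not reproduced here, so the `area` and `price` values are hypothetical placeholders, and the function names are illustrative. The sketch assumes the best-fit line is found by ordinary least squares, which is the usual choice but is not stated on the slide.

```python
import numpy as np

# Hypothetical training data (NOT the table from the original slide):
# home area in square feet and the matching sale price.
area = np.array([1000.0, 1500.0, 2000.0, 2500.0, 3000.0])
price = np.array([200000.0, 290000.0, 410000.0, 500000.0, 600000.0])


def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) of the least-squares best-fit line."""
    x_mean, y_mean = x.mean(), y.mean()
    # Slope: covariance of x and y divided by the variance of x.
    slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # The best-fit line always passes through the point of means.
    intercept = y_mean - slope * x_mean
    return intercept, slope


def predict_price(x, intercept, slope):
    """Evaluate the fitted line at x (area in square feet)."""
    return intercept + slope * x


intercept, slope = fit_simple_linear_regression(area, price)
predicted = predict_price(3300.0, intercept, slope)
print(f"Predicted price for 3300 sq ft: {predicted:.0f}")
```

With real data, the same two functions apply unchanged; only the `area` and `price` arrays need to be replaced.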
Logistic Regression
Logistic regression is used for binary classification. It applies the
sigmoid function to a linear combination of the independent variables and
produces a probability value between 0 and 1.

It is called regression because it extends linear regression, but it is
mainly used for classification problems.
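The role of the sigmoid function can be illustrated with a short sketch. The weights, bias, sample values, and the 0.5 decision threshold below are hypothetical choices for illustration; in practice the weights are learned from labelled training data.

```python
import numpy as np


def sigmoid(z):
    """Map any real number to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-z))


def predict_probability(features, weights, bias):
    """Probability of the positive class for each row of features."""
    return sigmoid(np.dot(features, weights) + bias)


def predict_class(features, weights, bias, threshold=0.5):
    """Turn probabilities into 0/1 labels using a decision threshold."""
    return (predict_probability(features, weights, bias) >= threshold).astype(int)


# Hypothetical, already-trained parameters (assumed for illustration only).
weights = np.array([1.5, -2.0])
bias = 0.25

samples = np.array([[2.0, 0.5],
                    [0.1, 1.5]])
probabilities = predict_probability(samples, weights, bias)
labels = predict_class(samples, weights, bias)
print(probabilities, labels)
```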
K-Nearest Neighbour (K-NN)
K-Nearest Neighbour is one of the simplest machine learning algorithms
based on the supervised learning technique.

The K-NN algorithm stores all the available data and classifies a new data
point based on its similarity to the stored points. This means that when
new data appears, it can easily be assigned to a well-suited category by
using the K-NN algorithm.

K-NN is a non-parametric algorithm, which means it does not make any
assumptions about the underlying data.
How does K-NN work?
The working of K-NN can be explained with the following steps:
Select the number K of neighbours.

Calculate the Euclidean distance from the new data point to every stored
data point.

Take the K nearest neighbours according to the calculated Euclidean distance.

Among these K neighbours, count the number of data points in each category.

Assign the new data point to the category with the largest number of
neighbours.
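The steps above can be sketched in a few lines. The training points, labels, and the function name `knn_classify` are hypothetical and chosen only to make the example runnable; ties between categories are broken in favour of the category seen first among the neighbours.

```python
from collections import Counter

import numpy as np


def knn_classify(train_points, train_labels, new_point, k=3):
    """Classify new_point by majority vote among its k nearest neighbours."""
    # Euclidean distance from the new point to every stored point.
    distances = np.sqrt(np.sum((train_points - new_point) ** 2, axis=1))
    # Indices of the k smallest distances.
    nearest_indices = np.argsort(distances)[:k]
    # Count how many of the k neighbours fall into each category.
    votes = Counter(train_labels[i] for i in nearest_indices)
    # Return the category with the most neighbours.
    return votes.most_common(1)[0][0]


# Hypothetical labelled data: two groups of 2-D points.
train_points = np.array([[1.0, 1.0], [1.5, 2.0], [2.0, 1.0],
                         [6.0, 6.0], [7.0, 7.0], [6.5, 8.0]])
train_labels = ["A", "A", "A", "B", "B", "B"]

predicted = knn_classify(train_points, train_labels, np.array([2.0, 2.0]), k=3)
print(predicted)
```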
Decision Tree
A decision tree is a non-parametric supervised learning algorithm
that is used for both classification and regression tasks.

It is a flowchart-like structure used to make decisions or predictions.

Selecting the Best Attribute: Using a metric like Gini impurity, entropy, or information
gain, the best attribute to split the data is selected.

Splitting the Dataset: The dataset is split into subsets based on the selected attribute.

Repeating the Process: The process is repeated recursively for each subset, creating a
new internal node or leaf node until a stopping criterion is met.
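The first two steps (selecting the best attribute and splitting the dataset) can be sketched as follows, using Gini impurity as the metric. The tiny dataset and the function names are hypothetical, and the recursive tree-building step is left out to keep the sketch short.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((count / total) ** 2 for count in counts.values())


def weighted_gini_after_split(rows, labels, attribute):
    """Split rows by the attribute's value and return the weighted impurity."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    total = len(labels)
    return sum(len(group) / total * gini_impurity(group)
               for group in groups.values())


def best_attribute(rows, labels, attributes):
    """Pick the attribute whose split leaves the lowest weighted impurity."""
    return min(attributes,
               key=lambda name: weighted_gini_after_split(rows, labels, name))


# Hypothetical dataset: decide whether to play outside.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["play", "play", "stay", "stay"]

chosen = best_attribute(rows, labels, ["outlook", "windy"])
print(chosen)
```

A full tree would call `best_attribute` again on each subset until a stopping criterion, such as a pure subset or a maximum depth, is reached.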
Random Forest
A random forest is an ensemble learning method that combines the
predictions from multiple decision trees to produce a more accurate and
stable prediction.

It is a type of supervised learning algorithm that can be used for both
classification and regression tasks.
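How the predictions of several trees are combined can be sketched as below. This shows only the aggregation step, assuming majority voting for classification and averaging for regression, which are the usual conventions; the per-tree predictions are hypothetical values, and training the individual trees is not shown.

```python
from collections import Counter

import numpy as np


def majority_vote(tree_predictions):
    """Classification: return the label predicted by the most trees.

    Ties go to the label that appears first in the list.
    """
    return Counter(tree_predictions).most_common(1)[0][0]


def average_prediction(tree_predictions):
    """Regression: return the mean of the trees' numeric predictions."""
    return float(np.mean(tree_predictions))


# Hypothetical outputs of five already-trained trees for one input.
class_predictions = ["spam", "not spam", "spam", "spam", "not spam"]
price_predictions = [100.0, 110.0, 120.0, 105.0, 115.0]

print(majority_vote(class_predictions))
print(average_prediction(price_predictions))
```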
Thank You
Machine Learning
Mastering Unsupervised Machine Learning:
Techniques, Algorithms, and Applications
Unsupervised Machine Learning
A machine learning technique in which models are trained on data that has
no labelled outputs, so nothing supervises the model during training.

The model itself finds the hidden patterns and insights in the given data.

The goal is to find the underlying structure of the dataset and to group the
data according to similarities.

Unsupervised learning is helpful for finding useful insights in data.

We do not always have input data with corresponding outputs, and
unsupervised learning is needed to handle such cases.
Types of Unsupervised Learning Algorithm: clustering and association.
Clustering
Clustering is a method of grouping objects into clusters such that the
objects with the most similarities remain in one group and have few or no
similarities with the objects of another group.

Cluster analysis finds the commonalities between the data objects and
categorizes them according to the presence or absence of those
commonalities.
Steps for K-Means Clustering
Start with k centroids by placing them at random positions.

Compute the distance of every point from each centroid and assign the point
to the cluster of its nearest centroid.

Adjust the centroids so that each becomes the centre of gravity (mean) of
its cluster.

Re-cluster every point based on its distance from the updated centroids.

Adjust the centroids again.

Repeat re-clustering and adjusting until the data points stop changing
cluster.
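The steps above can be sketched as follows. As an assumption for this sketch, the random starting centroids are drawn from the data points themselves rather than from arbitrary positions, and an optional `initial_centroids` argument (a hypothetical convenience, not part of the slides) allows a fixed starting point. The sample points are also hypothetical.

```python
import numpy as np


def k_means(points, k, initial_centroids=None, max_iterations=100, seed=0):
    """Cluster points into k groups; return (centroids, assignments)."""
    if initial_centroids is None:
        # Start from k distinct data points chosen at random.
        rng = np.random.default_rng(seed)
        chosen = rng.choice(len(points), size=k, replace=False)
        centroids = points[chosen].astype(float)
    else:
        centroids = np.array(initial_centroids, dtype=float)

    assignments = None
    for _ in range(max_iterations):
        # Distance of every point from every centroid, shape (n_points, k).
        distances = np.linalg.norm(
            points[:, None, :] - centroids[None, :, :], axis=2)
        new_assignments = distances.argmin(axis=1)
        # Stop once no point changes cluster.
        if assignments is not None and np.array_equal(new_assignments, assignments):
            break
        assignments = new_assignments
        # Move each centroid to the mean of the points assigned to it.
        for j in range(k):
            members = points[assignments == j]
            if len(members) > 0:  # leave an empty cluster's centroid in place
                centroids[j] = members.mean(axis=0)
    return centroids, assignments


# Hypothetical data: two well-separated groups of 2-D points.
points = np.array([[1.0, 1.0], [1.2, 0.8], [0.8, 1.1],
                   [8.0, 8.0], [8.2, 7.9], [7.8, 8.1]])
centroids, assignments = k_means(
    points, k=2, initial_centroids=[[0.0, 0.0], [10.0, 10.0]])
print(centroids, assignments)
```

Because the result depends on the starting centroids, it is common to run the algorithm several times with different seeds and keep the best clustering.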
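Elbow Method
The elbow method helps choose the number of clusters k: the within-cluster
sum of squared distances is plotted against k, and the k at the bend
("elbow") of the curve is selected.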
Association
Association rule learning is an unsupervised learning method that is used
for finding relationships between variables in a large database.

It determines the sets of items that occur together in the dataset.

Association rules make marketing strategies more effective.

For example, people who buy item X (say, bread) also tend to purchase
item Y (butter or jam). A typical example of association rules is Market
Basket Analysis.
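The bread-and-butter example can be made concrete with a small sketch. Support and confidence are the standard measures for association rules, though the slides do not name them, so their use here is an assumption; the shopping baskets and function names are hypothetical.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in items."""
    items = set(items)
    matching = sum(1 for basket in transactions if items <= set(basket))
    return matching / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Of the baskets containing the antecedent, the fraction that also
    contain the consequent. Assumes the antecedent occurs at least once."""
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)


# Hypothetical shopping baskets.
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread", "milk"},
    {"milk", "eggs"},
    {"bread", "butter", "milk"},
]

# Bread and butter appear together in 3 of the 5 baskets.
bread_butter_support = support(transactions, {"bread", "butter"})
# Of the 4 baskets with bread, 3 also contain butter.
bread_to_butter_confidence = confidence(transactions, {"bread"}, {"butter"})
print(bread_butter_support, bread_to_butter_confidence)
```

A rule such as "bread implies butter" is usually kept only when both its support and its confidence exceed thresholds chosen by the analyst.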
Advantages of Unsupervised Learning
Unsupervised learning can be applied to more complex tasks than
supervised learning because it does not require labelled input data.

Unsupervised learning is often preferable because unlabelled data is
easier to obtain than labelled data.

Labelled data is costly to produce compared with unlabelled data.

Real-world unsupervised learning examples
Anomaly detection: Unsupervised clustering can process large datasets and
discover atypical data points in a dataset.

Genetic research: Genetic clustering is another common unsupervised
learning example. Hierarchical clustering algorithms are often used to
analyse DNA patterns and reveal evolutionary relationships.

Recommendation engines: Using association rules, unsupervised machine learning
can help explore transactional data to discover patterns or trends that can be used to
drive personalised recommendations for online retailers.

Customer segmentation: Unsupervised learning is also commonly used to generate
buyer persona profiles by clustering customers' common traits or purchasing
behaviours.

Fraud detection: Unsupervised learning is beneficial for anomaly detection, revealing
unusual data points in datasets. These insights can help uncover events or behaviours
that deviate from normal patterns in the data, revealing fraudulent transactions or
unusual behaviour such as bot activity.
Thank You
