UNIT III
Introduction to machine learning – Linear Regression Models: Least squares, single & multiple
variables, Bayesian linear regression, gradient descent, Linear Classification Models: Discriminant
function – Probabilistic discriminative model - Logistic regression, Probabilistic generative model –
Naive Bayes, Maximum margin classifier – Support vector machine, Decision Tree, Random forests
Can a machine also learn from experiences or past data like a human does? So here comes the role
of Machine Learning.
A subset of artificial intelligence known as machine learning focuses primarily on the creation of
algorithms that enable a computer to independently learn from data and previous experiences.
Arthur Samuel first used the term "machine learning" in 1959. It could be summarized as follows:
Without being explicitly programmed, machine learning enables a machine to automatically learn from
data, improve performance from experiences, and predict things.
For the purpose of developing predictive models, machine learning brings together statistics and
computer science.
Machine learning constructs or uses algorithms that learn from historical data. The performance improves in proportion to the amount of data we provide.
A machine can learn if it can gain more data to improve its performance.
A machine learning system builds prediction models, learns from previous data, and predicts the
output of new data whenever it receives it.
The amount of data available helps to build a better model that accurately predicts the output.
Let's say we have a complex problem in which we need to make predictions. Instead of writing code,
we just need to feed the data to generic algorithms, which build the logic based on the data and predict
the output.
Machine learning has changed our way of thinking about such problems. Features of Machine Learning:
o It is a data-driven technology.
o Machine learning is similar to data mining, as it also deals with huge amounts of data.
1) Supervised Learning
The system uses labelled data to build a model that understands the datasets and learns about each one.
After the training and processing are done, we test the model with sample data to see if it can
accurately predict the output.
The mapping of the input data to the output data is the objective of supervised learning. Spam filtering
is an example of supervised learning.
o Classification
o Regression
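As a small, concrete illustration of these two supervised tasks, the following Python sketch (assuming scikit-learn and NumPy are installed; the tiny arrays are made-up toy data) trains one classifier and one regressor on labelled examples:

import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

# Labelled training data: each row of X is an input, y holds the known outputs.
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]])

# Classification: predict a category (0 = "not spam", 1 = "spam").
y_class = np.array([0, 0, 0, 1, 1, 1])
clf = LogisticRegression().fit(X, y_class)
print(clf.predict([[2.5], [5.5]]))        # expected: [0 1]

# Regression: predict a continuous value (e.g. a salary or price).
y_reg = np.array([1.9, 4.1, 6.0, 8.1, 9.9, 12.2])
reg = LinearRegression().fit(X, y_reg)
print(reg.predict([[7.0]]))               # roughly 14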
2) Unsupervised Learning
Unsupervised learning is a learning method in which a machine learns without any supervision.
The training is provided to the machine with the set of data that has not been labelled, classified, or
categorized, and the algorithm needs to act on that data without any supervision.
The goal of unsupervised learning is to restructure the input data into new features or a group of
objects with similar patterns.
In unsupervised learning, we don't have a predetermined result. The machine tries to find useful
insights from the huge amount of data. It can be further classified into two categories of algorithms:
o Clustering
o Association
3) Reinforcement Learning
Reinforcement learning is a feedback-based learning method, in which a learning agent gets a reward
for each right action and gets a penalty for each wrong action.
The agent learns automatically with these feedbacks and improves its performance.
In reinforcement learning, the agent interacts with the environment and explores it. The goal of an
agent is to get the most reward points, and hence, it improves its performance.
A robotic dog that automatically learns the movement of its limbs is an example of reinforcement learning.
A few decades ago (about 40-50 years), machine learning was science fiction, but today it is part of our daily life.
Machine learning is making our day to day life easy from self-driving cars to Amazon virtual
assistant "Alexa".
However, the idea behind machine learning is quite old and has a long history.
Machine Learning at present:
The field of machine learning has made significant strides in recent years, and its applications are numerous, including self-driving cars, Amazon Alexa, chatbots, and recommender systems.
It incorporates clustering, classification, decision tree, SVM algorithms, and reinforcement learning, as
well as unsupervised and supervised learning.
Present-day AI models can be used for making various predictions, including weather prediction, disease prediction, stock market analysis, and so on.
Machine learning is a buzzword for today's technology, and it is growing very rapidly day by day. We
are using machine learning in our daily life even without knowing it such as Google Maps, Google
assistant, Alexa, etc. Below are some most trending real-world applications of Machine Learning:
1. Image Recognition:
It is used to identify objects, persons, places, digital images, etc. The popular use case of image
recognition and face detection is, Automatic friend tagging suggestion:
Facebook provides us a feature of auto friend tagging suggestion. Whenever we upload a photo with
our Facebook friends, then we automatically get a tagging suggestion with name, and the technology
behind this is machine learning's face detection and recognition algorithm.
It is based on the Facebook project named "Deep Face," which is responsible for face recognition and
person identification in the picture.
2. Speech Recognition
While using Google, we get an option of "Search by voice"; this comes under speech recognition, which is a popular application of machine learning.
Speech recognition is a process of converting voice instructions into text, and it is also known as
"Speech to text", or "Computer speech recognition."
At present, machine learning algorithms are widely used by various applications of speech
recognition. Google assistant, Siri, Cortana, and Alexa are using speech recognition technology to
follow the voice instructions.
3. Traffic prediction:
If we want to visit a new place, we take help of Google Maps, which shows us the correct path with the
shortest route and predicts the traffic conditions.
It predicts traffic conditions, such as whether traffic is clear, slow-moving, or heavily congested, mainly using:
o Real-time location of the vehicle from the Google Maps app and sensors
Everyone who uses Google Maps is helping to make the app better. It takes information from the user and sends it back to its database to improve performance.
4. Product recommendations:
Machine learning is widely used by various e-commerce and entertainment companies such
as Amazon, Netflix, etc., for product recommendation to the user.
Whenever we search for a product on Amazon, we start getting advertisements for the same product while surfing the internet on the same browser, and this is because of machine learning.
Google understands the user interest using various machine learning algorithms and suggests the
product as per customer interest.
Similarly, when we use Netflix, we get recommendations for series, movies, etc., and this is also done with the help of machine learning.
5. Self-driving cars:
One of the most exciting applications of machine learning is self-driving cars. Machine learning plays a significant role in self-driving cars. Tesla, a popular car manufacturer, is working on self-driving cars. It uses machine learning methods to train its car models to detect people and objects while driving.
6. Email Spam and Malware Filtering:
Whenever we receive a new email, it is filtered automatically as important, normal, and spam. We
always receive an important mail in our inbox with the important symbol and spam emails in our spam
box, and the technology behind this is Machine learning. Below are some spam filters used by Gmail:
o Content Filter
o Header filter
o Rules-based filters
o Permission filters
Some machine learning algorithms such as Multi-Layer Perceptron, Decision tree, and Naïve Bayes
classifier are used for email spam filtering and malware detection.
We have various virtual personal assistants such as Google assistant, Alexa, Cortana, Siri. As the
name suggests, they help us in finding the information using our voice instruction. These assistants can
help us in various ways just by our voice instructions such as Play music, call someone, Open an
email, Scheduling an appointment, etc.
These assistants record our voice instructions, send them to a server on the cloud, decode them using ML algorithms, and act accordingly.
Machine learning is making our online transactions safe and secure by detecting fraudulent transactions.
Whenever we perform an online transaction, there are various ways a fraudulent transaction can take place, such as fake accounts, fake IDs, and money stolen in the middle of a transaction. To detect this, a feed-forward neural network helps us by checking whether it is a genuine or a fraudulent transaction.
For each genuine transaction, the output is converted into hash values, and these values become the input for the next round. Genuine transactions follow a specific pattern, which changes for a fraudulent transaction; the model detects this change and makes our online transactions more secure.
Machine learning is widely used in stock market trading. In the stock market, there is always a risk of ups and downs in share prices, so machine learning's long short-term memory (LSTM) neural network is used for the prediction of stock market trends.
In medical science, machine learning is used for disease diagnosis. With this, medical technology is growing very fast and is able to build 3D models that can predict the exact position of lesions in the brain.
Nowadays, if we visit a new place and do not know the language, it is not a problem at all; machine learning helps us by converting the text into a language we know.
Google's GNMT (Google Neural Machine Translation) provides this feature: a neural machine translation model that translates text into our familiar language, known as automatic translation.
The technology behind automatic translation is a sequence-to-sequence learning algorithm, which is also used with image recognition to translate text from one language to another.
Machine learning has given the computer systems the abilities to automatically learn without being
explicitly programmed.
But how does a machine learning system work? It can be described using the machine learning life cycle.
Machine learning life cycle is a cyclic process to build an efficient machine learning project. The main
purpose of the life cycle is to find a solution to the problem or project.
Machine learning life cycle involves seven major steps, which are given below:
o Gathering Data
o Data preparation
o Data Wrangling
o Analyse Data
o Train Model
o Test Model
o Deployment
o The most important thing in the complete process is to understand the problem and to know the purpose of the problem. Therefore, before starting the life cycle, we need to understand the problem, because a good result depends on a good understanding of the problem.
o In the complete life cycle process, to solve a problem, we create a machine learning system
called "model", and this model is created by providing "training". But to train a model, we need
data, hence, life cycle starts by collecting data.
1. Gathering Data:
Data Gathering is the first step of the machine learning life cycle. The goal of this step is to identify the different data sources and obtain the data related to the problem.
In this step, we need to identify the different data sources, as data can be collected from various
sources such as files, database, internet, or mobile devices. It is one of the most important steps of
the life cycle.
The quantity and quality of the collected data will determine the efficiency of the output. The more data we have, the more accurate the prediction will be.
o Collect data
By performing this task, we get a coherent set of data, also called a dataset, which will be used in further steps.
2. Data preparation
After collecting the data, we need to prepare it for further steps. Data preparation is a step where we
put our data into a suitable place and prepare it to use in our machine learning training.
In this step, first, we put all data together, and then randomize the ordering of data.
o Data Exploration:
It is used to understand the nature of data that we have to work with.
o We need to understand the characteristics, format, and quality of data.
A better understanding of data leads to an effective outcome. In this, we find Correlations,
general trends, and outliers.
o Data Pre-processing:
Now the next step is pre-processing of data for its analysis.
3. Data Wrangling
Data wrangling is the process of cleaning and converting raw data into a useable format.
It is the process of cleaning the data, selecting the variable to use, and transforming the data in a proper
format to make it more suitable for analysis in the next step.
It is one of the most important steps of the complete process. Cleaning of data is required to address
the quality issues.
The data we have collected may not always be useful, as some of it may be irrelevant. In real-world applications, collected data may have various issues, including:
o Missing Values
o Duplicate data
o Invalid data
o Noise
It is mandatory to detect and remove the above issues because they can negatively affect the quality of the outcome.
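As a rough sketch of what this cleaning can look like in practice (assuming pandas and NumPy; the column names and values are made up for illustration):

import numpy as np
import pandas as pd

# Made-up raw data containing a missing value and a duplicate row.
df = pd.DataFrame({
    "Country": ["India", "Germany", "France", "Germany", "Germany"],
    "Age": [38, 30, 48, 40, 40],
    "Salary": [48000, 54000, 65000, np.nan, np.nan],
    "Purchased": ["No", "No", "No", "Yes", "Yes"],
})

df = df.drop_duplicates()                                  # remove duplicate rows
df["Salary"] = df["Salary"].fillna(df["Salary"].mean())    # fill missing values
print(df)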
4. Data Analysis
Now the cleaned and prepared data is passed on to the analysis step. This step involves:
o Building models
The aim of this step is to build a machine learning model to analyze the data using various analytical
techniques and review the outcome.
It starts with the determination of the type of the problems, where we select the machine learning
techniques such as Classification, Regression, Cluster analysis, Association, etc. then build the
model using prepared data, and evaluate the model.
Hence, in this step, we take the data and use machine learning algorithms to build the model.
5. Train Model
We use datasets to train the model using various machine learning algorithms. Training a model is required so that it can understand the various patterns, rules, and features.
6. Test Model
Once the model has been trained on a given dataset, we test it. In this step, we check the accuracy of our model by providing a test dataset to it.
Testing the model determines the percentage accuracy of the model as per the requirements of the project or problem.
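A minimal sketch of this train/test cycle, assuming scikit-learn and using its bundled Iris dataset purely as an example (any labelled dataset would do):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)

# Split the data: most of it trains the model, the rest tests it.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

model = DecisionTreeClassifier().fit(X_train, y_train)     # train the model
predictions = model.predict(X_test)                         # test the model
print("Accuracy:", accuracy_score(y_test, predictions))     # percentage accuracy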
7. Deployment
If the above-prepared model is producing an accurate result as per our requirement with acceptable
speed, then we deploy the model in the real system.
But before deploying the project, we will check whether it is improving its performance using
available data or not. The deployment phase is similar to making the final report for a project.
Artificial intelligence and machine learning are the part of computer science that are correlated with
each other.
Although these are two related technologies and people sometimes use them as synonyms for each other, they are still different terms in various respects.
AI is a bigger concept to create intelligent machines that can simulate human thinking capability and
behavior, whereas, machine learning is an application or subset of AI that allows machines to learn
from data without being programmed explicitly.
Artificial Intelligence
Artificial intelligence is a technology using which we can create intelligent systems that can simulate
human intelligence.
An artificial intelligence system does not need to be pre-programmed; instead, it uses algorithms that can work with their own intelligence.
It involves machine learning algorithms such as Reinforcement learning algorithm and deep learning
neural networks.
AI is being used in multiple places such as Siri, Google’s AlphaGo, AI in Chess playing, etc.
o Weak AI
o General AI
o Strong AI
Currently, we are working with weak AI and general AI. The future of AI is strong AI, which is said to be more intelligent than humans.
Machine learning
Machine learning is about extracting knowledge from the data. It can be defined as,
Machine learning is a subfield of artificial intelligence, which enables machines to learn from past
data or experiences without being explicitly programmed.
Key differences between Artificial Intelligence (AI) and Machine learning (ML): AI is the broader goal of creating intelligent machines that simulate human thinking and behaviour, whereas ML is the subset of AI in which machines learn patterns from data without being explicitly programmed.
What is a dataset?
A dataset is a collection of data in which the data is arranged in some order. A dataset can contain anything from a series of arrays to a database table. The below table shows an example of a dataset:
Country   Age   Salary   Purchased
India     38    48000    No
Germany   30    54000    No
France    48    65000    No
Germany   40             Yes
A tabular dataset can be understood as a database table or matrix, where each column corresponds to a particular variable, and each row corresponds to a record or example. The most supported file type for a tabular dataset is the "Comma Separated File," or CSV. But to store tree-like data, we can use a JSON file more efficiently.
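For instance, a tabular CSV dataset could be loaded with pandas as in the short sketch below (the file name data.csv is hypothetical):

import pandas as pd

# Each CSV column becomes a variable, and each row becomes one example.
df = pd.read_csv("data.csv")      # hypothetical file name
print(df.head())                  # first few rows
print(df.shape)                   # (number of rows, number of columns)

# Tree-like data would instead be read from JSON, e.g. pd.read_json("data.json").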
o Ordinal data: These data are similar to categorical data but can be measured on the basis of comparison (they have an order).
Types of datasets
Machine learning spans different domains, each requiring specific types of datasets. A few common types of datasets used in machine learning include:
Image Datasets:
Image datasets contain a collection of images and are commonly used in computer vision tasks such as image classification, object detection, and image segmentation.
Examples :
o ImageNet
o CIFAR-10
o MNIST
Text Datasets:
Text datasets consist of textual information, such as articles, books, or social media posts. These datasets are used in NLP techniques like sentiment analysis, text classification, and machine translation.
Time Series Datasets:
Time series datasets include data points collected over time. They are commonly used in forecasting, anomaly detection, and trend analysis.
Examples :
o Climate data
o Sensor readings.
Tabular Datasets:
Tabular datasets are structured data organized in tables or spreadsheets. They contain rows representing instances or samples and columns representing features or attributes. Tabular datasets are used for tasks like regression and classification. The dataset given earlier in this section is an example of a tabular dataset.
Regression analysis is a statistical method to model the relationship between a dependent (target) variable and one or more independent (predictor) variables.
Regression analysis helps us to understand how the value of the dependent variable is changing
corresponding to an independent variable when other independent variables are held fixed. It predicts
continuous/real values such as temperature, age, salary, price, etc.
We can understand the concept of regression analysis using the below example:
Example: Suppose there is a marketing company A that runs various advertisements every year and gets sales in return. The company has records of the advertising spend over the last 5 years and the corresponding sales.
Now, the company wants to spend $200 on advertising next year and wants to predict the corresponding sales. To solve such prediction problems in machine learning, we need regression analysis.
It is mainly used for prediction, forecasting, time series modeling, and determining the cause-and-effect relationship between variables.
"Regression shows a line or curve that passes through all the datapoints on target-predictor graph
in such a way that the vertical distance between the datapoints and the regression line is
minimum." The distance between datapoints and line tells whether a model has captured a strong
relationship or not.
Linear regression is one of the easiest and most popular Machine Learning algorithms. It is a statistical
method that is used for predictive analysis. Linear regression makes predictions for continuous/real or
numeric variables such as sales, salary, age, product price, etc.
The linear regression algorithm shows a linear relationship between a dependent (y) variable and one or more independent (x) variables, hence it is called linear regression. Since linear regression shows a linear relationship, it finds how the value of the dependent variable changes according to the value of the independent variable.
The linear regression model provides a sloped straight line representing the relationship between the
variables. Consider the below image:
Mathematically, we can represent a linear regression as:
y = a0 + a1x + ε
Here, y is the dependent (target) variable, x is the independent (predictor) variable, a0 is the intercept of the line, a1 is the linear regression coefficient (slope), and ε is a random error term.
The values of the x and y variables are the training dataset used for the linear regression model representation.
Linear regression can be further divided into two types of the algorithm:
o Simple Linear Regression:
If a single independent variable is used to predict the value of a numerical dependent variable,
then such a Linear Regression algorithm is called Simple Linear Regression.
o Multiple Linear regression:
If more than one independent variable is used to predict the value of a numerical dependent
variable, then such a Linear Regression algorithm is called Multiple Linear Regression.
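The sketch below (assuming scikit-learn and small made-up arrays) fits both variants: a simple model with one independent variable and a multiple model with two:

import numpy as np
from sklearn.linear_model import LinearRegression

y = np.array([3.1, 4.9, 7.2, 9.0, 10.8])

# Simple linear regression: one independent variable.
X_simple = np.array([[1], [2], [3], [4], [5]])
simple = LinearRegression().fit(X_simple, y)
print("a1 =", simple.coef_[0], "a0 =", simple.intercept_)

# Multiple linear regression: two independent variables.
X_multi = np.array([[1, 10], [2, 8], [3, 13], [4, 9], [5, 15]])
multi = LinearRegression().fit(X_multi, y)
print("coefficients:", multi.coef_, "intercept:", multi.intercept_)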
A linear line showing the relationship between the dependent and independent variables is called
a regression line. A regression line can show two types of relationship:
When working with linear regression, our main goal is to find the best fit line that means the error
between predicted values and actual values should be minimized. The best fit line will have the least
error.
The different values for the weights or the line coefficients (a0, a1) give different regression lines, so we need to calculate the best values for a0 and a1 to find the best fit line; to calculate this we use a cost function.
Cost function-
o The different values for the weights or line coefficients (a0, a1) give different regression lines, and the cost function is used to estimate the values of the coefficients for the best fit line.
o Cost function optimizes the regression coefficients or weights. It measures how a linear
regression model is performing.
o We can use the cost function to find the accuracy of the mapping function, which maps the
input variable to the output variable. This mapping function is also known as Hypothesis
function.
For Linear Regression, we use the Mean Squared Error (MSE) cost function, which is the average of the squared errors between the predicted values and the actual values. For N observations it can be written as:
MSE = (1/N) Σ (yi − (a0 + a1xi))²
where yi is the actual value of the i-th observation and a0 + a1xi is the corresponding predicted value.
Residuals: The distance between an actual value and the corresponding predicted value is called a residual. If the observed points are far from the regression line, the residuals will be high, and so the cost function will be high. If the scatter points are close to the regression line, the residuals will be small, and hence the cost function will be small.
Gradient Descent:
o Gradient descent is used to minimize the MSE by calculating the gradient of the cost function.
o A regression model uses gradient descent to update the coefficients of the line by reducing the
cost function.
o It starts with randomly selected coefficient values and then iteratively updates them to reach the minimum of the cost function.
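A minimal NumPy sketch of this procedure on made-up data (the learning rate and number of iterations are arbitrary choices), updating a0 and a1 by following the gradient of the MSE cost:

import numpy as np

# Toy data that roughly follows y = 2x + 1.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.1, 4.9, 7.2, 9.0, 10.8])

a0, a1 = 0.0, 0.0      # initial coefficient values
lr = 0.01              # learning rate (step size)

for _ in range(5000):
    y_pred = a0 + a1 * x
    error = y_pred - y
    # Gradients of the MSE cost with respect to a0 and a1.
    grad_a0 = 2 * error.mean()
    grad_a1 = 2 * (error * x).mean()
    a0 -= lr * grad_a0
    a1 -= lr * grad_a1

print("a0 =", a0, "a1 =", a1)   # approaches the least-squares solution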
Supervised Machine Learning algorithm can be broadly classified into Regression and Classification
Algorithms.
Regression algorithms predict outputs for continuous values, but to predict categorical values, we need classification algorithms.
The Classification algorithm is used to identify the category of new observations on the basis of
training data.
A program learns from the given dataset or observations and then classifies new observations into a number of classes or groups, such as Yes or No, 0 or 1, Spam or Not Spam, cat or dog, etc. Classes can also be called targets, labels, or categories.
The output variable of Classification is a category, not a value, such as "Green or Blue", "fruit or
animal", etc. It takes labeled input data, which means it contains input with the corresponding output.
The main goal of the Classification algorithm is to identify the category of a given dataset, and these
algorithms are mainly used to predict the output for the categorical data.
In the below diagram, there are two classes, class A and Class B. These classes have features that are
similar to each other and dissimilar to other classes.
The algorithm which implements the classification on a dataset is known as a classifier. There are two
types of Classifications:
o Binary Classifier: If the classification problem has only two possible outcomes, then it is
called as Binary Classifier.
Examples: YES or NO, MALE or FEMALE, SPAM or NOT SPAM, CAT or DOG, etc.
o Multi-class Classifier: If a classification problem has more than two outcomes, then it is called
as Multi-class Classifier.
Example: Classifications of types of crops, Classification of types of music.
1. Lazy Learners: A lazy learner first stores the training dataset and waits until it receives the test dataset. In this case, classification is done on the basis of the most closely related data stored in the training dataset. It takes less time in training but more time for predictions.
Example: K-NN algorithm, Case-based reasoning
2. Eager Learners: Eager learners develop a classification model from the training dataset before receiving a test dataset. Opposite to lazy learners, an eager learner takes more time in learning and less time in prediction. Example: Decision Trees, Naïve Bayes, ANN.
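The contrast can be sketched with scikit-learn on made-up toy data: K-NN (lazy) simply stores the training set and does its work at prediction time, while a decision tree (eager) builds its model up front:

import numpy as np
from sklearn.neighbors import KNeighborsClassifier   # lazy learner
from sklearn.tree import DecisionTreeClassifier      # eager learner

X = np.array([[1], [2], [3], [6], [7], [8]])
y = np.array([0, 0, 0, 1, 1, 1])

knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)  # "training" mostly stores the data
tree = DecisionTreeClassifier().fit(X, y)            # builds the tree before any test data arrives

print(knn.predict([[2.5], [6.5]]))    # classified from the nearest stored examples
print(tree.predict([[2.5], [6.5]]))   # classified from the pre-built tree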
Classification algorithms can be further divided mainly into two categories:
o Linear Models
  o Logistic Regression
o Non-linear Models
  o K-Nearest Neighbours
  o Kernel SVM
  o Naïve Bayes
Logistic Regression:
o The sigmoid function is a mathematical function used to map the predicted values to probabilities.
o It maps any real value into another value within the range of 0 and 1.
o The value of the logistic regression output must be between 0 and 1 and cannot go beyond this limit, so it forms a curve like the "S" form. The S-form curve is called the sigmoid function or the logistic function.
o We use the concept of a threshold value, which defines the probability of either 0 or 1: values above the threshold tend to 1, and values below the threshold tend to 0.
The mathematical steps to get the Logistic Regression equation are given below:
o We start from the equation of a straight line: y = b0 + b1x1 + b2x2 + ... + bnxn
o In Logistic Regression y can be between 0 and 1 only, so we divide the above equation by (1 − y): y / (1 − y), which is 0 for y = 0 and infinity for y = 1.
o But we need a range between −infinity and +infinity, so taking the logarithm of the equation, it becomes: log[y / (1 − y)] = b0 + b1x1 + b2x2 + ... + bnxn
o Binomial: There can be only two possible types of the dependent variables, such as 0 or 1, Pass
or Fail, etc.
o Multinomial: There can be 3 or more possible unordered types of the dependent variable, such
as "cat", "dogs", or "sheep"
o Ordinal: There can be 3 or more possible ordered types of dependent variables, such as "low",
"Medium", or "High".
Linear Discriminant Analysis (LDA) is one of the commonly used dimensionality reduction
techniques in machine learning to solve more than two-class classification problems. It is also
known as Normal Discriminant Analysis (NDA) or Discriminant Function Analysis (DFA).
This can be used to project the features of higher dimensional space into lower-dimensional space in
order to reduce resources and dimensional costs.
Although the basic logistic regression algorithm is limited to two-class problems, Linear Discriminant Analysis is applicable to classification problems with more than two classes.
Linear Discriminant analysis is one of the most popular dimensionality reduction techniques used
for supervised classification problems in machine learning. It is also considered a pre-processing step
for modeling differences in ML and applications of pattern classification.
Whenever there is a requirement to separate two or more classes having multiple features efficiently,
the Linear Discriminant Analysis model is considered the most common technique to solve such
classification problems. For example, suppose we have two classes with multiple features and need to separate them efficiently. If we classify them using a single feature, the classes may show overlapping.
To overcome this overlapping issue in the classification process, we keep increasing the number of features.
Example:
Let's assume we have to classify two different classes having two sets of data points in a 2-dimensional
plane as shown below image:
However, it may be impossible to draw a straight line in a 2-D plane that separates these data points efficiently, but using Linear Discriminant Analysis we can reduce the 2-D plane to a 1-D plane. Using this technique, we can also maximize the separability between multiple classes.
Let's consider an example where we have two classes in a 2-D plane having an X-Y axis, and we need
to classify them efficiently. As we have already seen in the above example that LDA enables us to
draw a straight line that can completely separate the two classes of the data points. Here, LDA uses an
X-Y axis to create a new axis by separating them using a straight line and projecting data onto a new
axis.
Hence, we can maximize the separation between these classes and reduce the 2-D plane into 1-D.
To create a new axis, Linear Discriminant Analysis uses the following criteria:
o Maximize the distance between the means of the two classes.
o Minimize the variation (scatter) within each class.
Using these two conditions, LDA generates a new axis in such a way that it maximizes the distance between the means of the two classes and minimizes the variation within each class.
In other words, we can say that the new axis will increase the separation between the data points of the
two classes and plot them onto the new axis.
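A minimal sketch with scikit-learn's LinearDiscriminantAnalysis on made-up 2-D points for two classes, projecting the 2-D data onto the single discriminant axis described above:

import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Two classes of 2-D points (toy data).
X = np.array([[1, 2], [2, 3], [3, 3], [6, 5], [7, 8], [8, 8]])
y = np.array([0, 0, 0, 1, 1, 1])

# Reduce the 2-D plane to the 1-D axis that best separates the classes.
lda = LinearDiscriminantAnalysis(n_components=1)
X_1d = lda.fit_transform(X, y)

print(X_1d.ravel())            # each point's position on the new axis
print(lda.predict([[4, 4]]))   # LDA can also classify new points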
Why LDA?
o Logistic Regression is one of the most popular classification algorithms that perform well for
binary classification but falls short in the case of multiple classification problems with
well-separated classes. At the same time, LDA handles these quite efficiently.
o LDA can also be used in data pre-processing to reduce the number of features, just as PCA,
which reduces the computing cost significantly.
o LDA is also used in face detection algorithms. In Fisherfaces, LDA is used to extract useful
data from different faces. Coupled with eigenfaces, it produces effective results.
LDA fails in cases where the means of the class distributions are shared (equal). In such cases, LDA cannot create a new axis that makes the classes linearly separable.
o Face Recognition
Face recognition is the popular application of computer vision, where each face is represented
as the combination of a number of pixel values. In this case, LDA is used to minimize the
number of features to a manageable number before going through the classification process. It
generates a new template in which each dimension consists of a linear combination of pixel
values. If a linear combination is generated using Fisher's linear discriminant, then it is called
Fisher's face.
o Medical
In the medical field, LDA has a great application in classifying the patient disease on the basis
of various parameters of patient health and the medical treatment which is going on. On such
parameters, it classifies disease as mild, moderate, or severe. This classification helps the
doctors in either increasing or decreasing the pace of the treatment.
o Customer Identification
In customer identification, LDA is currently being applied. With the help of LDA, we can easily identify and select the features that characterize a group of customers who are likely to purchase a specific product in a shopping mall.
o For Predictions
LDA can also be used for making predictions and thus in decision making. For example, "Will you buy this product?" will give a predicted result of one of two possible classes: buying or not buying.
o In Learning
Nowadays, robots are being trained for learning and talking to simulate human work, and it can
also be considered a classification problem. In this case, LDA builds similar groups on the
basis of different parameters, including pitches, frequencies, sound, tunes, etc.
Difference between Linear Discriminant Analysis and PCA
o PCA is an unsupervised algorithm that does not care about classes and labels and only aims to
find the principal components to maximize the variance in the given dataset. At the same time,
LDA is a supervised algorithm that aims to find the linear discriminants to represent the axes
that maximize separation between different classes of data.
o LDA is much more suitable for multi-class classification tasks compared to PCA. However, PCA is assumed to perform better when the sample size is comparatively small.
o Both LDA and PCA are used as dimensionality reduction techniques; in practice, PCA is often applied first, followed by LDA.
Support Vector Machine or SVM is one of the most popular Supervised Learning algorithms, which is
used for Classification as well as Regression problems. However, primarily, it is used for Classification
problems in Machine Learning.
The goal of the SVM algorithm is to create the best line or decision boundary that can segregate
n-dimensional space into classes so that we can easily put the new data point in the correct category in
the future. This best decision boundary is called a hyperplane.
SVM chooses the extreme points/vectors that help in creating the hyperplane. These extreme cases are called support vectors, and hence the algorithm is termed the Support Vector Machine.
Consider the below diagram in which there are two different categories that are classified using a
decision boundary or hyperplane:
Example: SVM can be understood with the example that we used for the KNN classifier. Suppose we see a strange cat that also has some features of dogs; if we want a model that can accurately identify whether it is a cat or a dog, such a model can be created using the SVM algorithm. We first train our model with lots of images of cats and dogs so that it can learn their different features, and then we test it on this strange creature. The SVM creates a decision boundary between the two classes (cat and dog) and chooses the extreme cases (support vectors); on the basis of these support vectors, it classifies the new example as a cat. Consider the below diagram:
SVM algorithm can be used for Face detection, image classification, text categorization, etc.
Types of SVM
o Linear SVM: Linear SVM is used for linearly separable data, which means that if a dataset can be classified into two classes by using a single straight line, then such data is termed linearly separable data, and the classifier used is called a Linear SVM classifier.
o Non-linear SVM: Non-Linear SVM is used for non-linearly separable data, which means that if a dataset cannot be classified by using a straight line, then such data is termed non-linear data, and the classifier used is called a Non-linear SVM classifier.
Hyperplane: There can be multiple lines/decision boundaries to segregate the classes in n-dimensional
space, but we need to find out the best decision boundary that helps to classify the data points. This
best boundary is known as the hyperplane of SVM.
The dimension of the hyperplane depends on the number of features present in the dataset: if there are 2 features (as shown in the image), the hyperplane is a straight line, and if there are 3 features, the hyperplane is a 2-dimensional plane.
We always create the hyperplane that has the maximum margin, which means the maximum distance between the data points of the two classes.
Support Vectors:
The data points or vectors that are closest to the hyperplane and which affect the position of the hyperplane are termed support vectors. Since these vectors support the hyperplane, they are called support vectors.
Linear SVM:
The working of the SVM algorithm can be understood by using an example. Suppose we have a
dataset that has two tags (green and blue), and the dataset has two features x1 and x2. We want a
classifier that can classify the pair(x1, x2) of coordinates in either green or blue. Consider the below
image:
Since this is a 2-D space, we can easily separate these two classes just by using a straight line. But there can be multiple lines that can separate these classes. Consider the below image:
Hence, the SVM algorithm helps to find the best line or decision boundary; this best boundary or
region is called as a hyperplane. SVM algorithm finds the closest point of the lines from both the
classes. These points are called support vectors. The distance between the vectors and the hyperplane
is called as margin. And the goal of SVM is to maximize this margin. The hyperplane with maximum
margin is called the optimal hyperplane.
Non-Linear SVM:
If data is linearly arranged, then we can separate it by using a straight line, but for non-linear data, we
cannot draw a single straight line. Consider the below image:
So to separate these data points, we need to add one more dimension. For linear data, we have used
two dimensions x and y, so for non-linear data, we will add a third dimension z. It can be calculated as:
z = x² + y²
By adding the third dimension, the sample space will become as below image:
So now, SVM will divide the datasets into classes in the following way. Consider the below image:
Since we are in 3-D space, the decision boundary looks like a plane parallel to the x-axis. If we convert it back to 2-D space with z = 1, it becomes a circle of radius 1, which separates the non-linear data.
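A minimal scikit-learn sketch of the same idea on made-up circular data: a linear SVM struggles, while an RBF-kernel SVM handles the non-linear boundary without adding the z dimension by hand:

import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Toy non-linear data: class 0 inside the unit circle, class 1 outside it.
X = rng.uniform(-2, 2, size=(200, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 1).astype(int)   # the z = x^2 + y^2 idea

linear_svm = SVC(kernel="linear").fit(X, y)   # no straight line separates the classes well
rbf_svm = SVC(kernel="rbf").fit(X, y)         # the kernel handles the circular boundary

print("linear accuracy:", linear_svm.score(X, y))
print("rbf accuracy:", rbf_svm.score(X, y))
print("support vectors per class:", rbf_svm.n_support_)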
SVM Applications
1. Face detection
2. Image classification
3. Text categorization
Limitations of SVM
1. It is sensitive to noise.
2. The optimal design for multiclass SVM classifiers is also a research area.
Soft-margin SVM (slack variables):
1. A slack variable equal to 0 corresponds to a point that is correctly classified and outside the margin.
2. A slack variable > 0 corresponds to a point inside the margin or on the wrong side of the hyperplane.
3. C is the trade-off between the slack variable penalty and the margin.
Decision Tree
There are various algorithms in machine learning, so choosing the best algorithm for the given dataset and problem is the main point to remember while creating a machine learning model. Below are two reasons for using the Decision tree:
o Decision Trees usually mimic human thinking ability while making a decision, so it is easy to
understand.
o The logic behind the decision tree can be easily understood because it shows a tree-like
structure.
Root Node: Root node is from where the decision tree starts. It represents the entire dataset, which
further gets divided into two or more homogeneous sets.
Leaf Node: Leaf nodes are the final output node, and the tree cannot be segregated further after
getting a leaf node.
Splitting: Splitting is the process of dividing the decision node/root node into sub-nodes according
to the given conditions.
Branch/Sub Tree: A tree formed by splitting the tree.
Pruning: Pruning is the process of removing the unwanted branches from the tree.
Parent/Child node: The root node of the tree is called the parent node, and other nodes are called
the child nodes.
In a decision tree, for predicting the class of the given dataset, the algorithm starts from the root node
of the tree. This algorithm compares the values of root attribute with the record (real dataset) attribute
and, based on the comparison, follows the branch and jumps to the next node.
For the next node, the algorithm again compares the attribute value with the other sub-nodes and moves further. It continues this process until it reaches a leaf node of the tree. The complete process can be better understood using the below algorithm:
o Step-1: Begin the tree with the root node, says S, which contains the complete dataset.
o Step-2: Find the best attribute in the dataset using Attribute Selection Measure (ASM).
o Step-3: Divide S into subsets that contain the possible values for the best attribute.
o Step-4: Generate the decision tree node, which contains the best attribute.
o Step-5: Recursively make new decision trees using the subsets of the dataset created in step 3. Continue this process until a stage is reached where you cannot further classify the nodes; the final nodes are called leaf nodes.
Example: Suppose there is a candidate who has a job offer and wants to decide whether he should
accept the offer or Not. So, to solve this problem, the decision tree starts with the root node (Salary
attribute by ASM). The root node splits further into the next decision node (distance from the office)
and one leaf node based on the corresponding labels. The next decision node further gets split into one
decision node (Cab facility) and one leaf node. Finally, the decision node splits into two leaf nodes
(Accepted offers and Declined offer). Consider the below diagram:
Attribute Selection Measures
While implementing a Decision tree, the main issue that arises is how to select the best attribute for the root node and for the sub-nodes. To solve such problems there is a technique called the Attribute Selection Measure, or ASM. By this measurement, we can easily select the best attribute for the nodes of the tree. There are two popular techniques for ASM, which are:
o Information Gain
o Gini Index
1. Information Gain:
o Information gain is the measurement of changes in entropy after the segmentation of a dataset
based on an attribute.
o It calculates how much information a feature provides us about a class.
o According to the value of information gain, we split the node and build the decision tree.
o A decision tree algorithm always tries to maximize the value of information gain, and a
node/attribute having the highest information gain is split first. It can be calculated using the
below formula:
Information Gain = Entropy(S) − (Weighted Avg) × Entropy(each feature)
Entropy: Entropy is a metric to measure the impurity in a given attribute. It specifies the randomness in the data. Entropy can be calculated as:
Entropy(S) = −P(yes) log2 P(yes) − P(no) log2 P(no)
Where,
o S = total number of samples
o P(yes) = probability of yes
o P(no) = probability of no
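A small Python sketch of these two formulas on made-up yes/no counts, computing the entropy of a split and the resulting information gain:

import math

def entropy(p_yes, p_no):
    # Entropy of a node with the given class probabilities (0 * log(0) treated as 0).
    total = 0.0
    for p in (p_yes, p_no):
        if p > 0:
            total -= p * math.log2(p)
    return total

# Toy example: parent node S has 9 "yes" and 5 "no" samples.
parent = entropy(9 / 14, 5 / 14)

# A candidate split into two children: (6 yes, 2 no) and (3 yes, 3 no).
child1 = entropy(6 / 8, 2 / 8)
child2 = entropy(3 / 6, 3 / 6)
weighted = (8 / 14) * child1 + (6 / 14) * child2

print("Information Gain =", parent - weighted)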
2. Gini Index:
o Gini Index is used to determine the best feature to split the data on at every node of the tree.
o It is a measure of how mixed or impure a dataset is.
o Gini index is a measure of inequality or impurity used while creating a decision tree in the CART (Classification and Regression Tree) algorithm. It can be written as Gini Index = 1 − Σj Pj², where Pj is the probability of class j at the node.
o An attribute with a low Gini index should be preferred over one with a high Gini index.
o It only creates binary splits, and the CART algorithm uses the Gini index to create those binary splits.
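Both criteria are exposed by scikit-learn's decision tree, as in the short sketch below (the Iris dataset is used only as a convenient example):

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# CART-style tree using the Gini index (the default criterion).
gini_tree = DecisionTreeClassifier(criterion="gini", max_depth=3).fit(X, y)

# The same tree grown with entropy / information gain instead.
entropy_tree = DecisionTreeClassifier(criterion="entropy", max_depth=3).fit(X, y)

print("gini tree accuracy:", gini_tree.score(X, y))
print("entropy tree accuracy:", entropy_tree.score(X, y))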
Pruning is a process of deleting the unnecessary nodes from a tree in order to get the optimal decision
tree.
A too-large tree increases the risk of overfitting, and a small tree may not capture all the important
features of the dataset.
Therefore, a technique that decreases the size of the learning tree without reducing accuracy is known
as Pruning.
o It is simple to understand, as it follows the same process that a human follows while making any decision in real life.
o It can be very useful for solving decision-related problems.
o It may have an overfitting issue, which can be resolved using the Random Forest algorithm.
o For more class labels, the computational complexity of the decision tree may increase.
Random Forest
Random Forest is based on the concept of ensemble learning, which is a process of combining multiple classifiers to solve a complex problem and to improve the performance of the model.
As the name suggests, "Random Forest is a classifier that contains a number of decision trees on
various subsets of the given dataset and takes the average to improve the predictive accuracy of that
dataset."
Instead of relying on one decision tree, the random forest takes the prediction from each tree and, based on the majority vote of the predictions, predicts the final output.
A greater number of trees in the forest leads to higher accuracy and prevents the problem of overfitting.
The below diagram explains the working of the Random Forest algorithm:
Assumptions for Random Forest
Since the random forest combines multiple trees to predict the class of the dataset, it is possible that
some decision trees may predict the correct output, while others may not.
But together, all the trees predict the correct output. Therefore, below are two assumptions for a better
Random forest classifier:
o There should be some actual values in the feature variable of the dataset so that the classifier
can predict accurate results rather than a guessed result.
o The predictions from each tree must have very low correlations.
Below are some points that explain why we should use the Random Forest algorithm:
o It predicts output with high accuracy, and even for a large dataset it runs efficiently.
Random Forest works in two phases: the first is to create the random forest by combining N decision trees, and the second is to make predictions with each tree created in the first phase.
The working process can be explained in the below steps and diagram:
Step-1: Select K random data points from the training set.
Step-2: Build the decision trees associated with the selected data points (subsets).
Step-3: Choose the number N of decision trees that you want to build.
Step-4: Repeat Step 1 and Step 2.
Step-5: For new data points, find the predictions of each decision tree, and assign the new data point to the category that wins the majority of votes.
The working of the algorithm can be better understood by the below example:
Example: Suppose there is a dataset that contains multiple fruit images. So, this dataset is given to the
Random forest classifier. The dataset is divided into subsets and given to each decision tree. During the
training phase, each decision tree produces a prediction result, and when a new data point occurs, then
based on the majority of results, the Random Forest classifier predicts the final decision. Consider the
below image:
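A minimal sketch of this majority-voting procedure with scikit-learn's RandomForestClassifier (the Iris dataset stands in for the fruit images; N is set via n_estimators):

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# N = 100 decision trees, each trained on a random bootstrap subset of the data.
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

# Each tree votes; the forest returns the majority class.
print("accuracy:", forest.score(X_test, y_test))
print("prediction for one sample:", forest.predict(X_test[:1]))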
Applications of Random Forest
There are mainly four sectors where Random Forest is mostly used:
1. Banking: Banking sector mostly uses this algorithm for the identification of loan risk.
2. Medicine: With the help of this algorithm, disease trends and risks of the disease can be
identified.
3. Land Use: We can identify the areas of similar land use by this algorithm.
4. Marketing: Marketing trends can be identified using this algorithm.
o It enhances the accuracy of the model and prevents the overfitting issue.
o Although random forest can be used for both classification and regression tasks, it is not as well suited to regression tasks.