ML UNIT 1
The term machine learning was first coined in the 1950s when Artificial
Intelligence pioneer Arthur Samuel built the first self-learning system for playing
checkers. He noticed that the more the system played, the better it performed.
Defn:
Machine learning (ML) is a branch of artificial intelligence (AI) that enables computers to “self-
learn” from training data and improve over time, without being explicitly programmed.
Top companies such as Netflix and Amazon have built machine learning models
that use vast amounts of data to analyse user interests and recommend
products accordingly.
How do you think Google Maps predicts peaks in traffic, and Netflix creates
personalized movie recommendations and even informs the creation of new
content? By using machine learning, of course.
There are many different applications of machine learning, which can benefit
your business in countless ways. You’ll just need to define a strategy to help you
decide the best way to implement machine learning into your existing processes.
In the meantime, here are some common machine learning use cases and
applications that might spark some ideas:
Social Media Monitoring
Using machine learning you can monitor mentions of your brand on social
media and immediately identify whether customers require urgent attention. By
detecting mentions from angry customers in real time, you can automatically
tag customer feedback and respond right away.
Customer Service Automation
Machine learning allows you to integrate powerful text analysis tools with
customer support tools, so you can analyze your emails, live chats, and all
manner of internal data on the go. You can use machine learning to tag support
tickets and route them to the correct teams, or to auto-respond to common queries,
so you never leave a customer in the cold.
Image Recognition
Virtual Assistants
Virtual assistants like Siri, Alexa, and Google Now all make use of machine learning
to automatically process and answer voice requests. They quickly scan
information, remember related queries, learn from previous interactions, and
send commands to other apps, so they can collect information and deliver the
most effective answer.
Customer support teams are already using virtual assistants to handle phone
calls, automatically route support tickets to the correct teams, and speed up
interactions with customers via computer-generated responses.
Product Recommendations
Recommendation engines use machine learning to analyze a customer's purchase
history and suggest items they are likely to buy. Association rules can also be useful
to plan a marketing campaign or analyze web usage.
Medical Diagnosis
The ability of machines to find patterns in complex data is shaping the present
and future. Take machine learning initiatives during the COVID-19 outbreak, for
instance. AI tools have helped predict how the virus will spread over time and have
shaped how we control it. They have also helped diagnose patients by analyzing lung
CTs, detect fevers using facial recognition, and identify patients at a higher risk of
developing serious respiratory disease.
Types of Machine Learning
1. Supervised learning
2. Unsupervised learning
3. Reinforcement learning
1. Supervised Learning
Overview:
Supervised learning is a type of machine learning that uses labeled data to train
machine learning models. In labeled data, the output is already known. The model
just needs to map the inputs to the respective outputs.
For example, a model trained on labeled images can identify whether a new
picture shows a cat.
Algorithms:
Logistic Regression
K Nearest Neighbor
Decision Tree
Random Forest
Naive Bayes
Working:
Supervised learning algorithms take labeled inputs and map them to the known
outputs, which means you already know the target variable.
During training, the model repeatedly compares its predictions on the labeled
examples with the known outputs and adjusts itself until its predictions match
the labels closely.
Applications:
Supervised learning algorithms are generally used for solving classification and
regression problems.
A few of the top supervised learning applications are weather prediction, sales
forecasting, and stock price analysis.
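The following is a minimal sketch of this workflow in Python with scikit-learn (an assumed library choice, not one named in these notes); the built-in Iris dataset stands in for any labeled data, and Logistic Regression is taken from the algorithm list above.

# Minimal supervised learning sketch: labeled inputs are mapped to known outputs.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)            # X = inputs, y = known labels (target variable)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

model = LogisticRegression(max_iter=1000)     # one of the algorithms listed above
model.fit(X_train, y_train)                   # learn the mapping from inputs to known outputs

print("accuracy on unseen data:", model.score(X_test, y_test))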
2. Unsupervised Learning
Overview:
Unsupervised learning is a type of machine learning that uses unlabeled data to train
machines. Unlabeled data doesn’t have a fixed output variable. The model learns
from the data, discovers the patterns and features in the data, and returns the
output.
Algorithms:
Selecting the right algorithm depends on the type of problem you are trying to solve.
Some of the common examples of unsupervised learning are:
K-Means Clustering
Hierarchical Clustering
DBSCAN
Working:
Unsupervised learning finds patterns and understands the trends in the data to
discover the output. So, the model tries to label the data based on the features of the
input data.
The training process used in unsupervised learning techniques does not need any
supervision to build models. They learn on their own and predict the output.
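Below is a minimal sketch of unsupervised learning with K-Means (one of the algorithms listed above); the data points are made up for illustration, no labels are provided, and the model groups the points purely from their features.

import numpy as np
from sklearn.cluster import KMeans

# Unlabeled data: two visually obvious groups, but no output variable is given.
X = np.array([[1.0, 2.0], [1.5, 1.8], [1.2, 2.2],
              [8.0, 8.0], [8.5, 7.5], [7.8, 8.3]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("cluster labels discovered by the model:", kmeans.labels_)
print("cluster centers:", kmeans.cluster_centers_)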
Applications:
Common unsupervised learning applications include customer segmentation, anomaly
detection, and market basket analysis.
3. Reinforcement Learning
Overview
Reinforcement Learning trains a machine to take suitable actions and maximize its
rewards in a particular situation. It uses an agent and an environment to produce
actions and rewards. The agent has a start and an end state. But, there might be
different paths for reaching the end state, like a maze. In this learning technique,
there is no predefined target variable.
Algorithms
1. Q-learning
2. Sarsa
3. Monte Carlo
4. Deep Q network
Working
Reinforcement learning follows trial-and-error methods to get the desired result. After
accomplishing a task, the agent receives a reward. An example is training a dog to
catch a ball: if the dog learns to catch the ball, you give it a reward, such as a
biscuit.
Reinforcement learning problems are reward-based. For every task or for every step
completed, there will be a reward received by the agent. If the task is not achieved
correctly, there will be some penalty added.
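Below is a minimal sketch of tabular Q-learning (the first algorithm listed above) on a made-up one-dimensional "corridor" environment; the states, reward scheme, and hyperparameters are illustrative assumptions, not values given in these notes.

import random

# Tiny environment: states 0..4, the agent starts at state 0, the goal is state 4.
N_STATES = 5
ACTIONS = [-1, +1]                            # move left or move right
ALPHA, GAMMA, EPSILON, EPISODES = 0.1, 0.9, 0.2, 300

Q = [[0.0, 0.0] for _ in range(N_STATES)]     # Q[state][action index]

def greedy(state):
    # pick the best-known action, breaking ties randomly so the agent keeps exploring
    best = max(Q[state])
    return random.choice([i for i, v in enumerate(Q[state]) if v == best])

for _ in range(EPISODES):
    state = 0
    while state != N_STATES - 1:
        a = random.randrange(2) if random.random() < EPSILON else greedy(state)
        next_state = min(max(state + ACTIONS[a], 0), N_STATES - 1)
        reward = 1.0 if next_state == N_STATES - 1 else 0.0   # reward only at the goal
        # Q-learning update: move Q[state][a] toward reward + discounted best future value
        Q[state][a] += ALPHA * (reward + GAMMA * max(Q[next_state]) - Q[state][a])
        state = next_state

print("learned action values per state:", [[round(v, 2) for v in row] for row in Q])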
Applications
Reinforcement learning algorithms are widely used in the gaming industry to build
games. They are also used to train robots to perform human tasks.
LEARNING ASSOCIATIONS
Association rule mining finds interesting associations and relationships among
large sets of data items. This rule shows how frequently an itemset occurs in a
transaction. A typical example is Market Basket Analysis.
Market Basket Analysis is one of the key techniques used by large retailers to uncover
associations between items. It allows retailers to identify relationships between the
items that people buy together frequently.
Given a set of transactions, we can find rules that will predict the occurrence of
an item based on the occurrences of other items in the transaction.
TID Items
1 Bread, Milk
Before we start defining the rule, let us first see the basic definitions.
Support Count (σ) – Frequency of occurrence of an itemset.
Support – Fraction of transactions that contain an itemset.
Confidence (for a rule X → Y) – How often the items in Y appear in transactions that
also contain X, i.e. Support(X ∪ Y) / Support(X).
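As a quick illustration of these definitions, here is a minimal sketch that computes support and confidence over a small, made-up transaction list (the table above shows only one transaction, so the numbers below are purely illustrative).

# Support and confidence for association rules on toy transactions.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Butter"},
    {"Milk", "Butter", "Bread"},
    {"Milk"},
]

def support(itemset):
    """Fraction of transactions that contain every item in the itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent):
    """For a rule X -> Y: support(X and Y) / support(X)."""
    return support(set(antecedent) | set(consequent)) / support(antecedent)

print("support({Bread, Milk}) =", support({"Bread", "Milk"}))          # 2/4 = 0.5
print("confidence(Bread -> Milk) =", confidence({"Bread"}, {"Milk"}))  # 2/3 ≈ 0.67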
Classification
As we know, supervised machine learning algorithms can be broadly classified into
Regression and Classification algorithms. Regression algorithms predict the output for
continuous values, but to predict categorical values, we need Classification
algorithms.
Classification Algorithm
The Classification algorithm is a Supervised Learning technique that is used to identify
the category of new observations on the basis of training data. In Classification, a
program learns from the given dataset or observations and then classifies new
observations into a number of classes or groups, such as Yes or No, 0 or 1, Spam or
Not Spam, cat or dog, etc. Classes can be called targets, labels, or categories.
Unlike regression, the output variable of Classification is a category, not a numeric value,
such as "Green or Blue", "fruit or animal", etc. Since the Classification algorithm is a
supervised learning technique, it takes labeled input data, which means each input
comes with its corresponding output.
The main goal of the Classification algorithm is to identify the category of a given
dataset, and these algorithms are mainly used to predict the output for the categorical
data.
Consider a dataset with two classes, Class A and Class B. The data points within each
class have features that are similar to each other and dissimilar to the points of the
other class, and the classifier's task is to learn the boundary that separates them.
The algorithm which implements the classification on a dataset is known as a classifier.
There are two types of Classifications:
o Binary Classifier: If the classification problem has only two possible outcomes, then it
is called a Binary Classifier.
Examples: YES or NO, MALE or FEMALE, SPAM or NOT SPAM, CAT or DOG, etc.
o Multi-class Classifier: If a classification problem has more than two outcomes, then it
is called a Multi-class Classifier.
Examples: classification of types of crops, classification of types of music.
In classification problems, there are two types of learners:
1. Lazy Learners: A lazy learner first stores the training dataset and waits until it receives
the test dataset. In the lazy learner's case, classification is done on the basis of the most
closely related data stored in the training dataset. It takes less time in training but more
time for predictions.
Examples: K-NN algorithm, Case-based reasoning.
2. Eager Learners: Eager learners develop a classification model from the training
dataset before receiving a test dataset. Opposite to lazy learners, an eager learner takes
more time in learning and less time in prediction. Examples: Decision Trees, Naïve
Bayes, ANN.
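The contrast above can be sketched in a few lines of Python; scikit-learn and the Iris dataset are assumed choices for illustration, with K-NN standing in for a lazy learner and a decision tree for an eager one.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

lazy = KNeighborsClassifier(n_neighbors=3)       # "training" mostly just stores the data
eager = DecisionTreeClassifier(random_state=0)   # builds a classification model up front

for name, clf in [("K-NN (lazy)", lazy), ("Decision Tree (eager)", eager)]:
    clf.fit(X_train, y_train)
    print(name, "accuracy:", clf.score(X_test, y_test))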
Regression
Regression analysis is a statistical method for modelling the relationship between a
dependent (target) variable and one or more independent (predictor) variables. More
specifically, regression analysis helps us understand how the value of the dependent
variable changes with respect to one independent variable while the other independent
variables are held fixed. It predicts continuous/real values such as temperature, age,
salary, price, etc.
We can understand the concept of regression analysis using the following example:
suppose a company records its yearly advertising spend together with the sales achieved
that year. The company now plans to spend $200 on advertising in 2019 and wants a
prediction of the sales for that year. To solve such prediction problems in machine
learning, we need regression analysis.
Unsupervised Learning
As the name suggests, unsupervised learning is a machine learning technique in which
models are not supervised using a labeled training dataset. Instead, the model itself finds
the hidden patterns and insights in the given data. It can be compared to the learning
that takes place in the human brain while learning new things. It can be defined as:
Unsupervised learning is a type of machine learning in which models are trained using unlabeled
dataset and are allowed to act on that data without any supervision.
Here, we take unlabeled input data, which means it is not categorized and no
corresponding outputs are given. This unlabeled input data is fed to the machine
learning model in order to train it. The model first interprets the raw data to find
hidden patterns and then applies a suitable algorithm such as k-means clustering,
decision trees, etc.
Once the suitable algorithm is applied, it divides the data objects into groups according
to the similarities and differences between the objects.
o Clustering: Clustering is a method of grouping objects into clusters such that the
objects with the most similarities remain in one group and have few or no similarities
with the objects of another group. Cluster analysis finds the commonalities between the
data objects and categorizes them according to the presence and absence of those
commonalities.
o Association: An association rule is an unsupervised learning method used for finding
relationships between variables in a large database. It determines the sets of items that
occur together in the dataset. Association rules make marketing strategies more
effective; for example, people who buy item X (say, bread) also tend to purchase item Y
(butter or jam). A typical example of association rules is Market Basket Analysis.
Supervised Learning
In supervised learning, the training data provided to the machine works as the
supervisor that teaches the machine to predict the output correctly. It applies the
same concept as a student learning under the supervision of a teacher.
Supervised learning is a process of providing input data as well as correct output data
to the machine learning model. The aim of a supervised learning algorithm is to find
a mapping function to map the input variable(x) with the output variable(y).
In the real-world, supervised learning can be used for Risk Assessment, Image
classification, Fraud Detection, spam filtering, etc.
The working of Supervised learning can be easily understood by the below example
and diagram:
Suppose we have a dataset of different types of shapes which includes square,
rectangle, triangle, and Polygon. Now the first step is that we need to train the model
for each shape.
o If the given shape has four sides, and all the sides are equal, then it will be labelled as
a Square.
o If the given shape has three sides, then it will be labelled as a triangle.
o If the given shape has six equal sides then it will be labelled as hexagon.
Now, after training, we test our model using the test set, and the task of the model is
to identify the shape.
The machine is already trained on all types of shapes, and when it finds a new shape,
it classifies the shape on the basis of the number of sides and predicts the output.
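The rules described above can be written out directly. The sketch below is a hand-coded stand-in for what a supervised model would learn from labeled shape examples; the function name and the shape encoding (number of sides plus whether all sides are equal) are assumptions made for illustration.

def classify_shape(num_sides, all_sides_equal):
    # Rules corresponding to the labeled training examples described above.
    if num_sides == 4 and all_sides_equal:
        return "Square"
    if num_sides == 3:
        return "Triangle"
    if num_sides == 6 and all_sides_equal:
        return "Hexagon"
    return "Unknown shape"

print(classify_shape(4, True))   # Square
print(classify_shape(3, False))  # Triangle
print(classify_shape(6, True))   # Hexagon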
1. Regression
Regression algorithms are used if there is a relationship between the input variable
and the output variable. It is used for the prediction of continuous variables, such as
Weather forecasting, Market Trends, etc. Below are some popular Regression
algorithms which come under supervised learning:
o Linear Regression
o Non-Linear Regression
o Polynomial Regression
2. Classification
Classification algorithms are used when the output variable is categorical, meaning it
takes values from a set of classes such as Yes-No, Male-Female, True-False, etc., as in
spam filtering. Below are some popular Classification algorithms which come under
supervised learning:
o Random Forest
o Decision Trees
o Support Vector Machines
Advantages of Supervised learning:
o With the help of supervised learning, the model can predict the output on the basis of
prior experiences.
o Supervised learning model helps us to solve various real-world problems such as fraud
detection, spam filtering, etc.
Supervised learning vs. unsupervised learning:
o A supervised learning model takes direct feedback to check whether it is predicting the
correct output or not; an unsupervised learning model does not take any feedback.
o A supervised learning model predicts the output; an unsupervised learning model finds
the hidden patterns in data.
o A supervised learning model produces an accurate result; an unsupervised learning
model may give a less accurate result compared to supervised learning.
Concept Learning
Concept Learning in Machine Learning can be thought of as a boolean-valued function
defined over a large set of training data. Taking a very simple example, one possible target
concept may be to find the days on which my friend Ramesh enjoys his favorite sport. We
have some attributes/features of each day, such as Sky, Air Temperature, Humidity, Wind,
Water, and Forecast, and based on these we have a target concept named EnjoySport.
Task T: Learn to predict the value of EnjoySport for an arbitrary day, based on the
values of the attributes of the day.
x1 (a positive example, x = 1): <Sunny, Warm, Normal, Strong, Warm, Same>
Note: x = 1 indicates a positive example, i.e. a day on which EnjoySport = Yes.
We want to find the most suitable hypothesis which can represent the concept. For
example, the concept "Ramesh enjoys his favorite sport only on cold days with high
humidity" (independent of the values of the other attributes in the training examples)
would be represented by the hypothesis <?, Cold, High, ?, ?, ?>.
Here ? indicates that any value of the attribute is acceptable. Note: the most general
hypothesis is <?, ?, ?, ?, ?, ?>, under which every day is a positive example, and the
most specific hypothesis is <Ø, Ø, Ø, Ø, Ø, Ø>, under which no day is a positive example.
We will discuss the two most popular approaches to find a suitable hypothesis, they
are:
1. Find-S Algorithm
2. List-Then-Eliminate Algorithm
Find-S Algorithm:
Following are the steps for the Find-S algorithm:
1. Initialize h to the most specific hypothesis in H: <Ø, Ø, Ø, Ø, Ø, Ø>.
2. For each positive training example x, and for each attribute constraint ai in h: if the
constraint is satisfied by x, do nothing; otherwise replace ai by the next more general
constraint that is satisfied by x. (Negative examples are simply ignored.)
3. Output the hypothesis h.
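A minimal Python sketch of Find-S on the EnjoySport data follows; only the first training example appears explicitly in these notes, so the second positive example and the negative example below are illustrative additions.

def find_s(examples):
    # Step 1: start with the most specific hypothesis (all attributes = Ø).
    h = ["Ø"] * len(examples[0][0])
    for attributes, label in examples:
        if label != "Yes":            # negative examples are ignored
            continue
        # Step 2: generalize h just enough to cover this positive example.
        for i, value in enumerate(attributes):
            if h[i] == "Ø":
                h[i] = value
            elif h[i] != value:
                h[i] = "?"
    return h

training_data = [
    (["Sunny", "Warm", "Normal", "Strong", "Warm", "Same"], "Yes"),
    (["Sunny", "Warm", "High",   "Strong", "Warm", "Same"], "Yes"),
    (["Rainy", "Cold", "High",   "Strong", "Warm", "Change"], "No"),
]
print(find_s(training_data))   # ['Sunny', 'Warm', '?', 'Strong', 'Warm', 'Same']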
Perceptron
Frank Rosenblatt (1928 – 1971) was an American psychologist notable in
the field of Artificial Intelligence.
Scientists had discovered that brain cells (Neurons) receive input from our
senses by electrical signals.
The neurons, in turn, use electrical signals to store information and to
make decisions based on previous input.
Frank had the idea that Perceptrons could simulate brain principles, with
the ability to learn and make decisions.
The Perceptron
The original Perceptron was designed to take a number of binary inputs,
and produce one binary output (0 or 1).
Example: weigh each binary input, sum the weighted inputs, and compare the sum with a threshold.
Threshold = 1.5
x1 * w1 = 1 * 0.7 = 0.7
x2 * w2 = 0 * 0.6 = 0
x3 * w3 = 1 * 0.5 = 0.5
x4 * w4 = 0 * 0.3 = 0
x5 * w5 = 1 * 0.4 = 0.4
Sum = 0.7 + 0 + 0.5 + 0 + 0.4 = 1.6
Return true if the sum > 1.5; here 1.6 > 1.5, so the output is "Yes, I will go to the Concert".
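This decision rule can be written directly in Python; the sketch below simply reuses the inputs, weights, and threshold from the worked example above.

def perceptron(inputs, weights, threshold):
    # Weighted sum of the binary inputs, compared against the threshold.
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return weighted_sum > threshold, weighted_sum

inputs = [1, 0, 1, 0, 1]                  # binary inputs x1..x5
weights = [0.7, 0.6, 0.5, 0.3, 0.4]       # weights w1..w5
fires, s = perceptron(inputs, weights, threshold=1.5)
print("weighted sum =", s)                # 1.6
print("output:", "Yes, I will go to the Concert" if fires else "No")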
Linear separability
An ANN does not give an exact solution for a nonlinear problem. However, it
provides an approximate solution to nonlinear problems. Linear separability is the
concept wherein the separation of input space into regions is based on whether the
network response is positive or negative.
A decision line is drawn to separate positive and negative responses. The decision
line may also be called the decision-making line, decision-support line, or linear-
separability line. The concept of linear separability is needed in order to classify
patterns based upon their output responses.
Generally, the net input calculated at the output unit is given as
y_in = b + Σ (xi * wi), where the sum runs over i = 1 to n.
The linear separability of the network is based on the decision-boundary line. If there
exist weights for which all the training input vectors with a positive (correct, +1)
response lie on one side of the decision boundary and all the vectors with a negative
(−1) response lie on the other side, then we can conclude that the problem is
"linearly separable".
Consider a single-layer network with two inputs, x1 and x2.
y_in = b + x1*w1 + x2*w2
The separating line, i.e. the boundary on one side of which the net gives a positive
response and on the other side a negative response, is obtained by setting y_in = 0:
b + x1*w1 + x2*w2 = 0, which gives x2 = −(w1/w2)*x1 − (b/w2) (assuming w2 ≠ 0).
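As a concrete illustration, the sketch below checks on which side of such a decision line the four input patterns of the AND function fall; the weights and bias are an illustrative choice (not derived in these notes) for which AND is linearly separable.

def net_input(x1, x2, w1=1.0, w2=1.0, b=-1.5):
    # y_in = b + x1*w1 + x2*w2 for a single-layer network with two inputs.
    return b + x1 * w1 + x2 * w2

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    y_in = net_input(x1, x2)
    side = "positive side" if y_in > 0 else "negative side"
    print(f"({x1}, {x2}) -> y_in = {y_in:+.1f} ({side})")
# Only (1, 1) lands on the positive side of the line x1 + x2 - 1.5 = 0,
# so the AND patterns are separated by this single decision boundary.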
Linear Regression
Linear regression is one of the easiest and most popular Machine Learning algorithms.
It is a statistical method that is used for predictive analysis. Linear regression makes
predictions for continuous/real or numeric variables such as sales, salary, age,
product price, etc.
The linear regression algorithm shows a linear relationship between a dependent (y)
variable and one or more independent (x) variables, hence the name linear regression.
Since linear regression shows a linear relationship, it describes how the value of the
dependent variable changes according to the value of the independent variable.
The linear regression model provides a sloped straight line representing the
relationship between the variables, written as:
y = a0 + a1*x + ε
Here,
y = dependent variable (target), x = independent variable (predictor), a0 = intercept
of the line, a1 = linear regression coefficient (slope of the line), and ε = random error.
The values of the x and y variables form the training dataset used to fit the linear
regression model.
Different values of the weights or line coefficients (a0, a1) give different regression
lines, so we need to calculate the best values for a0 and a1 to find the best-fit line;
to calculate these we use a cost function.
Cost function-
o Different values of the weights or line coefficients (a0, a1) give different regression
lines, and the cost function is used to estimate the values of the coefficients for the
best-fit line.
o The cost function optimizes the regression coefficients or weights. It measures how
well a linear regression model is performing.
o We can use the cost function to find the accuracy of the mapping function, which
maps the input variable to the output variable. This mapping function is also known
as Hypothesis function.
For Linear Regression, we use the Mean Squared Error (MSE) cost function, which is
the average of the squared errors between the predicted values and the actual values.
It can be written as:
MSE = (1/N) * Σ (yi − (a0 + a1*xi))², where the sum runs over the N training examples.
Residuals: The distance between an actual value and the corresponding predicted value
is called a residual. If the observed points are far from the regression line, the residuals
will be high, and so will the cost function. If the scatter points are close to the regression
line, the residuals will be small and hence the cost function will be low.
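To make the cost function concrete, here is a minimal sketch that fits y = a0 + a1*x by minimizing the MSE with gradient descent; the data points, learning rate, and iteration count are illustrative assumptions, not values given in these notes.

# Fit y = a0 + a1*x by gradient descent on the MSE cost function.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 4.1, 6.2, 7.9, 10.1]          # roughly y = 2x with a little noise

a0, a1 = 0.0, 0.0                        # intercept and slope, initialized at zero
lr, epochs, n = 0.01, 5000, len(xs)

for _ in range(epochs):
    preds = [a0 + a1 * x for x in xs]
    errors = [p - y for p, y in zip(preds, ys)]
    # Gradients of MSE = (1/n) * Σ (a0 + a1*x - y)^2 with respect to a0 and a1.
    grad_a0 = (2 / n) * sum(errors)
    grad_a1 = (2 / n) * sum(e * x for e, x in zip(errors, xs))
    a0 -= lr * grad_a0
    a1 -= lr * grad_a1

mse = sum((a0 + a1 * x - y) ** 2 for x, y in zip(xs, ys)) / n
print(f"best-fit line: y = {a0:.2f} + {a1:.2f}*x, MSE = {mse:.4f}")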