Machine Learning Report
Project Report
on
MACHINE LEARNING
taken at
“INTERNSHALA”
Bachelor of Technology
2021-22
Deepest thanks to our trainer, Mr. Sarvesh Agarwal, for his guidance,
good content, quality education, and for making such a good course for
understanding Machine Learning. He included quizzes and questions
after the topics throughout the course so that I could also check my
own learning.
Preface
Supervised Learning
In supervised learning, the computer is provided with example inputs
that are labeled with their desired outputs. The purpose of this method
is for the algorithm to be able to “learn” by comparing its actual
output with the “taught” outputs to find errors, and modify the model
accordingly. Supervised learning therefore uses patterns to predict
label values on additional unlabeled data.
A common use case of supervised learning is to use historical data to
predict statistically likely future events.
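To make this concrete, here is a minimal sketch of supervised learning in Python with scikit-learn (assumed available); the inputs and labels are made up for illustration.

from sklearn.linear_model import LogisticRegression

# Toy labeled data: each input (hours studied, hours slept) comes with
# its desired output label.
X = [[2, 9], [1, 5], [6, 8], [8, 6], [7, 7]]
y = [0, 0, 1, 1, 1]  # 1 = pass, 0 = fail

model = LogisticRegression()
model.fit(X, y)                 # "learn" by comparing predictions with labels
print(model.predict([[5, 7]]))  # predict a label for new, unseen input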
Unsupervised Learning
In unsupervised learning, data is unlabeled, so the learning algorithm
is left to find commonalities among its input data. As unlabeled data
are more abundant than labeled data, machine learning methods that
facilitate unsupervised learning are particularly valuable.
The goal of unsupervised learning may be as straightforward as
discovering hidden patterns within a dataset, but it may also have a
goal of feature learning, which allows the computational machine to
automatically discover the representations that are needed to classify
raw data.
Unsupervised learning is commonly used for transactional data. You
may have a large dataset of customers and their purchases, but as a
human you will likely not be able to make sense of what similar
attributes can be drawn from customer profiles and their types of
purchases. With this data fed into an unsupervised learning algorithm,
it may be determined that women of a certain age range who buy
unscented soaps are likely to be pregnant, and therefore a marketing
campaign related to pregnancy and baby products can be targeted to
this audience in order to increase their number of purchases.
Without being told a “correct” answer, unsupervised learning
methods can look at complex data that is more expansive and
seemingly unrelated in order to organize it in potentially meaningful
ways. Unsupervised learning is often used for anomaly detection,
such as spotting fraudulent credit card purchases, and for recommender
systems that suggest which products to buy next.
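As a hedged illustration of the customer-segmentation idea above, here is a minimal Python sketch using KMeans clustering from scikit-learn (assumed available); the customer data is invented.

from sklearn.cluster import KMeans

# Unlabeled data: (age, purchases per month); no "correct" answers given.
X = [[22, 3], [25, 4], [47, 1], [52, 2], [46, 1], [56, 2]]

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)  # the algorithm finds commonalities on its own
print(labels)                   # two customer segments, e.g. [1 1 0 0 0 0]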
Introduction to Bayesian
Decision Theory
Whether we are building Machine Learning models or making
decisions in everyday life, we always choose the path with the least
amount of risk. As humans, we are hardwired to take any action that
helps our survival; however, machine learning models are not initially
built with that understanding. These algorithms need to be trained and
optimized to choose the best option with the least amount of risk.
Additionally, it is important to know that some risky decisions can
lead to severe consequences if they are not correct.
Bayes’ Theorem
One of the most well-known equations in the world of statistics and
probability is Bayes’ Theorem (see formula below). The basic
intuition is that the probability of some class or event occurring,
given some feature (i.e., attribute), is calculated based on the
likelihood of the feature’s value and any prior information about the
class or event of interest. This seems like a lot to digest, so I will
break it down for you. First off, the case of cancer detection is a
two-class problem. The first class, ω₁, represents the event that a
tumor is present, and ω₂ represents the event that a tumor is not present.
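The formula itself is not reproduced in the text; written out in the notation of this section (for class ωᵢ given an observed feature x), Bayes’ Theorem reads:

P(ωᵢ | x) = p(x | ωᵢ) · P(ωᵢ) / p(x),  where  p(x) = p(x | ω₁)P(ω₁) + p(x | ω₂)P(ω₂)

Here P(ωᵢ) is the prior probability of the class, p(x | ωᵢ) is the likelihood of the feature’s value under that class, and P(ωᵢ | x) is the posterior probability we are after.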
Parametric vs Nonparametric
Methods in Machine Learning
Parametric Methods
In parametric methods, we typically make an assumption with regards
to the form of the function f. For example, you could make an
assumption that the unknown function f is linear. In other words, we
assume that the function is of the form f(X) = β₀ + β₁X₁ + … + βₚXₚ,
where f(X) is the unknown function to be estimated, β are the
coefficients to be learned, p is the number of independent variables
and X are the corresponding inputs.
Now that we have made an assumption about the form of the function
to be estimated and selected a model that aligns with this assumption,
we need a learning process that will eventually help us to train the
model and estimate the coefficients.
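As a minimal sketch of such a learning process (ordinary least squares, one common choice), assuming NumPy is available, with synthetic data:

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))            # p = 2 independent variables
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1]  # true coefficients: β = (1, 2, -3)
y = y + rng.normal(scale=0.1, size=100)  # add a little noise

Xb = np.column_stack([np.ones(100), X])  # prepend a column of ones for β₀
beta, *_ = np.linalg.lstsq(Xb, y, rcond=None)  # estimate the coefficients
print(beta)                              # approximately [1.  2. -3.]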
Linear Regression
Linear Regression is a machine learning algorithm based on
supervised learning. It performs a regression task. Regression models
a target prediction value based on independent variables. It is mostly
used for finding out the relationship between variables and
forecasting. Different regression models differ based on the kind of
relationship between the dependent and independent variables they
consider, and the number of independent variables being used.
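A brief sketch of this with scikit-learn’s LinearRegression (assumed available), on made-up data:

from sklearn.linear_model import LinearRegression

X = [[1], [2], [3], [4]]          # independent variable
y = [2.1, 4.0, 6.2, 7.9]          # target values, roughly y = 2x

reg = LinearRegression().fit(X, y)
print(reg.coef_, reg.intercept_)  # the learned relationship between variables
print(reg.predict([[5]]))         # forecasting for a new input, close to 10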
Hypothesis
Consider teaching a child to recognize animals. When we see a pig, we
shout “pig!” When it’s not a pig, we shout “no, not pig!” After doing
this several times with the child, we show them a picture and ask
“pig?” and they will correctly (most of the time) say “pig!” or “no,
not pig!” depending on what the picture is. That is supervised machine
learning.
[Image: similar data points typically exist close to each other]
Notice in the image above that most of the time, similar data points
are close to each other. The KNN algorithm hinges on this
assumption being true enough for the algorithm to be useful. KNN
captures the idea of similarity (sometimes called distance, proximity,
or closeness) with some mathematics we might have learned in our
childhood— calculating the distance between points on a graph.
Note: An understanding of how we calculate the distance between
points on a graph is necessary before moving on. If you are
unfamiliar with or need a refresher on how this
calculation is done, thoroughly read “Distance Between 2 Points” in
its entirety, and come right back.
There are other ways of calculating distance, and one way might be
preferable depending on the problem we are solving.
However, the straight-line distance (also called the Euclidean
distance) is a popular and familiar choice.
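A minimal sketch of KNN built on the Euclidean distance, in plain Python; the points and labels are invented for illustration.

import math
from collections import Counter

def euclidean(p, q):
    # Straight-line distance between two points on a graph.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def knn_predict(train, query, k=3):
    # Vote among the k training points closest to the query point.
    nearest = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

train = [((1, 1), "A"), ((1, 2), "A"), ((5, 5), "B"), ((6, 5), "B")]
print(knn_predict(train, (2, 1)))  # "A": similar points lie close together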
Random Forest
For a Random Forest to work well, the predictions from each tree must
have very low correlations with one another.
Below are some points that explain why we should use the Random
Forest algorithm:
<="" li="">
It takes less training time as compared to other algorithms.
It predicts output with high accuracy, and it runs efficiently even
on large datasets.
It can also maintain accuracy when a large proportion of data is
missing.
Step-1: Select K random data points from the training set.
Step-2: Build the decision trees associated with the selected data
points (Subsets).
Step-3: Choose the number N of decision trees that you want to
build.
Step-4: Repeat Step-1 and Step-2 until N trees are built.
Step-5: For new data points, find the predictions of each decision
tree, and assign the new data points to the category that wins the
majority vote. A minimal code sketch of these steps follows.
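This sketch uses scikit-learn (assumed available) and its bundled iris dataset:

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=100, random_state=0)  # N = 100 trees
forest.fit(X, y)              # builds each tree on a random (bootstrapped) subset
print(forest.predict(X[:3]))  # each prediction is decided by majority vote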
When making Decision Trees, there are several factors we must take
into consideration: On what features do we make our decisions?
What is the threshold for classifying each question into a yes or no
answer? In the first Decision Tree, what if we wanted to ask
ourselves whether we had friends to play with or not? If we have
friends, we will play every time. If not, we might continue to ask
ourselves questions about the weather. By adding an additional
question, we hope to better define the Yes and No classes.
An Introduction to Gradient
Boosting Decision Trees
Gradient boosting works by building simpler (weak) prediction
models sequentially where each model tries to predict the error left
over by the previous model. Because of this, the algorithm tends to
overfit rather quickly.
But what is a weak learning model? A model that does only slightly
better than random predictions.
This tutorial will take you through the concepts behind gradient
boosting and also through two practical implementations of the
algorithm:
Decision trees
A decision tree is a machine learning model that builds upon
iteratively asking questions to partition data and reach a solution. It is
the most intuitive way to zero in on a classification or label for an
object. Visually too, it resembles an upside-down tree with
protruding branches, hence the name.
For example, suppose you went hiking and saw an animal that you
couldn’t immediately recognize. You could later come home and ask
yourself a set of questions about its features, which could help you
decide exactly what species of animal you saw. A decision tree for
this problem would ask a similar sequence of questions; a minimal
sketch follows.
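Since the original image is not reproduced here, this is a hedged Python sketch with scikit-learn (assumed available); the animal features and species are invented.

from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical features (has fur, number of legs) and species labels.
X = [[1, 4], [1, 4], [0, 2], [0, 2], [1, 2]]
y = ["deer", "deer", "bird", "bird", "monkey"]

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
print(export_text(tree, feature_names=["has_fur", "legs"]))  # the question tree
print(tree.predict([[1, 4]]))  # answering the questions leads to "deer"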
WHY BOOSTING?
In boosting, weak learners are used, which perform only slightly
better than random chance.
Boosting focuses on sequentially adding up these weak learners and
filtering out the observations that a learner gets correct at every step.
Basically, the stress is on developing new weak learners to handle the
remaining difficult observations at each step.
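A minimal sketch of sequential boosting with scikit-learn’s GradientBoostingRegressor (assumed available), on synthetic data:

from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)
gbm = GradientBoostingRegressor(
    n_estimators=100,   # number of weak learners added sequentially
    max_depth=2,        # shallow trees: each only slightly better than chance
    learning_rate=0.1,  # small steps reduce the tendency to overfit quickly
)
gbm.fit(X, y)           # each new tree fits the error left by the previous ones
print(gbm.score(X, y))  # R^2 on the training data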
AdaBoost
In AdaBoost, the aim is to put stress on the instances that are
difficult to classify for every new weak learner. Further, the final
result is a weighted average of the outputs from all individual
learners, where the weights associated with the outputs are
proportional to each learner’s accuracy.
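A minimal AdaBoost sketch with scikit-learn (assumed available) on synthetic data:

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

X, y = make_classification(n_samples=300, random_state=0)
ada = AdaBoostClassifier(n_estimators=50, random_state=0)  # 50 weak learners
ada.fit(X, y)           # each learner re-weights the hard-to-classify instances
print(ada.score(X, y))  # accuracy of the weighted combination of learners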
Overfitting
Overfitting in the model can only be detected once you test the model
on unseen data. To detect the issue, we can perform a train/test split.
In the train-test split of the dataset, we can divide our dataset into
random test and training datasets. We train the model with a training
dataset which is about 80% of the total dataset. After training the
model, we test it with the test dataset, which is 20% of the total
dataset.
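A minimal sketch of this 80/20 split with scikit-learn (assumed available):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)  # 80% for training, 20% for testing

model = DecisionTreeClassifier().fit(X_train, y_train)
print(model.score(X_train, y_train))  # accuracy on the training dataset
print(model.score(X_test, y_test))    # a much lower score here signals overfitting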
Underfitting
Underfitting occurs when our machine learning model is not able to
capture the underlying trend of the data. It can happen when, to avoid
overfitting, the feeding of training data is stopped at too early a
stage, so the model does not learn enough from the training data. As a
result, it may fail to find the best fit for the dominant trend in the
data.
In the case of underfitting, the model is not able to learn enough from
the training data, and hence it reduces the accuracy and produces
unreliable predictions.
An underfitted model has high bias and low variance.
Introduction to Regularization
in Machine Learning
Regularization is the method of adding information in order to solve
an ill-posed problem or to prevent overfitting. It applies to objective
functions in ill-posed optimization problems. Often, a regression model
overfits the data it is training on. During the process of
regularization, we try to reduce the complexity of the regression
function without actually reducing the degree of the underlying
polynomial function. Regularization can be seen as a method to improve
the generalizability of a learned model. In this topic, we are going to
learn about Regularization in Machine Learning.
L2 Regularization, or Ridge Regularization, also adds a penalty to the
weights; however, the penalty here is the sum of the squared values of
the weights.
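As a hedged sketch of the effect of this penalty, comparing plain linear regression with Ridge (L2) regression in scikit-learn (assumed available), on synthetic data:

import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 10))
y = X[:, 0] + rng.normal(scale=0.5, size=30)  # only one feature truly matters

plain = LinearRegression().fit(X, y)
ridge = Ridge(alpha=5.0).fit(X, y)  # alpha scales the sum-of-squared-weights penalty

print(np.abs(plain.coef_).sum())  # total weight magnitude without regularization
print(np.abs(ridge.coef_).sum())  # shrunken weights: reduced model complexity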
Machine Learning finds applications in areas such as cybersecurity,
education, job opportunities, and search engines.
If the current state of ML is exciting, the near future of machine
learning opens up significantly more, and far more complex,
opportunities for technologists. Let us look at these one by one.
“Machine learning is the process of automatically getting insights
from data that can drive business value.”
– Lavanya Tekumalla
Gathering and preparing large volumes of data that the machine will
use to teach itself.
Feeding the data into ML models and training them to make the right
decisions through supervision and correction.
Deploying the model to make analytical predictions, or feeding it new
kinds of data to expand its capabilities.
CONCLUSION
This tutorial has introduced you to Machine Learning. Now, you
know that Machine Learning is a technique of training machines to
perform the activities a human brain can do, albeit a bit faster and
better than an average human being. Today we have seen that
machines can beat human champions in games such as chess and Go
(as AlphaGo did), which are considered very complex. You have seen
that machines can be trained to perform human activities in several
areas and can aid humans in living better lives.
Machine Learning can be Supervised or Unsupervised. If you have a
smaller amount of data that is clearly labelled for training, opt for
Supervised Learning. Unsupervised Learning would generally give
better performance and results for large data sets. If you have a huge
data set easily available, go for deep learning techniques. You have
also learned about Reinforcement Learning and Deep Reinforcement
Learning. You now know what Neural Networks are, along with their
applications and limitations.