Module 1 & 2

Machine Learning Algorithms

Machine Learning algorithms are programs that can learn hidden patterns
from data, predict outputs, and improve their performance from
experience on their own. Different algorithms suit different tasks: for
example, simple linear regression can be used for prediction problems
such as stock market prediction, while the KNN algorithm can be used
for classification problems.

In this topic, we will see an overview of some of the most popular and
commonly used machine learning algorithms, along with their use cases
and categories.

Types of Machine Learning Algorithms


Machine Learning algorithms can be broadly classified into three types:

1. Supervised Learning Algorithms


2. Unsupervised Learning Algorithms
3. Reinforcement Learning Algorithms

1) Supervised Learning Algorithm
Supervised learning is a type of machine learning in which the machine
needs external supervision to learn. Supervised learning models are
trained using a labeled dataset. Once training and processing are done,
the model is tested on sample test data to check whether it predicts
the correct output.

The goal of supervised learning is to map input data to output data.
Supervised learning is based on supervision, much as a student learns
under a teacher's supervision. An example of supervised learning is
spam filtering.

Supervised learning can be divided further into two categories of problems:

o Classification
o Regression

Examples of popular supervised learning algorithms include simple
linear regression, decision trees, logistic regression, and the KNN
algorithm.

2) Unsupervised Learning Algorithm


It is a type of machine learning in which the machine does not need any
external supervision to learn from the data, hence the name
unsupervised learning. Unsupervised models are trained on an unlabelled
dataset that is neither classified nor categorized, and the algorithm
must act on that data without any supervision. In unsupervised
learning, the model has no predefined output; it tries to find useful
insights in large amounts of data. Unsupervised learning is used to
solve association and clustering problems, so it can be further
classified into two types:

o Clustering
o Association

Examples of unsupervised learning algorithms include K-means
clustering, the Apriori algorithm, and Eclat.

3) Reinforcement Learning
In reinforcement learning, an agent interacts with its environment by
producing actions and learns with the help of feedback. The feedback is
given to the agent in the form of rewards: for each good action, it
receives a positive reward, and for each bad action, a negative reward.
No supervision is provided to the agent. The Q-learning algorithm is
commonly used in reinforcement learning.

List of Popular Machine Learning Algorithms


1. Linear Regression Algorithm
2. Logistic Regression Algorithm
3. Decision Tree
4. SVM
5. Naïve Bayes
6. KNN
7. K-Means Clustering
8. Random Forest
9. Apriori
10. PCA

1. Linear Regression
Linear regression is one of the most popular and simplest machine
learning algorithms, used for predictive analysis. Here, predictive
analysis means predicting a value, and linear regression makes
predictions for continuous numbers such as salary, age, etc.

It models the linear relationship between the dependent and independent
variables, showing how the dependent variable (y) changes according to
the independent variable (x).

It tries to fit the best line between the dependent and independent
variables, and this best-fit line is known as the regression line.

The equation for the regression line is:

y = a0 + a1*x + ε

Here, y = dependent variable

x = independent variable

a0 = intercept of the line

a1 = linear regression coefficient (slope of the line)

ε = random error

Linear regression is further divided into two types:

o Simple Linear Regression: In simple linear regression, a single
independent variable is used to predict the value of the dependent
variable.
o Multiple Linear Regression: In multiple linear regression, more
than one independent variable is used to predict the value of the
dependent variable.

A classic example is predicting a person's weight from their height, as
in the sketch below.
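The following is a minimal sketch of simple linear regression with
scikit-learn; the height/weight numbers are invented purely for
illustration:

```python
# Minimal sketch: simple linear regression predicting weight from height.
# The height/weight numbers below are made up for illustration.
import numpy as np
from sklearn.linear_model import LinearRegression

heights = np.array([[150], [160], [170], [180], [190]])  # cm, one feature
weights = np.array([50, 58, 66, 74, 82])                 # kg, target

model = LinearRegression()
model.fit(heights, weights)

print("intercept (a0):", model.intercept_)
print("slope (a1):", model.coef_[0])
print("predicted weight for 175 cm:", model.predict([[175]])[0])
```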
2. Logistic Regression
Logistic regression is a supervised learning algorithm used to
predict categorical variables or discrete values. It can be
used for classification problems in machine learning, and the output of
the logistic regression algorithm can be Yes or No, 0 or 1, Red or
Blue, etc.

Logistic regression is similar to linear regression except in how it is
used: linear regression solves regression problems and predicts
continuous values, whereas logistic regression solves classification
problems and predicts discrete values.

Instead of fitting a best-fit line, it forms an S-shaped curve that
lies between 0 and 1. This S-shaped curve is known as the logistic
function and is used together with a threshold: any value above the
threshold tends to 1, and any value below it tends to 0.
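Below is a minimal sketch of logistic regression with scikit-learn on a
tiny synthetic dataset (the values are invented for illustration):

```python
# Minimal sketch: logistic regression as a binary classifier.
# X and y form a tiny synthetic dataset for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[1], [2], [3], [4], [5], [6]])  # single feature
y = np.array([0, 0, 0, 1, 1, 1])              # binary labels

clf = LogisticRegression()
clf.fit(X, y)

# predict_proba returns the S-shaped (sigmoid) probability between 0 and 1;
# predict applies the default 0.5 threshold to it.
print(clf.predict_proba([[3.5]]))  # probabilities for class 0 and class 1
print(clf.predict([[3.5]]))        # thresholded class label
```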

3. Decision Tree Algorithm


A decision tree is a supervised learning algorithm that is mainly used
to solve classification problems but can also solve regression
problems. It works with both categorical and continuous variables. It
has a tree-like structure of nodes and branches, starting with the root
node, which expands into further branches until reaching the leaf
nodes. Internal nodes represent the features of the dataset, branches
represent the decision rules, and leaf nodes represent the outcome of
the problem.

Some real-world applications of decision tree algorithms are
distinguishing between cancerous and non-cancerous cells, suggesting
which car a customer should buy, etc.
4. Support Vector Machine Algorithm
A support vector machine, or SVM, is a supervised learning algorithm
that can be used for both classification and regression problems,
though it is primarily used for classification. The goal of SVM is to
create a hyperplane, or decision boundary, that segregates the data
into different classes.

The data points that help define the hyperplane are known as support
vectors, hence the name support vector machine.

Some real-life applications of SVM are face detection, image
classification, drug discovery, etc.
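A minimal sketch of an SVM classifier with scikit-learn follows; the
iris dataset stands in here for a real application such as image
classification:

```python
# Minimal sketch: an SVM classifier that finds a separating hyperplane.
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = datasets.load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = SVC(kernel="linear")  # a linear kernel gives a flat hyperplane
clf.fit(X_train, y_train)

print("test accuracy:", clf.score(X_test, y_test))
print("support vectors per class:", clf.n_support_)
```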

5. Naïve Bayes Algorithm:


The Naïve Bayes classifier is a supervised learning algorithm that
makes predictions based on the probability of an object. It is named
Naïve Bayes because it is based on Bayes' theorem and follows the naïve
assumption that the features are independent of each other.

Bayes' theorem is based on conditional probability: the likelihood that
event A will happen given that event B has already happened. The
equation for Bayes' theorem is:

P(A|B) = P(B|A) · P(A) / P(B)

The Naïve Bayes classifier is a simple classifier that often provides
good results. A naïve Bayesian model is easy to build and well suited
to large datasets. It is mostly used for text classification.
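Below is a minimal sketch of Naïve Bayes for text classification with
scikit-learn; the tiny spam/ham corpus is invented for illustration:

```python
# Minimal sketch: Naive Bayes for text classification (its most common use).
# The tiny corpus below is invented for illustration.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

texts = ["win a free prize now", "limited offer click here",
         "meeting at noon tomorrow", "please review the report"]
labels = ["spam", "spam", "ham", "ham"]

# CountVectorizer turns text into word counts; MultinomialNB applies
# Bayes' theorem with the naive independence assumption over words.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(texts, labels)

print(model.predict(["free prize offer"]))        # likely 'spam'
print(model.predict(["report for the meeting"]))  # likely 'ham'
```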

6. K-Nearest Neighbour (KNN)


K-Nearest Neighbour is a supervised learning algorithm that can be used
for both classification and regression problems. The algorithm works by
measuring the similarity between the new data point and the available
data points; based on these similarities, the new data point is placed
in the most similar category. It is also known as a lazy learner
algorithm because it simply stores the available dataset and classifies
each new case with the help of its K nearest neighbours. The new case
is assigned to the class with the most similar neighbours, where a
distance function measures the distance between data points. The
distance function can be Euclidean, Minkowski, Manhattan, or Hamming
distance, based on the requirement.
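A minimal KNN sketch with scikit-learn follows, using the Euclidean
distance mentioned above; the dataset and the choice of K=5 are
illustrative assumptions:

```python
# Minimal sketch: K-Nearest Neighbours with a Euclidean distance metric.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# metric could also be "manhattan" or "minkowski", as noted above
knn = KNeighborsClassifier(n_neighbors=5, metric="euclidean")
knn.fit(X_train, y_train)  # the "lazy learner" simply stores the data here

print("test accuracy:", knn.score(X_test, y_test))
```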

7. K-Means Clustering
K-means clustering is one of the simplest unsupervised learning
algorithms, used to solve clustering problems. The data points are
grouped into K different clusters based on similarities and
dissimilarities: points with the most in common stay within one
cluster, and that cluster has little or nothing in common with the
other clusters. In K-means, K refers to the number of clusters, and
means refers to averaging the data in order to find each centroid.

It is a centroid-based algorithm: each cluster is associated with a
centroid, and the algorithm aims to reduce the distance between the
data points and the centroid of their cluster.

The algorithm starts with a group of randomly selected centroids that
form the initial clusters, and then iteratively optimizes the
centroids' positions.

It can be used for spam detection and filtering, identification of fake
news, etc.
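A minimal K-means sketch with scikit-learn on synthetic data follows;
K=3 and the generated blob data are illustrative assumptions:

```python
# Minimal sketch: K-means grouping synthetic points into K=3 clusters.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=42)

kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = kmeans.fit_predict(X)  # iteratively moves centroids, then assigns points

print("centroids:\n", kmeans.cluster_centers_)
print("first ten cluster assignments:", labels[:10])
```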

8. Random Forest Algorithm


Random forest is a supervised learning algorithm that can be used for
both classification and regression problems in machine learning. It is
an ensemble learning technique that makes predictions by combining
multiple classifiers, improving the model's performance.

It builds multiple decision trees on subsets of the given dataset and
combines their results to improve the predictive accuracy of the model.
A random forest typically contains on the order of 64-128 trees; a
greater number of trees generally leads to higher accuracy.

To classify a new dataset or object, each tree gives a classification
result, and the algorithm predicts the final output based on the
majority vote.

Random forest is a fast algorithm and can deal efficiently with missing
and incorrect data.
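A minimal random forest sketch with scikit-learn follows; the dataset
and the choice of 100 trees are illustrative assumptions:

```python
# Minimal sketch: a random forest of 100 trees voting on each prediction.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# n_estimators is the number of trees; each tree sees a bootstrap
# sample of the data, and the majority vote gives the final class.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

print("test accuracy:", forest.score(X_test, y_test))
```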

9. Apriori Algorithm
The Apriori algorithm is an unsupervised learning algorithm used to
solve association problems. It uses frequent itemsets to generate
association rules and is designed to work on databases that contain
transactions. With the help of these association rules, it determines
how strongly or weakly two objects are connected to each other. The
algorithm uses a breadth-first search and a hash tree to compute the
itemsets efficiently.

The algorithm works iteratively to find the frequent itemsets in a
large dataset.

The Apriori algorithm was proposed by R. Agrawal and R. Srikant in
1994. It is mainly used for market basket analysis, where it helps
identify products that are likely to be bought together. It can also be
used in the healthcare field to find drug reactions in patients.
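Below is a minimal market-basket sketch, assuming the third-party
mlxtend library is installed (pip install mlxtend); the toy
transactions are invented, and the exact association_rules signature
can vary between mlxtend versions:

```python
# Minimal sketch of market basket analysis with the Apriori algorithm,
# assuming the third-party mlxtend library is available.
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

transactions = [["milk", "bread", "butter"],
                ["bread", "butter"],
                ["milk", "bread"],
                ["milk", "butter"]]

# One-hot encode the transactions into a boolean DataFrame.
te = TransactionEncoder()
df = pd.DataFrame(te.fit(transactions).transform(transactions),
                  columns=te.columns_)

frequent = apriori(df, min_support=0.5, use_colnames=True)
rules = association_rules(frequent, metric="confidence", min_threshold=0.6)
print(rules[["antecedents", "consequents", "support", "confidence"]])
```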

10. Principal Component Analysis

Principal Component Analysis (PCA) is an unsupervised learning
technique used for dimensionality reduction. It helps reduce the
dimensionality of a dataset that contains many features correlated with
each other. It is a statistical process that converts the observations
of correlated features into a set of linearly uncorrelated features
with the help of an orthogonal transformation. It is a popular tool for
exploratory data analysis and predictive modeling.

PCA works by considering the variance of each attribute, because high
variance indicates a good separation between classes; it keeps the
high-variance directions and thereby reduces the dimensionality.

Some real-world applications of PCA are image processing, movie
recommendation systems, and optimizing the power allocation in
communication channels.
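A minimal PCA sketch with scikit-learn follows, reducing the four iris
features to two principal components; the dataset and n_components=2
are illustrative choices:

```python
# Minimal sketch: PCA reducing the 4-feature iris data to 2 components.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)

pca = PCA(n_components=2)      # keep the two directions of highest variance
X_reduced = pca.fit_transform(X)

print("original shape:", X.shape)          # (150, 4)
print("reduced shape:", X_reduced.shape)   # (150, 2)
print("variance explained:", pca.explained_variance_ratio_)
```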

Probability and Statistics for Machine Learning

Probability and statistics are two of the most important concepts for
machine learning. Probability is about predicting the likelihood of
future events, while statistics involves analyzing the frequency of
past events.

Nowadays, machine learning has become one of the first choices for many
freshers and IT professionals. But in order to enter this field, one
must have certain skills, and one of those skills is mathematics.
Mathematics is very important for learning ML technology and developing
efficient applications for business. For machine learning, the
mathematics that matters most is probability and statistics, which are
the essential topics for getting started with ML. Probability and
statistics are considered the foundation of ML and data science, used
to develop ML algorithms and build decision-making capabilities; they
are also the primary prerequisites for learning ML.

In this topic, we will cover the basics of probability and statistics,
from the core definitions to the concepts most often used when applying
ML algorithms to business scenarios.

Probability in Machine Learning

Probability is the bedrock of ML; it tells how likely an event is to
occur. The value of a probability always lies between 0 and 1. It is a
core concept as well as a primary prerequisite for understanding ML
models and their applications.

Probability can be calculated as the number of ways the event can
occur divided by the total number of possible outcomes. Suppose we
toss a fair coin; the probability of getting heads can be calculated
with the formula below:

P(H) = number of ways heads can occur / total number of possible outcomes

P(H) = 1/2

P(H) = 0.5

Where:

P(H) = probability of getting heads as the outcome when tossing a coin.

Types of Probability
For a better understanding, probability can be further categorized into
the following types:

Empirical Probability: Empirical probability is calculated as the
number of times the event occurs divided by the total number of
incidents observed.

Theoretical Probability: Theoretical probability is calculated as the
number of ways the particular event can occur divided by the total
number of possible outcomes.

Joint Probability: Joint probability is the probability of two random
events occurring simultaneously.

P(A ∩ B) = P(A) · P(B)   (when A and B are independent)

Where;

P(A ∩ B) = Probability of occurring events A and B both.

P (A) = Probability of event A

P (B) = Probability of event B

Conditional Probability: Conditional probability is the probability of
event A given that event B has occurred.

The probability of an event A conditioned on an event B is denoted and
defined as:

P(A|B) = P(A ∩ B) / P(B)

Similarly, P(B|A) = P(A ∩ B) / P(A). We can therefore write the joint
probability of A and B as P(A ∩ B) = P(A) · P(B|A), which means: the
chance of both things happening is the chance that the first one
happens, times the chance that the second one happens given that the
first has happened. A small numeric check follows below.
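As a quick sanity check, these identities can be verified by brute
force; the two-dice events below are a hypothetical example, not from
the text:

```python
# Minimal sketch: checking the conditional-probability identities above
# by brute force over all 36 rolls of two dice.
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # all rolls of two dice

A = {o for o in outcomes if o[0] + o[1] == 7}    # event A: the sum is 7
B = {o for o in outcomes if o[0] == 4}           # event B: first die shows 4

def p(event):
    return len(event) / len(outcomes)

print("P(A ∩ B):", p(A & B))                      # 1/36
print("P(A|B) = P(A ∩ B)/P(B):", p(A & B) / p(B)) # 1/6
print("P(A) · P(B|A):", p(A) * (p(A & B) / p(A))) # equals P(A ∩ B)
```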

We now have the basic understanding of probability required for machine
learning. Next, we will cover a basic introduction to statistics for ML.
Statistics in Machine Learning
Statistics is also considered a foundation of machine learning; it
deals with finding answers to the questions that we have about data. In
general, we can define statistics as:

Statistics is the part of applied mathematics that deals with studying and developing
methods for gathering, analyzing, interpreting, and drawing conclusions from empirical
data. It can be used to make better-informed business decisions.

Statistics can be categorized into two major parts:

o Descriptive Statistics
o Inferential Statistics

Use of Statistics in ML
Statistical methods are used to understand the training data and to
interpret the results of testing different machine learning models.
Further, statistics can be used to make better-informed business and
investment decisions.

Probabilistic Models in Machine Learning

Machine learning algorithms today rely heavily on probabilistic
models, which take into consideration the uncertainty inherent in
real-world data. These models make predictions based on probability
distributions rather than absolute values, allowing for a more
nuanced and accurate understanding of complex systems. One common
approach is Bayesian inference, where prior knowledge is combined
with observed data to make predictions. Another approach is maximum
likelihood estimation, which seeks the model that best fits the
observed data.
What are Probabilistic Models?
Probabilistic models are an essential component of machine
learning, which aims to learn patterns from data and make
predictions on new, unseen data. They are statistical models that
capture the inherent uncertainty in data and incorporate it into
their predictions. Probabilistic models are used in various
applications such as image and speech recognition, natural
language processing, and recommendation systems. In recent
years, significant progress has been made in developing
probabilistic models that can handle large datasets efficiently.
Categories Of Probabilistic Models
These models can be classified into the following categories:
 Generative models
 Discriminative models
 Graphical models
Generative models:
Generative models aim to model the joint distribution of the input
and output variables. These models generate new data based on
the probability distribution of the original dataset. Generative
models are powerful because they can generate new data that
resembles the training data. They can be used for tasks such as
image and speech synthesis, language translation, and text
generation.
Discriminative models
The discriminative model aims to model the conditional
distribution of the output variable given the input variable. They
learn a decision boundary that separates the different classes of
the output variable. Discriminative models are useful when the
focus is on making accurate predictions rather than generating
new data. They can be used for tasks such as image recognition,
speech recognition, and sentiment analysis.
Graphical models
These models use graphical representations to show the
conditional dependence between variables. They are commonly
used for tasks such as image recognition, natural language
processing, and causal inference.

Naive Bayes Algorithm in Probabilistic


Models
The Naive Bayes algorithm is a widely used approach in
probabilistic models, demonstrating remarkable efficiency and
effectiveness in solving classification problems. By leveraging the
power of the Bayes theorem and making simplifying assumptions
about feature independence, the algorithm calculates the
probability of the target class given the feature set. This method
has found diverse applications across various industries, ranging
from spam filtering to medical diagnosis. Despite its simplicity,
the Naive Bayes algorithm has proven to be highly robust,
providing rapid results in a multitude of real-world problems.
Naive Bayes is a probabilistic algorithm that is used for
classification problems. It is based on the Bayes theorem of
probability and assumes that the features are conditionally
independent of each other given the class. The Naive Bayes
Algorithm is used to calculate the probability of a given sample
belonging to a particular class. This is done by calculating the
posterior probability of each class given the sample and then
selecting the class with the highest posterior probability as the
predicted class.
The algorithm works as follows (a from-scratch sketch is given after
the steps):
1. Collect a labeled dataset of samples, where each sample has a
set of features and a class label.
2. For each feature in the dataset, calculate the conditional
probability of the feature given the class. This is done by
counting the number of times the feature occurs in samples of
the class and dividing by the total number of samples in that
class.
3. Calculate the prior probability of each class by counting the
number of samples in each class and dividing by the total
number of samples in the dataset.
4. Given a new sample with a set of features, calculate the
posterior probability of each class using Bayes' theorem and
the conditional and prior probabilities calculated in steps 2
and 3.
5. Select the class with the highest posterior probability as the
predicted class for the new sample.
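The following is a minimal from-scratch sketch of these steps for
categorical features; the weather samples are invented, and the Laplace
smoothing in the prediction step is an extra assumption beyond the
listed steps (it keeps unseen feature values from zeroing out the
product):

```python
# Minimal from-scratch sketch of the steps above, for categorical features.
from collections import Counter, defaultdict

def train(samples, labels):
    # Steps 1-3: priors and per-class feature-value counts.
    priors = {c: n / len(labels) for c, n in Counter(labels).items()}
    cond = defaultdict(Counter)      # (class, feature_index) -> value counts
    for x, c in zip(samples, labels):
        for i, v in enumerate(x):
            cond[(c, i)][v] += 1
    return priors, cond, Counter(labels)

def predict(x, priors, cond, class_counts):
    # Steps 4-5: posterior for each class, then argmax.
    best, best_p = None, -1.0
    for c, prior in priors.items():
        p = prior
        for i, v in enumerate(x):    # naive independence assumption
            # +1 / +2 is Laplace smoothing (assumes two values per feature).
            p *= (cond[(c, i)][v] + 1) / (class_counts[c] + 2)
        if p > best_p:
            best, best_p = c, p
    return best

samples = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"), ("rainy", "cool")]
labels = ["no", "no", "yes", "yes"]
priors, cond, counts = train(samples, labels)
print(predict(("rainy", "mild"), priors, cond, counts))  # -> 'yes'
```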
Probabilistic Models in Deep Learning
Deep learning, a subset of machine learning, also relies on
probabilistic models. Probabilistic models are used to optimize
complex models with many parameters, such as neural networks.
By incorporating uncertainty into the model training process,
deep learning algorithms can provide higher accuracy and
generalization capabilities. One popular technique is variational
inference, which allows for efficient estimation of posterior
distributions.
Importance of Probabilistic Models
 Probabilistic models play a crucial role in the field of machine
learning, providing a framework for understanding the
underlying patterns and complexities in massive datasets.
 Probabilistic models provide a natural way to reason about the
likelihood of different outcomes and can help us understand
the underlying structure of the data.
 Probabilistic models help enable researchers and practitioners
to make informed decisions when faced with uncertainty.
 Probabilistic models allow us to perform Bayesian inference,
which is a powerful method for updating our beliefs about a
hypothesis based on new data. This can be particularly useful
in situations where we need to make decisions under
uncertainty.
Advantages Of Probabilistic Models
 Probabilistic models are an increasingly popular method in
many fields, including artificial intelligence, finance, and
healthcare.
 The main advantage of these models is their ability to take into
account uncertainty and variability in data. This allows for more
accurate predictions and decision-making, particularly in
complex and unpredictable situations.
 Probabilistic models can also provide insights into how different
factors influence outcomes and can help identify patterns and
relationships within data.
Disadvantages Of Probabilistic Models
There are also some disadvantages to using probabilistic models.
 One of the disadvantages is the potential for overfitting, where
the model is too specific to the training data and doesn’t
perform well on new data.
 Not all data fits well into a probabilistic framework, which can
limit the usefulness of these models in certain applications.
 Another challenge is that probabilistic models can be
computationally intensive and require significant resources to
develop and implement.

Decision Tree Classification Algorithm


o Decision Tree is a Supervised learning technique that can be used for
both classification and Regression problems, but mostly it is preferred for
solving Classification problems. It is a tree-structured classifier,
where internal nodes represent the features of a dataset,
branches represent the decision rules and each leaf node
represents the outcome.
o In a decision tree, there are two types of nodes: the decision
node and the leaf node. Decision nodes are used to make decisions and
have multiple branches, whereas leaf nodes are the outputs of those
decisions and do not contain any further branches.
o The decisions or tests are performed on the basis of the features of
the given dataset.
o It is a graphical representation for getting all the possible
solutions to a problem/decision based on given conditions.
o It is called a decision tree because, similar to a tree, it starts with the root
node, which expands on further branches and constructs a tree-like
structure.
o In order to build a tree, we use the CART algorithm, which stands
for Classification and Regression Tree algorithm.
o A decision tree simply asks a question and, based on the answer
(Yes/No), further splits the tree into subtrees.

Note: A decision tree can contain categorical data (YES/NO) as well as numeric
data.

Why use Decision Trees?


There are various algorithms in machine learning, so choosing the best
algorithm for the given dataset and problem is the main point to
remember while creating a machine learning model. Below are the two
main reasons for using a decision tree:

o Decision trees usually mimic human thinking while making a decision,
so they are easy to understand.
o The logic behind a decision tree is easy to follow because it has a
tree-like structure.

Decision Tree Terminologies


 Root Node: Root node is from where the decision tree starts. It represents the entire dataset,
which further gets divided into two or more homogeneous sets.

 Leaf Node: Leaf nodes are the final output nodes; the tree cannot be segregated further
after a leaf node is reached.

 Splitting: Splitting is the process of dividing the decision node/root node into sub-nodes
according to the given conditions.

 Branch/Sub Tree: A subtree formed by splitting the tree.

 Pruning: Pruning is the process of removing the unwanted branches from the tree.

 Parent/Child node: The root node of the tree is called the parent node, and other nodes are
called the child nodes.

How does the Decision Tree algorithm Work?

In a decision tree, to predict the class of a given dataset, the
algorithm starts from the root node of the tree. The algorithm compares
the value of the root attribute with the corresponding attribute of the
record (from the real dataset) and, based on the comparison, follows
the branch and jumps to the next node.

At the next node, the algorithm again compares the attribute value with
those of the sub-nodes and moves further down. It continues the process
until it reaches a leaf node of the tree. The complete process can be
better understood through the following steps:

o Step-1: Begin the tree with the root node, say S, which contains the
complete dataset.
o Step-2: Find the best attribute in the dataset using an Attribute
Selection Measure (ASM).
o Step-3: Divide S into subsets that contain the possible values of the
best attribute.
o Step-4: Generate the decision tree node that contains the best
attribute.
o Step-5: Recursively make new decision trees using the subsets of the
dataset created in Step-3. Continue this process until a stage is
reached where the nodes cannot be classified further; these final nodes
are called leaf nodes.

Example: Suppose a candidate has a job offer and wants to decide
whether to accept it or not. To solve this problem, the decision tree
starts with the root node (the Salary attribute, chosen by ASM). The
root node splits into a decision node (distance from the office) and
one leaf node based on the corresponding labels. The next decision node
splits further into one decision node (cab facility) and one leaf node.
Finally, that decision node splits into two leaf nodes (offer accepted
and offer declined). A code sketch of this example follows below.
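Below is a minimal sketch of this example using scikit-learn's CART
implementation; the candidate records and the feature encoding are
invented for illustration:

```python
# Minimal sketch of the job-offer example with scikit-learn's CART trees.
from sklearn.tree import DecisionTreeClassifier, export_text

# features: [salary_in_lakhs, distance_km, cab_facility(0/1)] -- invented data
X = [[3, 5, 0], [9, 4, 1], [8, 30, 0], [10, 25, 1], [4, 10, 1], [9, 8, 0]]
y = [0, 1, 0, 1, 0, 1]   # 1 = offer accepted, 0 = offer declined

tree = DecisionTreeClassifier(criterion="gini", random_state=0)
tree.fit(X, y)

# export_text prints the learned root node and branches as if-then rules
print(export_text(tree, feature_names=["salary", "distance", "cab"]))
print(tree.predict([[9, 6, 1]]))  # e.g. predicts 1 (accepted)
```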

Attribute Selection Measures


While implementing a decision tree, the main issue that arises is how
to select the best attribute for the root node and the sub-nodes. To
solve such problems, there is a technique called the Attribute
Selection Measure, or ASM. With this measurement, we can easily select
the best attribute for the nodes of the tree. There are two popular ASM
techniques:

o Information Gain
o Gini Index

1. Information Gain:

o Information gain is the measurement of the change in entropy after a
dataset is segmented on an attribute.
o It calculates how much information a feature provides about a class.
o According to the value of information gain, we split the node and
build the decision tree.
o A decision tree algorithm always tries to maximize information gain,
and the node/attribute with the highest information gain is split
first. It can be calculated using the formula below:

Information Gain = Entropy(S) - [(Weighted Avg) * Entropy(each feature)]

Entropy: Entropy is a metric that measures the impurity of a given
attribute. It specifies the randomness in the data. Entropy can be
calculated as:

Entropy(S) = -P(yes) log2 P(yes) - P(no) log2 P(no)

Where,

o S = the current sample set
o P(yes) = probability of yes
o P(no) = probability of no

2. Gini Index:

o The Gini index is a measure of impurity or purity used while creating
a decision tree with the CART (Classification and Regression Tree)
algorithm.
o An attribute with a low Gini index should be preferred over one with
a high Gini index.
o The CART algorithm only creates binary splits, and it uses the Gini
index to create them.
o The Gini index can be calculated using the formula below:

Gini Index = 1 - Σj (Pj)²
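Both measures can be computed directly from the formulas above; the
9-yes/5-no class counts below are an illustrative example:

```python
# Minimal sketch: computing entropy and Gini index for a binary node,
# matching the formulas above.
from math import log2

def entropy(p_yes):
    p_no = 1 - p_yes
    terms = [p * log2(p) for p in (p_yes, p_no) if p > 0]
    return -sum(terms)

def gini(p_yes):
    p_no = 1 - p_yes
    return 1 - (p_yes ** 2 + p_no ** 2)

# A node with 9 'yes' and 5 'no' samples (illustrative counts):
p = 9 / 14
print("entropy:", entropy(p))  # ~0.940
print("gini:", gini(p))        # ~0.459
```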


Pruning: Getting an Optimal Decision Tree
Pruning is the process of deleting unnecessary nodes from a tree in
order to get the optimal decision tree.

A tree that is too large increases the risk of overfitting, while a
small tree may not capture all the important features of the dataset.
Pruning is the technique that decreases the size of the learned tree
without reducing accuracy. There are two main types of tree pruning
techniques (a pruning sketch follows the list):

o Cost Complexity Pruning
o Reduced Error Pruning
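As a sketch, scikit-learn exposes cost complexity pruning through the
ccp_alpha parameter of DecisionTreeClassifier; the dataset and the
alpha value below are illustrative assumptions:

```python
# Minimal sketch: cost complexity pruning via scikit-learn's ccp_alpha;
# a larger alpha prunes more aggressively.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

unpruned = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
pruned = DecisionTreeClassifier(random_state=0, ccp_alpha=0.01).fit(X_train, y_train)

print("unpruned leaves:", unpruned.get_n_leaves(),
      "accuracy:", unpruned.score(X_test, y_test))
print("pruned leaves:  ", pruned.get_n_leaves(),
      "accuracy:", pruned.score(X_test, y_test))
```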

Advantages of the Decision Tree


o It is simple to understand, as it follows the same process a human
follows while making a decision in real life.
o It can be very useful for solving decision-related problems.
o It helps us think through all the possible outcomes of a problem.
o It requires less data cleaning compared to other algorithms.

Disadvantages of the Decision Tree


o The decision tree contains lots of layers, which makes it complex.
o It may have an overfitting issue, which can be resolved using the Random
Forest algorithm.
o For more class labels, the computational complexity of the decision tree
may increase.

MODULE 2 SUPERVISED LEARNING

Introduction to Linear Models


The linear model is one of the simplest models in machine learning. It assumes that the
data is linearly separable and tries to learn a weight for each feature. Mathematically,
it can be written as Y = W^T X, where X is the feature matrix, Y is the target variable,
and W is the learned weight vector. For classification problems, we apply a
transformation function or a threshold to convert the continuous-valued variable Y into
a discrete category. A NumPy sketch of this model appears after this introduction.
Here we will briefly learn linear and logistic regression, which are
the regression and classification task models, respectively.

Linear models in machine learning are easy to implement and interpret and are helpful in
solving many real-life use cases.
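Below is a minimal NumPy sketch of Y = W^T X with a classification
threshold; the weights here are chosen by hand rather than learned,
purely for illustration:

```python
# Minimal sketch of the linear model Y = W^T X in plain NumPy, with a
# threshold turning the continuous output into a class label.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))        # 5 samples, 3 features
W = np.array([0.5, -1.0, 2.0])     # weight vector (here: chosen by hand)

Y = X @ W                          # continuous-valued output, one per sample
labels = (Y > 0).astype(int)       # threshold at 0 for binary classification

print("continuous Y:", Y)
print("class labels:", labels)
```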

Types of Linear Models


Among many linear models, this article will cover linear regression and logistic regression.

Linear Regression

Linear regression is a statistical approach that predicts the value of a response
variable from one or more influencing factors. It attempts to represent the linear
connection between the features (independent variables) and the target (dependent
variable). A cost function enables us to find the best possible values for the model
parameters. A detailed discussion of linear regression is presented in a different
article.

Example: An analyst would be interested in seeing how market movement influences the
price of ExxonMobil (XOM). The value of the S&P 500 index will be the independent
variable, or predictor, in this example, while the price of XOM will be the dependent
variable. In reality, various elements influence an event's result. Hence, we usually have
many independent features.

Logistic Regression

Logistic regression is an extension of linear regression. The sigmoid function first
transforms the linear regression output into a value between 0 and 1, interpreted as a
probability. A predefined threshold then converts this probability into a class: values
above the threshold tend towards class 1, whereas values below it tend towards class 0.
A separate article dives deeper into the mathematics behind the logistic regression
model.

Example: A bank wants to predict if a customer will default on their loan based on their
credit score and income. The independent variables would be credit score and income, while
the dependent variable would be whether the customer defaults (1) or not (0).

Applications of Linear Models


Several real-life scenarios follow linear relations between dependent and independent
variables. Some of the examples are:

 The relationship between the boiling point of water and change in altitude.
 The relationship between spending on advertising and the revenue of an organization.
 The relationship between the amount of fertilizer used and crop yields.
 The relationship between athletes' training regimens and their performance.
