
Model Development

Dr. Vinay Chopra


Model development

 Model development is an iterative process, in which many models are derived, tested and
built upon until a model fitting the desired criteria is built.
 The first step toward model creation involves selecting the appropriate algorithm(s).
 These algorithms rely on prepared data to create and train the model.
 There are hundreds of machine learning algorithms that data scientists can access, and
new ones emerge every day.
 In producing a functional business tool, the correct algorithm and machine learning
problem must be in alignment.
 In this phase, the data science team needs to develop data sets for training, testing, and
production purposes.
 These data sets enable the data scientist to develop an analytical method and train it, while
holding aside some of the data for testing the model.
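As a minimal sketch of this step (the DataFrame, column names, and split ratios below are assumptions for illustration, not from the slides), scikit-learn's train_test_split can carve the prepared data into training, test, and hold-out sets:

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical prepared data: two features and one target column.
rng = np.random.default_rng(42)
df = pd.DataFrame({"x1": rng.normal(size=100), "x2": rng.normal(size=100)})
df["target"] = 3 * df["x1"] - 2 * df["x2"] + rng.normal(scale=0.5, size=100)

X, y = df[["x1", "x2"]], df["target"]

# Hold back 20% of the rows as a production-like hold-out set,
# then split the remainder into training and test sets.
X_rest, X_holdout, y_rest, y_holdout = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X_rest, y_rest, test_size=0.25, random_state=0)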
Simple regression
Regression is the analysis of the relation between one variable and some other
variable(s), assuming a linear relation.
Also referred to as least squares regression and ordinary least squares (OLS)
a) The purpose is to explain the variation in a variable (that is, how a variable
differs from its mean value) using the variation in one or more other variables.
b) Suppose we want to describe, explain, or predict why a variable differs from
its mean.
Let the ith observation on this variable be represented as Yi, and let n indicate the
number of observations.
Introduction
 Regression analysis is used when you want to predict a continuous dependent variable from a
number of independent variables.
 The independent variables used in regression can be either continuous or dichotomous.
Independent variables with more than two levels can also be used in regression analyses, but
they first must be converted into variables that have only two levels.
 One point to keep in mind with regression analysis is that causal relationships among the
variables cannot be determined.
 While the terminology is such that we say that X "predicts" Y, we cannot say that X "causes"
Y.
Assumptions of regression
 Number of cases
a) When doing regression, the cases-to-Independent Variables (IVs) ratio should
ideally be 20:1; that is 20 cases for every IV in the model.
b) The lowest your ratio should be is 5:1 (i.e., 5 cases for every IV in the model).
 Accuracy of data
a) If you have entered the data (rather than using an established dataset), it is a
good idea to check the accuracy of the data entry.
b) If you don't want to re-check each data point, you should at least check the
minimum and maximum value for each variable to ensure that all values for
each variable are "valid."
c) For example, a variable that is measured using a 1 to 5 scale should not have
a value of 8.
Assumptions of regression
 Missing data
a) You also want to look for missing data. If specific variables have a lot of missing
values, you may decide not to include those variables in your analyses.
b) If only a few cases have any missing values, then you might want to delete those
cases.
c) If there are missing values for several cases on different variables, then you
probably don't want to delete those cases (because a lot of your data will be lost).
d) If there is not too much missing data, and there does not seem to be any pattern
in terms of what is missing, then you don't really need to worry.
e) Just run your regression, and any cases that do not have values for the variables
used in that regression will not be included.
Assumptions of regression
 Outliers
a) You also need to check your data for outliers (i.e., an extreme value on a particular
item)
b) An outlier is often operationally defined as a value that is at least 3 standard
deviations above or below the mean.
c) If you feel that the cases that produced the outliers are not part of the same
"population" as the other cases, then you might just want to delete those cases.
 Normality
a) You also want to check that your data is normally distributed.
b) To do this, you can construct histograms and "look" at the data to see its
distribution.
c) Often the histogram will include a line that depicts what the shape would look like
if the distribution were truly normal (and you can "eyeball" how much the actual
distribution deviates from this line).
Simple linear regression

 Simple linear regression is used to estimate the relationship between two quantitative variables.
 You can use simple linear regression when you want to know:
a) How strong the relationship is between two variables (e.g., the
relationship between rainfall and soil erosion).
b) The value of the dependent variable at a certain value of the
independent variable (e.g., the amount of soil erosion at a certain
level of rainfall).
Regression models

 Regression models describe the relationship between variables by fitting a line to the observed data.
 Linear regression models use a straight line, while logistic and
nonlinear regression models use a curved line.
 Regression allows you to estimate how a dependent variable
changes as the independent variable(s) change.
Simple linear regression example

 You are a social researcher interested in the relationship between income and happiness.
 You survey 500 people whose incomes range from 15k to 75k
and ask them to rank their happiness on a scale from 1 to 10.
 Your independent variable (income) and dependent variable
(happiness) are both quantitative, so you can do a regression
analysis to see if there is a linear relationship between them.
 If you have more than one independent variable, use
multiple linear regression instead.
Assumptions of simple linear regression

Simple linear regression is a parametric test, meaning that it makes certain assumptions about the data. These assumptions are:
1) Homogeneity of variance (homoscedasticity): the size of the
error in our prediction doesn’t change significantly across the
values of the independent variable.
2) Independence of observations: the observations in the dataset
were collected using statistically valid sampling methods, and
there are no hidden relationships among observations.
3) Normality: The data follows a normal distribution.
Assumptions of simple linear regression

Linear regression makes one additional assumption:


4) The relationship between the independent and dependent variable
is linear:
the line of best fit through the data points is a straight line (rather
than a curve or some sort of grouping factor).
How to perform a simple linear regression

 The formula for a simple linear regression is:
   y = B0 + B1x + ε
a) y is the predicted value of the dependent variable (y) for any given value of the
independent variable (x).
b) B0 is the intercept, the predicted value of y when x is 0.
c) B1 is the regression coefficient – how much we expect y to change as x increases.
d) x is the independent variable (the variable we expect is influencing y).
e) ε is the error of the estimate, or how much variation there is in our estimate of the
regression coefficient.
How to perform a simple linear
regression
 Linear regression finds the line of best fit through your data
by searching for the regression coefficient (B1) that minimizes the
total error (e) of the model.
 While you can perform a linear regression by hand, this is a
tedious process, so most people use statistical programs to help
them quickly analyze the data.
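As a minimal sketch (the income and happiness numbers below are invented for illustration), fitting y = B0 + B1x + ε with scikit-learn looks like this:

import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data: income in thousands and a happiness score from 1 to 10.
income = np.array([[15], [25], [35], [45], [55], [65], [75]])   # independent variable x
happiness = np.array([2.5, 3.8, 4.9, 6.1, 6.8, 7.9, 8.8])       # dependent variable y

model = LinearRegression().fit(income, happiness)   # least squares estimates of B0 and B1
print("intercept B0:", model.intercept_)
print("slope B1:", model.coef_[0])
print("predicted happiness at income 50:", model.predict([[50]])[0])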
Kinds of Linear Regression Models:
 Simple Linear Regression: A linear regression model with one independent and one dependent variable.
 Multiple Linear Regression: A linear regression model with more than one independent variable and one dependent variable.
Assumptions of Linear Regression

a) Sample size: ideally at least 20 cases per independent variable.
b) Heteroscedasticity is absent (the error variance is constant).
c) Linear relationships exist between the variables.
d) Independent sample observations.
e) No multicollinearity or auto-correlation.
Polynomial Regression

 It is a type of regression analysis that models the relationship between the dependent
variable "y" and the independent variable "x" as non-linear.
 It is a special case of Multiple Linear Regression even though it fits a non-linear curve to the data.
 This is because the data may be correlated, but the relationship between the two variables might not look linear.
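A minimal sketch, assuming scikit-learn and invented data with a curvilinear pattern: the feature is expanded into polynomial terms and an ordinary linear model is then fit to those terms, which is why polynomial regression remains a special case of multiple linear regression.

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical data following a roughly quadratic trend.
x = np.linspace(0, 10, 30).reshape(-1, 1)
y = 2 + 0.5 * x.ravel() ** 2 + np.random.default_rng(0).normal(scale=2, size=30)

x_poly = PolynomialFeatures(degree=2).fit_transform(x)   # columns [1, x, x^2]
model = LinearRegression().fit(x_poly, y)                # still a linear model in its parameters
print(model.coef_)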
Logistic Regression

 Logistic Regression is a method that was first used in the field of Biology in the 20th century.
 It is used to estimate the probability of mutually exclusive events, for example,
happy/sad, normal/abnormal, or pass/fail.
 The value of probability strictly ranges between 0 and 1.
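A short sketch with scikit-learn, using invented pass/fail data (names and numbers are assumptions for illustration): the model outputs a probability between 0 and 1 for each of the two mutually exclusive outcomes.

import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical data: hours studied vs. pass (1) / fail (0).
hours = np.array([[1], [2], [3], [4], [5], [6], [7], [8]])
passed = np.array([0, 0, 0, 1, 0, 1, 1, 1])

clf = LogisticRegression().fit(hours, passed)
print(clf.predict_proba([[4.5]]))   # [P(fail), P(pass)], both in [0, 1]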
Quantile Regression

 Quantile Regression is an econometric technique that is used when the necessary
conditions for using Linear Regression are not met.
 It is an extension of Linear Regression analysis; for example, we can use it when outliers
are present in the data, as its estimates are more robust against outliers than those of
linear regression.
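A hedged sketch of median (0.5 quantile) regression using statsmodels' quantreg on invented data with a few injected outliers; the variable names and data are assumptions for illustration.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
df = pd.DataFrame({"x": np.arange(50, dtype=float)})
df["y"] = 1.0 + 0.8 * df["x"] + rng.normal(scale=2.0, size=50)
df.loc[45:, "y"] += 40   # a few outliers

# Median regression is less sensitive to the outliers than ordinary least squares.
median_fit = smf.quantreg("y ~ x", df).fit(q=0.5)
print(median_fit.params)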
Ridge Regression

To understand Ridge Regression we first need to go through the concept of Regularization.
 Regularization: There are two types of regularization, L1 regularization and L2
regularization. L1 regularization adds an L1 penalty equal to the absolute value of the
coefficients to restrict their size, which can lead to the removal of some coefficients.
L2 regularization, on the other hand, adds a penalty equal to the square of the coefficients.
 Ridge Regression is linear regression with an L2 penalty added. Used in this way,
regularization addresses overfitting, the scenario where the model performs well on
training data but underperforms on validation data.
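A brief sketch of Ridge regression with scikit-learn on invented data; alpha (an assumption here) controls the strength of the L2 penalty.

import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))   # hypothetical features
y = X @ np.array([3.0, 0.0, -2.0, 0.5, 0.0]) + rng.normal(scale=0.3, size=100)

ridge = Ridge(alpha=1.0).fit(X, y)   # the L2 penalty shrinks the coefficients toward zero
print(ridge.coef_)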
Lasso Regression

 LASSO (Least Absolute Shrinkage and Selection Operator) is a regression technique
that was first introduced in geophysics.
 The term "Lasso" was coined by Professor Robert Tibshirani. Just like Ridge Regression,
it uses regularization to estimate the results.
 In addition, it performs variable selection (shrinking some coefficients exactly to zero),
which makes the model more efficient.
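A matching sketch of Lasso with scikit-learn on the same kind of invented data; the L1 penalty drives some coefficients to exactly zero, which is the variable selection effect mentioned above.

import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))   # hypothetical features
y = X @ np.array([3.0, 0.0, -2.0, 0.5, 0.0]) + rng.normal(scale=0.3, size=100)

lasso = Lasso(alpha=0.1).fit(X, y)
print(lasso.coef_)   # coefficients that are exactly zero have been selected out of the model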
APPLICATION OF REGRESSION ANALYSIS

 Forecasting:
a) Different types of regression analysis can be used to forecast future
opportunities and threats for a business.
b) For instance, a customer’s likely purchase volume can be predicted using a
demand analysis.
c) However, when it comes to business, demand isn’t the only variable that
affects profitability.
 Comparison with competition:
a) A company’s financial performance can be compared to that of a specific
competitor using this tool.
b) Also, it can be used to determine the correlation between the stock prices of
two different companies within the same industry or different industries.
APPLICATION OF REGRESSION ANALYSIS

 When compared to a rival company, it can help identify which factors are
influencing its sales. It can help small businesses achieve rapid success in a
short term.
 Problem Identification:
a) In addition to providing factual evidence, a regression can be used to identify
and correct judgment errors.
b) For example, a retail shop owner may believe that extending the hours of
operation will result in a significant increase in sales.
c) However, regression analysis shows that the monetary gains as a result of
increasing the working hours are not enough to offset the increase in
operational costs that comes along with it.
Regression analysis may provide the business owners with quantitative support
for their decisions and prevent them from making mistakes because of their
intuition.
 Decision Making:
a) Regression analysis (and other types of statistical analysis) are now being
used by many businesses and their top executives to make better business
decisions and reduce guesswork and intuition.
b) Scientific management is made possible by regression. Data overload is a
problem for both small and large organizations.
c) To make the best decisions possible, managers can use regression analysis to
sort through data and select relevant factors.
Simple Linear Regression

 Simple linear regression is when you want to predict values of one variable, given values of another
variable.
 For example, you might want to predict a person's height (in inches) from his weight (in pounds).
Imagine a sample of ten people for whom you know their height and weight.
 You could plot the values on a graph, with weight on the x axis and height on the y axis. If there were
a perfect linear relationship between height and weight, then all 10 points on the graph would fit on a
straight line.
 But this is never the case (unless your data are rigged). If there is an imperfect linear relationship
between height and weight (presumably a positive one), then you would get a cluster of points on the
graph which slopes upward.
 In other words, people who weigh more should tend to be taller than people who weigh less.
 The purpose of regression analysis is to
come up with an equation of a line that
fits through that cluster of points with
the minimal amount of deviations from
the line.
 The deviation of the points from the line
is called "error." Once you have this
regression equation, if you knew a
person's weight, you could then predict
their height.
 Simple linear regression is actually the
same as a bivariate correlation between
the independent and dependent
variable.
Simple Regression

 The least squares principle is that the regression line is determined by minimizing
the sum of the squares of the vertical distances between the actual Y values and
the predicted values of Y.

 A line is fit through the XY points such that the sum of the squared residuals (that is,
the sum of the squared vertical distances between the observations and the line)
is minimized.
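In symbols, the least squares principle chooses the intercept b0 and slope b1 that minimize the sum of squared vertical deviations (standard notation, stated here for reference):

\min_{b_0, b_1} \; \sum_{i=1}^{n} \bigl( Y_i - (b_0 + b_1 X_i) \bigr)^2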
The variables in a regression relation consist of dependent and independent variables.
a) The dependent variable is the variable whose variation is being explained by the other
variable(s). Also referred to as the explained variable, the endogenous variable, or the
predicted variable.
b) The independent variable is the variable whose variation is used to explain that of the
dependent variable. Also referred to as the explanatory variable, the exogenous variable, or
the predicting variable
Model Evaluation using Visualization
 Building machine learning models is often an exploratory and iterative process.
 Model building begins with a hypothesis about the underlying data, followed by
the construction and evaluation of a few models.
 Based on the results, the models are then iteratively refined.
 A data scientist might have to train hundreds of models for a single project,
tweaking hyperparameters and feature sets in order to find a model that meets
certain criteria.
 Despite the iterative nature of model building, there is currently no easy way to
manage all of the models built over time.
 Data scientists often resort to ad-hoc methods for organizing models, or end up
wasting valuable resources to regenerate old results.
 The issue here is one of model management, which is the tracking, storing and
indexing of large numbers of machine learning models so they may subsequently
be shared, queried, and analyzed.
Model Evaluation using Visualization

 An effective system for model management provides various benefits on top of the basic improvements to organization.
 Such a system can provide an overview of previously built models,
allowing users to gain valuable insights about how to make
improvements.
 The ability to compare across models and visualize trends also
helps with the process of sense making, which involves getting a
better understanding of the underlying phenomenon being studied.
 Finally, model management promotes and facilitates collaboration,
so that teammates can easily build on top of each other’s work.
ModelDB

 ModelDB is a novel, end-to-end system for managing machine learning models that aims to solve the problem of model management.
 The system automatically tracks models in their native machine
learning environments, intelligently indexes them by extracting and
storing relevant metadata, and provides a graphical user interface for
easy querying and visualization.
 ModelDB is able to manage the entire workflow, which includes all of
the steps from data preprocessing to training and testing.
Residual Plot
Residual Value
 A residual value is a measure of
how much a regression line
vertically misses a data point.
 Regression lines are the best fit of a
set of data.
 You can think of the lines as
averages; a few data points will fit
the line and others will miss.
 A residual plot has the
Residual Values on the vertical
axis; the horizontal axis displays
the independent variable.
Residual Plot

 A residual plot is typically used to find problems with regression.
 Some data sets are not good candidates for regression, including:
a) Heteroscedastic data (points at widely varying distances from the line).
b) Data that is non-linearly associated.
c) Data sets with outliers.
 These problems are more easily seen with a residual plot than by looking at a plot of the original data set.
 Ideally, residual values should be equally and randomly spaced around the horizontal axis.
 If your plot looks like any of the following images, then your data set is
probably not a good fit for regression.
a) The residual plot itself doesn’t have a predictive value (it isn’t a regression line), so if you
look at your plot of residuals and you can predict residual values that aren’t showing, that’s
a sign you need to rethink your model.
b) For example, in the image above, the quadratic function enables you to predict where other
data points might fall. For residual plots, that’s not a good thing.
c) If your plot indicates a problem, there can be several reasons why regression isn't suitable.
It doesn't always mean throwing out your model completely; it could be something simple, like:
d) Missing higher-order variable terms that explain a non-linear pattern.
e) Missing interaction between terms in your existing model.
f) Missing variables.
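As a small sketch (assuming matplotlib and scikit-learn, with invented data), a residual plot puts the residuals on the vertical axis against the independent variable on the horizontal axis:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50).reshape(-1, 1)
y = 2 + 1.5 * x.ravel() + rng.normal(scale=1.0, size=50)   # hypothetical data

model = LinearRegression().fit(x, y)
residuals = y - model.predict(x)   # how much the regression line vertically misses each point

plt.scatter(x, residuals)
plt.axhline(0, color="red")        # residuals should scatter randomly around this line
plt.xlabel("independent variable")
plt.ylabel("residual")
plt.show()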
Distribution plot

 The distribution plot is suitable for comparing range and distribution for groups of numerical data. Data is plotted as value points along an axis.
 You can choose to display only the value points to see the distribution of values, a bounding box to see the range of values, or a combination of both as shown here:
Distribution plot

When to use it
 The distribution plot is suitable for comparing range and distribution for
groups of numerical data.
Advantages
 The distribution plot visualizes the distribution of data.
Disadvantages
 The distribution plot is not relevant for detailed analysis of the data as it
deals with a summary of the data distribution.
Creating a distribution plot
 You can create a distribution plot on the sheet you are editing.
 In a distribution plot you need to use one or two dimensions, and one measure. If you
use a single dimension you will receive a single line visualization. If you use two
dimensions, you will get one line for each value of the second, or outer, dimension.
Do the following:
a) From the assets panel, drag an empty distribution plot to the sheet.
b) Add the first dimension.
c) This is the inner dimension, which defines the value points.
d) Add a second dimension.
e) This is the outer dimension, which defines the groups of value points shown on the
dimension axis.
f) Click Add measure and create a measure from a field.
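Outside of Qlik Sense, a roughly comparable plot can be sketched in Python with seaborn (purely an illustrative assumption, not part of the Qlik workflow): one measure plotted as value points, grouped by an outer dimension.

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Hypothetical data: a numerical measure grouped by an outer dimension.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "group": np.repeat(["A", "B", "C"], 100),
    "value": np.concatenate([rng.normal(0, 1, 100), rng.normal(2, 1.5, 100), rng.normal(-1, 0.5, 100)]),
})

sns.stripplot(data=df, x="group", y="value")   # value points along an axis, one strip per group
plt.show()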
Viewing the distribution of measure values in a dimension with a distribution plot
 This example shows how to make a
distribution plot to view the distribution of
measure values in a dimension, using
weather data as an example.
 Dataset
 In this example, we'll use the following
weather data.
a) Location: Sweden > Gällivare Airport
b) Date range: all data from 2010 to 2017
c) Measurement: Average of the 24 hourly
temperature observations in degrees
Celsius
d) The dataset that is loaded contains a daily
average temperature measurement from a
weather station in the north of Sweden
during the time period of 2010 to 2017.
Measure

 We use the average temperature measurement in the dataset as the measure, by creating a measure in Master
items with the name Temperature degrees Celsius, and the expression Avg([Average of the 24 hourly
temperature observations in degrees Celsius]).

Visualization

We add a distribution plot to the sheet and set the following data properties:

a) Dimension: Date (date) and Year (year). The order is important; Date needs to be the first dimension.

b) Measure: Temperature degrees Celsius, the measure that was created as a master item.

c) Distribution plot with the dimensions Date (date) and Year (year) and the measure Temperature degrees
Celsius.
Discovery

 The distribution plot visualizes the distribution of the daily temperature measurements.
 The visualization is sorted by year, and each point represents a
temperature measurement.
 In the visualization we can see that the year 2012 has the lowest
extreme temperature measurement, close to -40 degrees Celsius.
 We can also see that the year 2016 seems to have the largest
distribution of temperature measurements.
 With this many points in the distribution plot, it can be hard to
spot clusters and outliers, but the year 2017 has two low
temperature measurements that stand out.
 You can hover the mouse pointer over a point and view the
details.
Polynomial Regression Models

 A model is said to be linear when it is linear in its parameters. So second-order polynomial models such as
   y = β0 + β1x + β2x² + ε   (one variable)
   y = β0 + β1x1 + β2x2 + β11x1² + β22x2² + β12x1x2 + ε   (two variables)
are also linear models.
a) In fact, they are the second-order polynomials in one and two variables respectively.
b) The polynomial models can be used in those situations where the relationship
between study and explanatory variables is curvilinear.
c) Sometimes a nonlinear relationship in a small range of explanatory variable
can also be modelled by polynomials.
Polynomial Model in one variable
Extrapolation: the action of estimating or concluding something by assuming
that existing trends will continue or a current method will remain applicable.
Data Pipeline

 A data science pipeline is the set of processes that convert raw data into
actionable answers to business questions. Data science pipelines automate
the flow of data from source to destination, ultimately providing you insights
for making business decisions.
Benefits

 Data science pipelines automate the processes of data validation; extract, transform, load (ETL);
machine learning and modeling; revision; and output, such as to a data warehouse or
visualization platform. As a type of data pipeline, data science pipelines eliminate many of the manual,
error-prone processes involved in transporting data between locations, which can result in data
latency and bottlenecks.
 The benefits of a modern data science pipeline to your business:
 Easier access to insights, as raw data is quickly and easily adjusted, analyzed, and modeled
based on machine learning algorithms, then output as meaningful, actionable information
 Faster decision-making, as data is extracted and processed in real time, giving you up-to-date
information to leverage
 Agility to meet peaks in demand, as modern data science pipelines offer instant elasticity via
the cloud
DATA SCIENCE PIPELINE FLOW

 Generally, the primary processes of a data science pipeline are:


 Data engineering (including collection, cleansing, and preparation)
 Machine learning (model learning and model validation)
 Output (model deployment and data visualization)

But the first step in deploying a data science pipeline is identifying the business
problem you need the data to address and the data science workflow.
 Formulate questions you need answers to — that will direct the machine
learning and other algorithms to provide solutions you can use.
 Once that’s done, the steps for a data science pipeline are:
 Data collection, including the identification of data sources and extraction of
data from sources into usable formats
 Data preparation, which may include ETL
 Data modeling and model validation, in which machine learning is used to find
patterns and apply rules to the data via algorithms and then tested on sample
data
 Model deployment, applying the model to the existing and new data
 Reviewing and updating the model based on changing business requirements
CHARACTERISTICS OF A DATA SCIENCE PIPELINE
 A robust end-to-end data science pipeline can source, collect, manage, analyze,
model, and effectively transform data to discover opportunities and deliver cost-
saving business processes. Modern data science pipelines make extracting
information from the data you collect fast and accessible.
 To do this, the best data science pipelines have:
 Continuous, extensible data processing
 Cloud-enabled elasticity and agility
 Independent, isolated data processing resources
 Widespread data access and the ability to self-serve
 High availability and disaster recovery
 These characteristics enable organizations to leverage their data quickly,
accurately, and efficiently to make quicker and better business decisions.
BENEFITS OF A CLOUD PLATFORM FOR
DATA SCIENCE PIPELINES
A modern cloud data platform can satisfy the entire data lifecycle of a data science pipeline, including
machine learning, artificial intelligence, and predictive application development.
A cloud data platform provides:
a) Simplicity, making managing multiple compute platforms and constantly maintaining integrations
unnecessary
b) Security, with one copy of data securely stored in the data warehouse environment and with user
credentials carefully managed and all transmissions encrypted
c) Performance, as query results are cached and can be used repeatedly during the machine learning
process, as well as for analytics
d) Workload isolation with dedicated compute resources for each user and workload
e) Elasticity, with scale-up capacity to accommodate large data processing tasks happening in
seconds
f) Support for structured and semi-structured data, making it easy to load, integrate, and analyze all
types of data inside a unified repository
g) Concurrency, as massive workloads run across shared data at scale
Evaluation Metrics in Machine Learning

 Evaluation is always good in any field, right? In the case of machine learning,
it is best practice. In this section, we will cover almost all of the popular as well as
commonly used metrics for machine learning.
Classification Metrics

 In a classification task, our main task is to predict the target variable which is in the
form of discrete values. To evaluate the performance of such a model there are metrics
as mentioned below:
• Classification Accuracy
• Logarithmic loss
• Area under Curve
• F1 score
• Precision
• Recall
• Confusion Matrix
Classification Accuracy

 Classification accuracy is the accuracy we generally mean whenever we use
the term accuracy. We calculate it as the ratio of correct predictions to the total
number of input samples.

It works well only if there are an equal number of samples for each class. For
example, suppose we have 90% samples of class A and 10% samples of class B in our
training set.
Then our model can reach an accuracy of 90% simply by predicting class A for every
training sample.
If we test the same model with a test set of 60% from class A and 40% from class
B, the accuracy will fall to 60%.
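As a small sketch with scikit-learn (labels invented for illustration), classification accuracy is simply the fraction of predictions that match the true labels:

from sklearn.metrics import accuracy_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]   # hypothetical true labels
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 1]   # hypothetical model predictions

print(accuracy_score(y_true, y_pred))     # 8 correct out of 10 samples = 0.8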
Logarithmic Loss

 It is also known as Log Loss. Its basic working principle is to penalize false classifications.
 It usually works well with multi-class classification. To work with Log Loss, the
classifier must assign a probability to each class for all the samples.
 If there are N samples belonging to M classes, then we calculate the Log Loss in this way:
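In standard form, Log Loss = -(1/N) Σ_i Σ_j y_ij · log(p_ij), where y_ij is 1 if sample i belongs to class j and p_ij is the predicted probability of that class. A hedged sketch with scikit-learn's log_loss on invented values:

from sklearn.metrics import log_loss

# Hypothetical three-class problem: true labels and predicted class probabilities.
y_true = [0, 2, 1, 2]
y_prob = [[0.8, 0.1, 0.1],
          [0.2, 0.2, 0.6],
          [0.3, 0.5, 0.2],
          [0.1, 0.2, 0.7]]

print(log_loss(y_true, y_prob))   # lower is better; confident wrong predictions are penalized heavily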
Area Under Curve(AUC)

 It is one of the widely used metrics and is mainly used for binary classification. The
AUC of a classifier is defined as the probability that the classifier will rank a randomly
chosen positive example higher than a randomly chosen negative example. Before going
into AUC further, let me make you comfortable with a few basic terms.
 True positive rate:
Also called or termed sensitivity. True Positive Rate is considered as a portion of positive
data points that are correctly considered as positive, with respect to all data points that are
positive.
 True Negative Rate
 Also called or termed specificity. True Negative Rate is considered as the portion of
negative data points that are correctly classified as negative, with respect to all data
points that are negative.
 False Positive Rate
 False Positive Rate is considered as the portion of negative data points that are
mistakenly classified as positive, with respect to all data points that are negative.

a) False Positive Rate and True Positive Rate both have values in the range [0, 1].
b) Now the question is, what is AUC? The ROC curve is plotted with the False Positive Rate
against the True Positive Rate at all different classification thresholds, and AUC is the area
under this curve, with a range of [0, 1].
c) The greater the value of AUC, the better the performance of the model.
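A minimal sketch with scikit-learn (labels and scores invented): roc_auc_score computes the area under the ROC curve from true labels and predicted probabilities of the positive class.

from sklearn.metrics import roc_auc_score

y_true = [0, 0, 1, 1, 0, 1, 1, 0]                      # hypothetical binary labels
y_score = [0.1, 0.35, 0.8, 0.65, 0.2, 0.9, 0.55, 0.4]  # hypothetical predicted probabilities of class 1

print(roc_auc_score(y_true, y_score))                  # value in [0, 1]; higher means better ranking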
 F1 Score
 It is the harmonic mean of recall and precision. Its range is [0, 1]. This metric
usually tells us how precise (how many instances it correctly classifies) and robust
(it does not miss any significant number of instances) our classifier is.
 Precision
 There is another metric named Precision. Precision is a measure of a model’s
performance that tells you how many of the positive predictions made by the
model are actually correct. It is calculated as the number of true positive
predictions divided by the number of true positive and false positive
predictions.
 Recall
 Recall is a measure of how many of the actual positive instances the model correctly
identifies. It is calculated as the number of true positive predictions divided by the
number of true positive and false negative predictions.
 Higher precision with lower recall means the classifier is very precise but misses a
large number of positive instances. The higher the F1 score, the better the balance
between the two. F1 can be expressed mathematically in this way:
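In standard form, F1 = 2 · (precision · recall) / (precision + recall), the harmonic mean mentioned above. A minimal sketch of the three metrics with scikit-learn on invented labels:

from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]   # hypothetical true labels
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 1]   # hypothetical predictions

precision = precision_score(y_true, y_pred)   # TP / (TP + FP)
recall = recall_score(y_true, y_pred)         # TP / (TP + FN)
f1 = f1_score(y_true, y_pred)                 # harmonic mean of precision and recall
print(precision, recall, f1)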
Confusion Matrix

 It creates an N x N matrix, where N is the number of classes or categories that are to be
predicted. Here we have N = 2, so we get a 2 x 2 matrix.
 Suppose we have a binary classification problem, where each sample belongs to either
Yes or No. We build a classifier which will predict the class for each new input sample.
 After that, we tested our model with 165 samples, and we got the following result.
There are 4 terms you should keep in mind:

 True Positives: It is the case where we predicted Yes and the real output was
also yes.
 True Negatives: It is the case where we predicted No and the real output was
also No.
 False Positives: It is the case where we predicted Yes but it was actually No.
 False Negatives: It is the case where we predicted No but it was actually Yes.
 The accuracy of the matrix is calculated by taking the sum of the values on the main
diagonal (true positives and true negatives) and dividing by the total number of samples, i.e.
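That is, accuracy = (TP + TN) / (TP + TN + FP + FN). A short sketch of building the 2 x 2 matrix with scikit-learn (invented labels; with this label order the rows are actual classes and the columns are predicted classes):

from sklearn.metrics import confusion_matrix

y_true = ["Yes", "No", "Yes", "Yes", "No", "No", "Yes", "No"]   # hypothetical actual classes
y_pred = ["Yes", "No", "No", "Yes", "No", "Yes", "Yes", "No"]   # hypothetical predictions

cm = confusion_matrix(y_true, y_pred, labels=["Yes", "No"])
print(cm)
# [[TP FN]
#  [FP TN]] for the ["Yes", "No"] label order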
Regression Evaluation Metrics

 In the regression task, we are supposed to predict the target variable, which is
in the form of continuous values. To evaluate the performance of such a
model, the evaluation metrics mentioned below are used:
 Mean Absolute Error
 Mean Squared Error
 Root Mean Square Error
 Root Mean Square Logarithmic Error
 R2 – Score
Mean Absolute Error(MAE)

 It is the average absolute distance between the predicted and the original values. Basically, it
gives a measure of how far the predictions are from the actual output. However, there is one
limitation: it doesn't give any idea about the direction of the error, that is, whether we are
under-predicting or over-predicting the data. It can be represented mathematically in this way:
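For reference, the standard MAE formula is:

\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N} \left| y_i - \hat{y}_i \right|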
Mean Squared Error(MSE)

 It is similar to mean absolute error, but the difference is that it takes the average of the
squared differences between the predicted and original values. The main advantage of this
metric is that the gradient is easier to calculate, whereas mean absolute error requires more
complicated tools to handle the gradient (because of the absolute value). By squaring the
errors it emphasizes larger errors more than smaller ones, so we can focus more on larger
errors. It can be expressed mathematically in this way:
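For reference, the standard MSE formula is:

\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2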
Root Mean Square Error(RMSE)

 We can say that RMSE is a metric that can be obtained by just taking the
square root of the MSE value. Like the MSE metric, RMSE is not robust to
outliers; it gives higher weight to large errors in predictions.
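For reference, the standard RMSE formula is:

\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2}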
R2 – Score

 The coefficient of determination, also called the R2 score, is used to evaluate
the performance of a linear regression model. It is the amount of variation in
the output dependent attribute which is predictable from the input
independent variable(s). It is used to check how well the observed results are
reproduced by the model, based on the proportion of the total variation in the
outcomes that is explained by the model.
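In standard form, R² = 1 - SS_res / SS_tot, the proportion of total variation explained by the model. As a closing sketch, all four regression metrics can be computed with scikit-learn on invented values:

import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = np.array([3.0, -0.5, 2.0, 7.0])   # hypothetical actual values
y_pred = np.array([2.5, 0.0, 2.0, 8.0])    # hypothetical predicted values

mae = mean_absolute_error(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)
r2 = r2_score(y_true, y_pred)
print(mae, mse, rmse, r2)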
