Business Analytics DGV

The document provides an overview of business analytics, focusing on statistics and various types of data analysis including descriptive, diagnostic, predictive, and prescriptive analytics. It categorizes data into quantitative (discrete and continuous) and qualitative (nominal and rank) types, and discusses different sampling techniques and potential errors in sampling. Additionally, it covers measures of central tendency and dispersion, essential for understanding data distributions.

Uploaded by

priya24laasya


Business Analytics

Statistics and Analytics

Statistics is the mathematics of estimating parameters of populations based on data from representative samples of those populations.

Analytics is a broad term that may refer to almost any type of data analysis, especially statistical analysis, data mining, and machine learning.

Analytics

Analytics is a generic word without a specific meaning; it can apply to virtually any form of data analysis, especially statistical analysis, data mining, and artificial intelligence.

Analytics is divided into four levels

▪ Descriptive
▪ Diagnostic
▪ Predictive
▪ Prescriptive
Scales of measurement

• The scales are ordered by increasing:
  • Accuracy
  • Power of measurement
  • Precision
  • Range of applicable statistical techniques
Different types of data

In statistics, data are classified into two broad categories:

• Quantitative / Numerical
  – Discrete
  – Continuous

• Qualitative / Categorical
  – Nominal
  – Rank
Quantitative / Numerical data

Continuous data represent the numerical values of a continuous variable. A continuous variable is one that can assume any value between any two points on a line segment, thus representing an interval of values.

Characteristics such as weight, length, height, thickness, temperature, time of arrival, and marks scored represent continuous variables.

A continuous variable is measured in the finest unit of measurement, finest in the sense that it enables measurement to the maximum degree of precision.

Quantitative / Numerical data

Discrete data are the values assumed by a discrete variable. A discrete variable is one whose outcomes are measured in fixed numbers. Such data are essentially count data.

These are derived from a process of counting, such as the number of items possessing or not possessing a certain characteristic.

The number of orders received in a year, the number of defective products produced, the number of employees who left the organization last year, and the number of people who passed the exam are all examples of discrete data.
Qualitative / Categorical

Nominal data are the outcome of classifying the items or units of a sample or a population into two or more categories according to some quality characteristic.

Classifying customers according to geography, students according to sex, workers according to skill (as skilled, semi-skilled, and unskilled), and employees according to level of education (as undergraduates and post-graduates) all result in nominal data.

Given any such basis of classification, it is always possible to assign each item to a particular class and count the items belonging to each class. The count data so obtained are called nominal data.

Qualitative / Categorical

Rank data, on the other hand, are the result of assigning ranks to specify order in terms of the integers 1, 2, 3, ..., n. Ranks may be assigned according to the level of performance in a test, a contest, a competition, an interview, or a show.

The candidates appearing in an interview, for example, may be assigned ranks in integers ranging from 1 to n, depending on their performance in the interview. Ranks so assigned can be viewed as the values of a variable with performance as the quality characteristic.
Classification of data

• Univariate

• Bivariate

• Multivariate

Univariate data

Univariate data concern only one variable. The weights of a Finance class of 30 students, presented in the following table, are an example of univariate data.

Individual   Weight (kg)
1            75
2            71
3            82
...
30           78

Bivariate data

Bivariate data concern exactly two variables. Continuing with the earlier example, if we add the height of each student along with his/her weight, we get bivariate data.

Individual   Weight (kg)   Height (cm)
1            75            172
2            71            169
3            82            174
...
30           78            176

Multivariate data

Multivariate data concern more than two variables.

Individual   Weight (kg)   Height (cm)   Marks in an Exam (max 100)   Gender
1            75            172           80                           Male
2            71            169           75                           Male
3            82            174           82                           Female
...
30           78            176           69                           Male
Data Sources

Data sources are of two types: secondary and primary.

1. Secondary data

• These data already exist in some form, published or unpublished, in an identifiable secondary source. They are generally available from published sources, though not necessarily in the form actually required.

• Examples: customer data available in an ERP system, satisfaction scores of previous years, and consumer price index data for the last 10 years available on a government website.

2. Primary data

• These data do not already exist in any form and thus have to be collected for the first time from the primary source(s). By their very nature, they require fresh, first-time collection covering the whole population or a sample drawn from it.

• Examples: a customer satisfaction survey, market research for a new product, and real-time performance data of a sales team.
Data Collection methods

• Surveys

• Focus Group Discussions

• Interviews

• Experiments

• Observations
Classification of Sampling techniques

Sampling Techniques

• Nonprobability Sampling Techniques
  – Convenience Sampling
  – Judgmental Sampling
  – Quota Sampling
  – Snowball Sampling

• Probability Sampling Techniques
  – Simple Random Sampling
  – Systematic Sampling
  – Stratified Sampling
  – Cluster Sampling
  – Other Sampling Techniques
Non Probability sampling

Convenience sampling attempts to obtain a sample of convenient elements. Interviews are conducted at locations where the target population is likely to be.

• Interviews at a shop front
• Street corner interviews
• Interviews in one or two friendly neighborhoods

Non Probability sampling

Judgmental sampling is a form of convenience sampling in which the population elements are selected based on the judgment of an expert.

The expert judges how appropriate each respondent unit is for the purpose of the study.

Non Probability sampling

Quota sampling is judgmental sampling with the constraint that the sample includes a minimum number of elements from each specified sub-group.
Non Probability sampling

In snowball sampling, an initial group of respondents is selected, usually at random.

After being interviewed, these respondents are asked to identify others who belong to the target population of interest.

Subsequent respondents are selected based on the referrals.

• Commonly used for low-penetration products and high-value products (e.g. club members, Audi owners)
Probability sampling – Simple Random

• Each element in the population has a known and equal probability of selection.

• Each possible sample of a given size (n) has a known and equal probability of being the sample actually selected.

Random number generation is a common method to achieve Simple Random Sampling.
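A minimal sketch of simple random sampling by random number generation, using Python's standard library. The sampling frame of customer IDs 1 to 100 and the function name are hypothetical, chosen only for illustration.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n elements without replacement.

    random.sample gives every subset of size n the same chance of selection.
    """
    rng = random.Random(seed)  # a seed only makes the draw repeatable
    return rng.sample(population, n)

# Hypothetical sampling frame: customer IDs 1..100
customers = list(range(1, 101))
srs = simple_random_sample(customers, 10, seed=42)
print(srs)
```

Omitting the seed gives a fresh draw on every run; fixing it is useful when the selection has to be audited later.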
Probability sampling – Systematic sampling

• The sample is chosen by selecting a random starting point and then picking every i-th element in succession from the sampling frame.

• The sampling interval, i, is determined by dividing the population size N by the sample size n and rounding to the nearest integer.

• For example, there are 100,000 elements in the population and a sample of 1,000 is desired. In this case the sampling interval, i, is 100. A random number between 1 and 100 is selected. If, for example, this number is 64, the sample consists of elements 64, 164, 264, 364, 464, 564, and so on.
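The procedure can be sketched in a few lines of Python. The function name and the optional `start` argument are assumptions made for illustration; `start` is only there so that the worked example (starting point 64) can be reproduced.

```python
import random

def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Return the 1-based positions selected by systematic sampling."""
    # Sampling interval i = N / n, rounded to the nearest integer
    interval = round(population_size / sample_size)
    if start is None:
        # Random starting point between 1 and the interval
        start = random.Random(seed).randint(1, interval)
    positions = range(start, population_size + 1, interval)
    return list(positions)[:sample_size]

sys_sample = systematic_sample(100_000, 1_000, start=64)
print(sys_sample[:6])   # [64, 164, 264, 364, 464, 564]
print(len(sys_sample))  # 1000
```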
Probability sampling – Stratified sampling

• A two-step process in which the population is first partitioned into subpopulations, or strata, and then a sample is drawn from each stratum.

Rules Of Strata

• The strata should be mutually exclusive and collectively exhaustive: every population element should be assigned to one and only one stratum, and no population elements should be omitted.

• The elements within a stratum should be as homogeneous as possible, but the elements in different strata should be as heterogeneous as possible.

• The stratification variables should also be closely related to the characteristic under study.

Probability sampling – Stratified sampling

• The second step involves selecting elements from each stratum by a random procedure, usually SRS or systematic sampling.

• In proportionate stratified sampling, the size of the sample drawn from each stratum is proportionate to the relative size of that stratum in the total population.

• In disproportionate stratified sampling, the size of the sample from each stratum is proportionate to the standard deviation of the distribution of the characteristic of interest among all the elements in that stratum. The more homogeneous a stratum is, the smaller the sample size required in that stratum.
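As an illustration of proportionate allocation, the sketch below splits a total sample across strata in proportion to stratum size. The region names and sizes are hypothetical, and simple rounding is assumed; with less convenient numbers the rounded allocations may not add up exactly to the requested total.

```python
def proportionate_allocation(stratum_sizes, sample_size):
    """Allocate the sample to strata in proportion to each stratum's size."""
    total = sum(stratum_sizes.values())
    return {
        name: round(sample_size * size / total)
        for name, size in stratum_sizes.items()
    }

# Hypothetical strata: customers by region
strata = {"North": 5000, "South": 3000, "East": 2000}
allocation = proportionate_allocation(strata, 200)
print(allocation)  # {'North': 100, 'South': 60, 'East': 40}
```

Within each stratum, the allocated number of elements would then be drawn by SRS or systematic sampling.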
Probability sampling – Cluster Sampling

• The target population is first divided into mutually exclusive and collectively exhaustive subpopulations, or clusters.

• In stage 2, a random sample of clusters is selected, based on a probability sampling technique such as SRS.

• In stage 3, for each selected cluster, either all the elements are included in the sample (one-stage) or a sample of elements is drawn probabilistically (two-stage).

Probability sampling – Cluster Sampling

Cluster Rules

• Elements within a cluster should be as heterogeneous as possible, but clusters themselves should be as homogeneous as possible.

• Ideally, each cluster should be a small-scale representation of the population.
Sampling Error

◼ Sampling errors:
❖ Faulty selection of sample – This may be due to a defective sampling technique, such as purposive or judgment sampling in which the researcher deliberately selects a non-representative sample

❖ Substitution – Sometimes an investigator substitutes a convenient member of the population for the unit originally selected

❖ Faulty demarcation of sampling units

❖ Variability of the population

Sampling Error

◼ Non-sampling errors or bias:

❖ Human factors, which vary from one investigator to another

❖ Negligence and carelessness

❖ Faulty planning of sampling

❖ Faulty selection of sample units

❖ Error in compilation

❖ Wrong statistical measure

Sampling Error

The error arising from drawing inferences on the basis of observations on a part (sample) is termed sampling error. It decreases as the sample size increases. Normally, after a certain stage, a further increase in sample size does not result in a substantial reduction in error. The optimum sample size is worked out from this behaviour, taking into account the required precision and cost considerations.

[Figure: error plotted against sample size]
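One common way to see this diminishing return, assuming simple random sampling and the sample mean as the estimate, is the standard error of the mean, s/√n. The standard deviation of 50 used below is a hypothetical value.

```python
import math

def standard_error(sample_sd, n):
    """Standard error of the sample mean: s / sqrt(n)."""
    return sample_sd / math.sqrt(n)

s = 50  # hypothetical sample standard deviation
for n in (25, 100, 400, 1600):
    print(n, standard_error(s, n))
# Each fourfold increase in n only halves the error: 10.0, 5.0, 2.5, 1.25
```

Going from 25 to 100 observations removes 5 units of error, while going from 400 to 1600 removes only 1.25, which is why cost is weighed against precision when fixing the sample size.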
Branches of Statistics

Statistics

Descriptive Statistics
Mean                  490.8
Standard Error        6.542348114
Median                475
Mode                  450
Standard Deviation    54.73721146
Sample Variance       2996.162319
Kurtosis              -0.334093298
Skewness              0.924330473
Range                 190
Minimum               425
Maximum               615
Sum                   34356
Count                 70
Types of measures

◼ Measures of Central Tendency

◼ Measures of Dispersion

◼ Measures of asymmetry (skewness)

◼ Measures of relationship

Measures of Central tendency

◼ Measures of Central Tendency

◼ Describe the center position of the data

◼ Mean, Median, Mode

Measures of Central tendency

Data: 65 100 69 91 72 85 72 84 75

Arrange the data in descending order:

100 91 85 84 75 72 72 69 65

Mode = 72 (the most frequent value)
Median = 75 (the middle value)
Mean = 79.22
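The three measures for this data set can be checked with Python's built-in `statistics` module; the variable names are chosen here for illustration.

```python
import statistics

scores = [65, 100, 69, 91, 72, 85, 72, 84, 75]

mean = statistics.mean(scores)      # sum of values / number of values
median = statistics.median(scores)  # middle value of the sorted data
mode = statistics.mode(scores)      # most frequent value

print(round(mean, 2), median, mode)  # 79.22 75 72
```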
What measure to use: Mean, Median, Mode

* Mode may not be a good representation if the data set is not normal

Measures of Dispersion

◼ Measures of Dispersion
◼ Describe the spread of the data (how scores are scattered or dispersed)

◼ Range, Variance, Standard deviation, Interquartile Range

Measures of Dispersion

Range
◼ The range is calculated by taking the maximum value and subtracting the minimum value.

2 4 6 8 10 12 14    Range = 14 - 2 = 12

◼ The values that divide the ordered scores into four equal parts are called “quartiles”

◼ The values that divide the ordered scores into 10 equal parts are called “deciles”

◼ The values that divide the ordered scores into 100 equal parts are called “percentiles”
Measures of Dispersion

Percentiles
Arrange the data in ascending order.

Compute index i, the position of the pth percentile.

i = (p/100)n

If i is not an integer, round up. The pth percentile is the value in the ith position.

If i is an integer, the pth percentile is the average of the values in positions i and i+1.
Measures of Dispersion

80th Percentile
◼ Example: Apartment Rents
i = (p/100)n = (80/100)70 = 56

Since i is an integer, average the 56th and 57th values:
80th Percentile = (535 + 549)/2 = 542

425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615

Note: Data is in ascending order.
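The percentile rule stated earlier (round up when i is not an integer, otherwise average positions i and i+1) can be sketched as follows. The function name is hypothetical, p is assumed to be a whole number between 1 and 99, and integer arithmetic is used to avoid floating-point surprises when testing whether i is an integer. Statistical software often uses other interpolation conventions, so its results may differ slightly.

```python
def percentile(data, p):
    """p-th percentile using the positional rule i = (p/100) * n."""
    if not 0 < p < 100:
        raise ValueError("p must be between 0 and 100 (exclusive)")
    ordered = sorted(data)
    whole, remainder = divmod(p * len(ordered), 100)
    if remainder:                      # i is not an integer: round up
        return ordered[whole]          # position whole + 1 (1-based)
    # i is an integer: average the values in positions i and i + 1
    return (ordered[whole - 1] + ordered[whole]) / 2

rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

print(percentile(rents, 80))  # 542.0 (average of the 56th and 57th values)
print(percentile(rents, 75))  # 525   (53rd value)
print(percentile(rents, 25))  # 445   (18th value)
```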

Measures of Dispersion

Quartiles

◼ Quartiles are specific percentiles.

◼ First Quartile = 25th Percentile
◼ Second Quartile = 50th Percentile = Median
◼ Third Quartile = 75th Percentile

Measures of Dispersion

Third Quartile
◼ Example: Apartment Rents
Third quartile = 75th percentile
i = (p/100)n = (75/100)70 = 52.5, rounded up to 53
Third quartile = 53rd value = 525

IQR
◼ Example: Apartment Rents
3rd Quartile (Q3) = 525
1st Quartile (Q1) = 445
Interquartile Range = Q3 - Q1 = 525 - 445 = 80

Identifying Outliers

The lower limit is located 1.5 (IQR) below Q1

Lower limit: Q1 - 1.5 (IQR) = 445 - 1.5 (80) = 445 - 120 = 325

The upper limit is located 1.5 (IQR) above Q3

Upper limit: Q3 + 1.5 (IQR) = 525 + 1.5 (80) = 525 + 120 = 645

There are no outliers (values less than 325 or above 645) in the data given.
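A small sketch of the same check in Python. The function names are hypothetical, and the value 700 in the last call is an invented observation added only to show what a flagged outlier looks like.

```python
def outlier_limits(q1, q3, k=1.5):
    """Lower and upper limits located k * IQR beyond the quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def find_outliers(data, q1, q3):
    """Values falling outside the lower and upper limits."""
    lower, upper = outlier_limits(q1, q3)
    return [x for x in data if x < lower or x > upper]

lower, upper = outlier_limits(445, 525)
print(lower, upper)                          # 325.0 645.0

# Minimum and maximum of the apartment rents lie inside the limits
print(find_outliers([425, 615], 445, 525))       # []
# Hypothetical extra observation of 700 would be flagged
print(find_outliers([425, 615, 700], 445, 525))  # [700]
```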
Identifying Outliers & Box whisker plot

Activity: Excel, demo

Construct a Box Whisker plot for the data given in the Excel file and identify the outliers.

IQR

The interquartile range is a better measure of variation in a distribution than the range. It leaves out the lowest 25 percent and the highest 25 percent of the distribution and uses only the middle 50 percent.

The interquartile range is often reduced to the semi-interquartile range, or quartile deviation, as shown below:
Semi-interquartile range or Quartile deviation = (Q3 - Q1)/2

IQR

When the quartile deviation is small, there is little variation among the central 50 percent of items. In contrast, a high quartile deviation shows that the central 50 percent of items vary widely. It may be noted that in a symmetrical distribution the two quartiles, Q3 and Q1, are equidistant from the median.

Since it is not influenced by the extreme values in a distribution, it is particularly suitable for highly skewed or erratic distributions.
Measures of Dispersion

Standard Deviation
A measure of how widely the data points tend to diverge from the mean. A small standard deviation indicates that most values are close to the mean, and a large standard deviation indicates that many values lie far above or below it. The basic idea is to summarize how different the individual data points are from the average. You could just sum the individual differences, but some are less than the mean and others are greater, so they would cancel out. The way around that is to square the differences, because the square of any nonzero number is positive. After adding the squared differences together and dividing by n - 1, we take the square root to bring the value back to the original units of the data.
Measures of Dispersion

If the n observations in a sample are denoted by x_1, x_2, x_3, ..., x_n, then the variance is given by

$$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

The standard deviation, S, is the positive square root of the variance.
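The formula translates directly into code. The sketch below applies it to the seven values used in the range example (2, 4, ..., 14); the function names are chosen for illustration, and Python's `statistics.variance` and `statistics.stdev` use the same n - 1 divisor.

```python
import math

def sample_variance(xs):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def sample_std(xs):
    """Positive square root of the sample variance."""
    return math.sqrt(sample_variance(xs))

values = [2, 4, 6, 8, 10, 12, 14]
var = sample_variance(values)  # 112 / 6
sd = sample_std(values)
print(round(var, 2), round(sd, 2))  # 18.67 4.32
```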
Measures of Dispersion

1) If the standard deviation is a relatively small number, the variability in the data about the average is small, i.e. the data are clustered closely around the average.

2) If the standard deviation is a relatively large number, there is more variability in the data, i.e. the data are spread further from the average.
Types of Distributions

◼ Discrete theoretical distributions

◼ Binomial distribution
◼ Poisson distribution
◼ Rectangular distribution
◼ Multinomial distribution, etc.

◼ Continuous theoretical distributions

◼ Normal distribution
◼ Student's t-distribution
◼ Chi-square distribution
◼ F-distribution
Histogram

Normal Distribution

Properties of Normal Distribution:

[Figure: normal curve with the horizontal axis marked at μ - 3σ, μ - 2σ, μ - σ, μ, μ + σ, μ + 2σ, μ + 3σ]

1) The normal distribution is completely specified by two parameters (μ and σ).
2) μ determines the location and σ the spread.
3) The normal curve is bell shaped.
Normal Distribution

Properties of Normal Distribution (Contd):

4) The normal distribution is symmetric around μ.

5) The normal distribution extends from -∞ to ∞, but for all practical purposes the region μ - 3σ to μ + 3σ covers most of the distribution.
6) The mean, mode and median are the same for this curve.
7) The area under the normal curve is equal to 1.
8) 68.26% of values lie within μ ± 1σ limits.
9) 95.44% of values lie within μ ± 2σ limits.
10) 99.73% of values lie within μ ± 3σ limits.
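These areas can be reproduced from the error function in Python's `math` module, using the identity P(|X - μ| ≤ kσ) = erf(k/√2) for a normal variable. The function name is hypothetical. The computed shares are about 68.27%, 95.45% and 99.73%; the small last-digit differences from the figures above presumably come from rounding in printed tables.

```python
import math

def share_within(k):
    """Share of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} sigma: {share_within(k):.2%}")
```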
Normal Distribution

Non Normal Distribution

Asymmetry

◼ Measures of asymmetry
◼ When the distribution of items in a series is perfectly symmetrical and bell shaped, as in the normal distribution, there is no skewness. If the curve is distorted (whether on the right side or on the left side), we have an asymmetrical distribution, which indicates that there is skewness.

◼ If the curve is distorted towards the left (a longer tail on the left), we have negative skewness; if it is distorted towards the right, we have positive skewness.

Asymmetry

◼ Skewness is thus a measure of asymmetry and shows the manner in which the items are clustered around the average.

◼ The difference between the mean and the median or mode provides an easy way of expressing skewness in a series.
Asymmetry

An important measure of the shape of a distribution is called skewness.

The formula for the skewness of sample data is

$$\text{Skewness} = \frac{n}{(n-1)(n-2)} \sum \left( \frac{x_i - \bar{x}}{s} \right)^3$$


Skewness can be easily computed using statistical software.
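A minimal sketch of the sample skewness formula, with s taken as the sample standard deviation (n - 1 divisor). The function name and the two small data sets are hypothetical.

```python
import statistics

def sample_skewness(xs):
    """Skewness = n / ((n-1)(n-2)) * sum(((x - mean) / s) ** 3)."""
    n = len(xs)
    if n < 3:
        raise ValueError("at least three observations are needed")
    mean = statistics.mean(xs)
    s = statistics.stdev(xs)  # sample standard deviation
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in xs)

symmetric = [1, 2, 3, 4, 5]       # hypothetical symmetric data
right_tailed = [1, 1, 1, 2, 10]   # hypothetical data with a long right tail

print(sample_skewness(symmetric))         # essentially zero
print(sample_skewness(right_tailed) > 0)  # True: positive skewness
```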

Asymmetry

Symmetric (not skewed)

• Skewness is zero.
• Mean and median are equal.

[Figure: relative frequency histogram of a symmetric distribution, skewness = 0]
Parametric Test

Parametric statistics make assumptions (such as normality) about the population values (called parameters).

For example, one assumption of the one-way ANOVA is that the data come from a normal distribution. If your data aren't normally distributed, the ANOVA results may not be reliable, but you can run the nonparametric alternative, the Kruskal-Wallis test.

Non parametric Test

A nonparametric test (sometimes called a distribution-free test) does not assume anything about the underlying distribution (for example, that the data come from a normal distribution). By contrast, a parametric test makes assumptions about a population's parameters (for example, the mean or standard deviation). When the word “nonparametric” is used in statistics, it doesn't quite mean that you know nothing about the population. It usually means that you know the population data do not have a normal distribution.
Non parametric vs Parametric Test

NONPARAMETRIC TEST                    PARAMETRIC ALTERNATIVE
1-sample sign test                    One-sample Z-test, One-sample t-test
1-sample Wilcoxon Signed Rank test    One-sample Z-test, One-sample t-test
Friedman test                         Two-way ANOVA
Kruskal-Wallis test                   One-way ANOVA
Mann-Whitney test                     Independent samples t-test
Mood’s Median test                    One-way ANOVA
Spearman Rank Correlation             Correlation Coefficient

Test of Hypothesis

What is a hypothesis?
Ordinarily, when one talks about a hypothesis, one simply means a mere assumption or supposition to be proved or disproved.

But for a researcher, a hypothesis is a formal question that he or she intends to resolve.

Basic concepts on testing hypothesis

1. Null and alternative hypothesis:

If we are to compare method A with method B about its superiority and we proceed on the assumption that both are equally good, then this assumption is termed the null hypothesis. As against this, if we think that method A is superior or method B is inferior, we are stating what is termed the alternative hypothesis.

The null hypothesis is generally symbolized as H0.

The alternative hypothesis is symbolized as Ha.
Basic concepts on testing hypothesis

The null hypothesis states that there is no difference between groups or no relationship between variables.

When your sample contains sufficient evidence, you can reject the null and conclude that the effect is statistically significant. Statisticians often denote the null hypothesis as H0 and the alternative hypothesis as HA.

• Null Hypothesis H0: No effect exists in the population.

• Alternative Hypothesis HA: The effect exists in the population.
Understanding variables

Variables can be thought of as "fields" of data, or individual pieces of data; typically, they are the columns on a spreadsheet. All of those columns (age; gender; years of service; education; performance score A, B, or C; and so on) are the variables. Why call them variables? Because they are not constants: the data for each of these variables vary for each case (think of a case as a person). Breaking the mass of data down into variables is a first step in getting a handle on the information that likely sits before you.

Categorical Vs Continuous variables

Variables come in different types. The simplest way to break them down is to determine whether they are categorical or continuous. As the name implies, categorical variables are made up of types, classes, or categories. Think about the variable "gender." It has two categories: male and female.

Continuous variables, on the other hand, are numbers or numerical. Anything you would report as a number is a continuous variable. Years of service could be one continuous variable; numeric age would be another.

Dependent & Independent variables

Dependent variable (Y): Company performance
Independent variables (X):
  X1. Top management support
  X2. Cross functional team work
  X3. NPD process
  X4. NPD strategies
  X5. Market research activities

Dependent variable (Y): Sales
Independent variables (X):
  X1. Brand perception
  X2. Promotional activities
  X3. Competition
  X4. Price
  X5. Quality
  X6. Salesperson competency

How Variables relate to each other?

Well, that’s the central question. If you can figure out how variables relate to each other, you can gain greater understanding of the way things in the system under examination work. There are two major ways to think about how variables relate to each other: interdependently and dependently.

Measures of Relationship

◼ Measures of relationship
◼ Describe the relationship of the data

◼ Karl Pearson’s coefficient of correlation

◼ Spearman’s Rank order correlation

Measures of Relationship

Year   Program                                      Training (mandays)   Productivity (units / labour hr)
2010   -                                            0                    19
2011   Awareness training                           50                   19
2012   LPS training to first batch                  100                  20
2013   LPS training to second batch                 100                  21
2014   LPS training to third batch                  150                  22
2015   LPS training to fourth batch                 150                  23
2016   Advanced program                             200                  26
2017   Advanced program                             200                  28
2018   Coaching, handholding & Refresher program    250                  31
2019   Coaching, handholding & Refresher program    250                  35
Measures of Relationship

Karl Pearson’s Correlation coefficient

$$r = \frac{\sum xy}{\sqrt{\sum x^2 \sum y^2}}$$

where x = X - X̄ and y = Y - Ȳ

X: Training man-days on lean production system
Y: Productivity in units / labour hour

X     Y     x = X - X̄   y = Y - Ȳ   x²   y²   xy
0     20
50    20
100   21
150   22
150   23
150   24
200   26
200   28
250   31
250   35

Σx² =        Σy² =        Σxy =

Measures of Relationship

X     Y     x = X - X̄   y = Y - Ȳ   x²      y²    xy
0     20    -150        -5          22500   25    750
50    20    -100        -5          10000   25    500
100   21    -50         -4          2500    16    200
150   22    0           -3          0       9     0
150   23    0           -2          0       4     0
150   24    0           -1          0       1     0
200   26    50          1           2500    1     50
200   28    50          3           2500    9     150
250   31    100         6           10000   36    600
250   35    100         10          10000   100   1000

X̄ = 150, Ȳ = 25, Σx² = 60000, Σy² = 226, Σxy = 3250

$$r = \frac{3250}{\sqrt{60000 \times 226}} = \frac{3250}{\sqrt{13560000}} = \frac{3250}{3682.39} = 0.88$$
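The same calculation can be sketched in Python using the deviations-from-the-mean form of the formula. The function name is an assumption made for illustration; the X and Y values are those in the table.

```python
import math

def pearson_r(xs, ys):
    """Karl Pearson's r = sum(xy) / sqrt(sum(x^2) * sum(y^2)), with x, y as deviations from the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sum_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sum_x2 = sum((x - x_bar) ** 2 for x in xs)
    sum_y2 = sum((y - y_bar) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_x2 * sum_y2)

X = [0, 50, 100, 150, 150, 150, 200, 200, 250, 250]  # training man-days
Y = [20, 20, 21, 22, 23, 24, 26, 28, 31, 35]         # productivity, units / labour hour

r = pearson_r(X, Y)
print(round(r, 2))  # 0.88
```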
Measures of Relationship

[Figure: chart titled “Training Vs Productivity (units / labour hr)”; vertical axis: productivity in units / labour hour]
Measures of Relationship

Correlation

◼ Correlation measures how closely two variables are related.

◼ Correlation coefficients vary from +1 to -1

◼ A value close to +1 indicates that a high value in one variable tends to go with a high value in the other

◼ A value close to -1 indicates that a high value in one variable tends to go with a low value in the other

◼ A value near zero indicates no linear correlation

Measures of Relationship

Types of simple correlation

1. Perfect positive correlation (r = +1.00)

2. High degree of positive correlation (r = +0.85)

3. Low degree of positive correlation (r = +0.35)

4. Perfect negative correlation (r = -1.00)

5. High degree of negative correlation (r = -0.85)

6. Low degree of negative correlation (r = -0.35)

7. Zero correlation (r = 0)

Measures of Relationship

Scatter diagram

Correlation

▪ Correlating market data and business data is definitely a step in the right direction. It shows the organization that we are pulling information together and making important connections.
▪ Correlations are used to understand how data sets are related. In other words, if variable “X” changes, does variable “Y” change?

Predictive Analytics
What can Predictive Analytics do in Business?

◼ Predictive modelling in business focuses mostly on finding predictive patterns in sales revenue, forecasting sales, and workforce planning

◼ Forward looking: it combines algorithms, historical information and data mining to solve problems, realize an outcome or answer a question. For example:
◼ How likely is a customer to stay with the business?

◼ What mixture of skills, experience, and competencies would most likely lead to high performance?

◼ With this information, analysis can be applied to predict how successful different courses of action will be

Simple linear Regression

◼ Simple linear regression involves one independent variable and one dependent variable.

◼ The relationship between the two variables is approximated by a straight line.

◼ Regression analysis involving two or more independent variables is called multiple regression.
Simple linear Regression model

◼ The equation that describes how y is related to x and an error term is called the regression model.

◼ The simple linear regression model is:

$$y = \beta_0 + \beta_1 x + \varepsilon$$

where:

β0 and β1 are called parameters of the model, and ε is a random variable called the error term.
Simple linear Regression Equation

Positive Linear Relationship

[Figure: E(y) plotted against x; regression line with intercept β0 and positive slope β1]

Simple linear Regression Equation

Negative Linear Relationship

[Figure: E(y) plotted against x; regression line with intercept β0 and negative slope β1]

Simple linear Regression Equation

No Relationship

[Figure: E(y) plotted against x; horizontal regression line with intercept β0 and slope β1 equal to 0]
Estimated Simple linear Regression Equation

The estimated simple linear regression equation

ŷ = b0 + b1 x

• The graph is called the estimated regression line.

• b0 is the y intercept of the line.
• b1 is the slope of the line.
• ŷ is the estimated value of y for a given x value.
Least Squares method

• Slope for the Estimated Regression Equation

$$b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$$

where:
x_i = value of independent variable for ith observation
y_i = value of dependent variable for ith observation
x̄ = mean value for independent variable
ȳ = mean value for dependent variable
Least Squares method

y-Intercept for the Estimated Regression Equation

$$b_0 = \bar{y} - b_1 \bar{x}$$
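As an illustration, the slope and intercept formulas can be applied to the training and productivity data from the correlation example. Fitting a regression to that data is an assumption made here for demonstration; the function name and the prediction at 300 man-days are likewise hypothetical.

```python
def least_squares_fit(xs, ys):
    """Return (b0, b1) for the estimated regression line y_hat = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Slope: sum of cross-deviations over sum of squared x-deviations
    b1 = (
        sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
        / sum((x - x_bar) ** 2 for x in xs)
    )
    # Intercept: the line passes through the point of means
    b0 = y_bar - b1 * x_bar
    return b0, b1

X = [0, 50, 100, 150, 150, 150, 200, 200, 250, 250]  # training man-days
Y = [20, 20, 21, 22, 23, 24, 26, 28, 31, 35]         # productivity, units / labour hour

b0, b1 = least_squares_fit(X, Y)
print(round(b0, 3), round(b1, 4))  # 16.875 0.0542

# Estimated productivity for a hypothetical 300 training man-days
print(round(b0 + b1 * 300, 3))     # 33.125
```

Predicting at 300 man-days extrapolates beyond the observed range of 0 to 250, so the estimate should be read with caution.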
What is Linear Regression
What is Logistic Regression
Contact

Phone : 9600066166

Web : www.transbizconsulting.com

Email : [email protected] /
[email protected]

Twitter : Transbiz1

Linked in : linkedin.com/company/transbizconsulting

Facebook : facebook.com/transbizconsulting
Thank You

Sample Designs and Sampling Procedures
35 pages
FIN10002 - Notes Master
No ratings yet
FIN10002 - Notes Master
44 pages
Intro To Statistics
No ratings yet
Intro To Statistics
37 pages
Topic 1 Introduction To Statistics
No ratings yet
Topic 1 Introduction To Statistics
35 pages
Business Statistics May Module
No ratings yet
Business Statistics May Module
72 pages
STA 111 CHAPTER 1
No ratings yet
STA 111 CHAPTER 1
27 pages
Definition of Statistics: Individuals Who Shaped Statistics Today
No ratings yet
Definition of Statistics: Individuals Who Shaped Statistics Today
12 pages
Statistics Glossary
No ratings yet
Statistics Glossary
7 pages
3.Badm - Mba Notes
No ratings yet
3.Badm - Mba Notes
13 pages
Unit 1
No ratings yet
Unit 1
47 pages