
BS - Module 2

Uploaded by

jits58523

Explanatory Note: Module 2 – Class 1

Topic: Correlation – Meaning, Significance & Types


Course: Business Statistics | BBA – 1st Semester

Introduction

In business and economics, we often observe that changes in one variable are related to changes in another. For example, when a company increases
its advertising budget, its sales may increase. Or, when the price of a product goes up, the demand may go down. Such relationships between two
variables can be studied using a statistical tool called correlation.

Understanding correlation helps business students and professionals analyze relationships between variables, make predictions, and make
data-driven decisions. In this class, we will explore the meaning, importance, and types of correlation.
1. Meaning of Correlation

The term correlation refers to a statistical relationship between two variables. It tells us whether two variables move together (in the same
direction), in opposite directions, or not at all.

More formally, correlation is a measure of the degree to which two variables are linearly related.

• If two variables change in the same direction, the correlation is positive.


• If two variables change in opposite directions, the correlation is negative.
• If there is no pattern in the relationship, the correlation is zero or absent.

Correlation does not imply causation. Just because two variables are correlated does not mean that one causes the other. For example, ice cream
sales and drowning incidents may both increase in summer, but ice cream does not cause drowning.

2. Significance of Correlation in Business

Understanding correlation is highly important in business statistics for several reasons:

a) Helps in Decision Making:

If a company knows that two variables are strongly related, it can plan strategies accordingly. For example, if sales are highly correlated with
advertising, increasing advertising may help boost sales.

b) Forecasting and Prediction:

Correlation helps in forecasting future values. For instance, if sales are closely correlated with population growth, businesses can predict future
sales based on population estimates.
c) Risk Management:

In finance, understanding the correlation between asset returns is crucial for diversification. If two stocks are negatively correlated, investing in
both can reduce overall risk.
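
The diversification effect can be illustrated with the standard two-asset portfolio risk formula. The sketch below is only an illustration: the 50/50 weights, the 20% volatility of each stock, and the two correlation values are assumed numbers, not data from this note, and `portfolio_std` is a hypothetical helper name.

```python
import math

def portfolio_std(w1, w2, sd1, sd2, corr):
    """Standard deviation of a two-asset portfolio (standard textbook formula)."""
    variance = (w1 * sd1) ** 2 + (w2 * sd2) ** 2 + 2 * w1 * w2 * corr * sd1 * sd2
    return math.sqrt(variance)

# Assumed inputs: equal weights, each stock has a volatility of 20%.
risk_positive = portfolio_std(0.5, 0.5, 0.20, 0.20, corr=0.5)
risk_negative = portfolio_std(0.5, 0.5, 0.20, 0.20, corr=-0.5)

print(round(risk_positive, 4))  # risk when the stocks move together
print(round(risk_negative, 4))  # risk when the stocks move in opposite directions
```

With these assumed inputs, the negatively correlated pair gives the lower portfolio risk, which is the point made above.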

d) Resource Allocation:

By identifying correlated factors, businesses can allocate resources more efficiently. For example, if employee training and productivity are
positively correlated, more budget can be allocated for training.

3. Types of Correlation

Correlation can be classified in several ways based on direction, degree, and number of variables involved.

A. Based on Direction:

i) Positive Correlation

When two variables move in the same direction, the correlation is positive. As one increases, the other also increases; or as one decreases, the
other also decreases.

• Example: Advertising expenditure and sales revenue


o As advertising increases, sales tend to increase.

ii) Negative Correlation

When two variables move in opposite directions, the correlation is negative. As one increases, the other decreases.

• Example: Price and demand


o As the price of a product increases, the quantity demanded usually decreases.
iii) Zero Correlation

When there is no systematic linear relationship between two variables, the correlation is zero. A change in one variable tells us nothing about the direction of change in the other.

• Example: Shoe size and exam score


o No logical connection between the two.

B. Based on Degree (Strength of Correlation):

The degree of correlation refers to how strongly two variables are related. It is measured by a correlation coefficient, which ranges from –1 to
+1.

Correlation Coefficient (r)    Degree of Correlation
+1                             Perfect Positive Correlation
Between 0 and +1               Moderate/Weak Positive
0                              No Correlation
Between 0 and –1               Moderate/Weak Negative
–1                             Perfect Negative Correlation

• Perfect Correlation: One variable changes in exact proportion to the other.


• High Correlation: Strong but not perfect relationship.
• Low Correlation: Weak relationship.

C. Based on Number of Variables:

i) Simple Correlation

Involves the relationship between two variables.

• Example: Salary and years of experience


ii) Multiple Correlation

Involves the relationship among more than two variables.

• Example: Relationship between sales (dependent variable) and advertising + pricing (independent variables)

iii) Partial Correlation

Studies the relationship between two variables while keeping other variables constant.

• Example: Finding the correlation between study time and marks while controlling for sleep hours

4. Real-Life Business Example: Correlation in Retail Sales

Let’s consider a real-life example in the retail industry.

A retail store records its monthly advertising expenses and monthly sales revenue (both in ₹ thousands) to study the relationship between them. Five months of data are shown below.

Month    Advertising (₹ in ‘000)    Sales (₹ in ‘000)
Jan      10                         120
Feb      20                         180
Mar      30                         250
Apr      25                         220
May      15                         150

Now, let’s observe the pattern:

• As advertising increases, sales also increase.


• When advertising decreases, sales tend to drop.
This suggests a positive correlation between advertising and sales. Calculating the correlation coefficient for this data with Karl Pearson’s
method gives r ≈ +0.999, which is very close to +1 and indicates a strong positive linear relationship.
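
The coefficient for the table above can be checked with a few lines of code. This is a minimal sketch in plain Python; the function name `pearson_r` and the variable names are chosen here for illustration.

```python
import math

advertising = [10, 20, 30, 25, 15]   # ₹ in '000, Jan to May
sales = [120, 180, 250, 220, 150]    # ₹ in '000, Jan to May

def pearson_r(x, y):
    """Karl Pearson's coefficient computed from deviations about the means."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

r = pearson_r(advertising, sales)
print(round(r, 3))  # 0.999
```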

Business Insight:
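The manager has evidence to support increasing advertising during festive months to boost sales, since the two are positively correlated. Because the data covers only five months and correlation does not prove causation, the decision should be reviewed as more data becomes available.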

5. Important Notes

• Correlation tells us if and how strongly two variables are related.


• It does not explain why they are related.
• Correlation can be linear (straight-line relationship) or non-linear (curved relationship), but most common measures like Pearson’s
focus on linear correlation.
• A strong correlation doesn’t always imply a meaningful relationship. Sometimes, two variables can appear correlated due to coincidence
or a hidden third factor (this is called spurious correlation).

Conclusion

Correlation is a fundamental concept in statistics and business decision-making. It helps us understand the relationship between two or more
variables and is a stepping stone for more advanced analyses like regression and forecasting.

Knowing the type, direction, and degree of correlation allows business professionals to:

• Predict future outcomes


• Analyze performance
• Minimize risk
• Maximize returns
Explanatory Note: Module 2 – Class 2
Topic: Scatter Diagram
Course: Business Statistics | BBA – 1st Semester

Introduction

In business statistics, one of the easiest and most effective ways to understand the relationship between two variables is through a scatter diagram.
Also known as a scatter plot, it is a simple graphical tool that helps visualize whether and how two variables are related.

For business students, learning how to construct and interpret a scatter diagram is an essential step toward analyzing relationships between variables
such as price and demand, advertising and sales, or income and spending.

What is a Scatter Diagram?

A scatter diagram is a graph that shows the relationship between two quantitative variables by plotting data points on a two-dimensional graph.
Each point represents one observation in the dataset, with one variable on the x-axis and the other on the y-axis.

• The x-axis typically represents the independent variable (cause).


• The y-axis represents the dependent variable (effect or outcome).

By observing the pattern of points, we can get a sense of whether the two variables have a relationship and, if so, what kind.

Purpose of a Scatter Diagram

The main objective of a scatter diagram is to:

• Identify the direction of the relationship (positive, negative, or none)


• Observe the strength of the relationship
• Detect outliers or unusual data points
• Provide a visual foundation for further statistical analysis like correlation and regression
Types of Relationships in Scatter Diagrams

1. Positive Correlation:
If the data points tend to go upward from left to right, the variables are positively related.
o Example: Increase in advertising leads to an increase in sales.
2. Negative Correlation:
If the data points go downward from left to right, the variables are negatively related.
o Example: As price increases, quantity demanded decreases.
3. No Correlation:
If the points are scattered randomly with no clear direction, the variables have no relationship.
o Example: Shoe size and test scores.

Steps to Draw a Scatter Diagram

1. Collect pairs of data (two variables).


2. Draw the x-axis and y-axis and label them.
3. Plot each pair of values as a single point on the graph.
4. Observe the pattern of points.

Real-Life Example: Advertising and Sales

A retail company wants to study the effect of advertising on monthly sales. It collects the following data for 6 months:

Month    Advertising (₹ in thousands)    Sales (₹ in thousands)
Jan      10                              100
Feb      15                              130
Mar      20                              150
Apr      25                              170
May      30                              200
Jun      35                              230
To draw the scatter diagram:

• Plot Advertising on the x-axis.


• Plot Sales on the y-axis.
• Each pair of (Advertising, Sales) becomes one point on the graph.

Interpretation:
The points will appear to move upward from left to right, indicating a strong positive correlation. This suggests that higher advertising spending
is associated with higher sales.
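
The plotting steps can be sketched in code. In practice a plotting library or spreadsheet would normally be used; the version below is a rough text-only sketch in plain Python so that it runs anywhere. The function name `text_scatter` and the grid size are assumptions made for this illustration.

```python
advertising = [10, 15, 20, 25, 30, 35]   # x-axis, ₹ in thousands
sales = [100, 130, 150, 170, 200, 230]   # y-axis, ₹ in thousands

def text_scatter(x, y, width=30, height=10):
    """Return a rough text scatter plot as a list of strings (top row = highest y)."""
    x_min, x_max = min(x), max(x)
    y_min, y_max = min(y), max(y)
    grid = [[" "] * (width + 1) for _ in range(height + 1)]
    for xi, yi in zip(x, y):
        # Scale each value to a column and row position on the grid.
        col = round((xi - x_min) / (x_max - x_min) * width)
        row = round((yi - y_min) / (y_max - y_min) * height)
        grid[height - row][col] = "*"   # flip rows so larger y is drawn higher
    return ["|" + "".join(r) for r in grid] + ["+" + "-" * (width + 1)]

for line in text_scatter(advertising, sales):
    print(line)
```

The printed points climb from the bottom-left corner to the top-right corner, which is the upward pattern described in the interpretation.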

Conclusion

A scatter diagram is a simple yet powerful tool in statistics. It allows us to visually explore and understand the relationship between two variables.
While it does not provide exact numerical results, it is an excellent starting point for further analysis using correlation coefficients or regression
models.

In business, scatter diagrams are often used to support decisions related to marketing, pricing, finance, and operations. By learning how to create
and interpret scatter diagrams, students gain valuable skills for data-driven decision-making in real-world business situations.
Explanatory Note: Module 2 – Class 3
Topic: Karl Pearson’s Coefficient of Correlation
Course: Business Statistics | BBA – 1st Semester

Introduction

In the previous class, we learned how a scatter diagram visually represents the relationship between two variables. While scatter plots show the
trend or pattern, they do not give us a precise numerical value to express the strength or direction of the relationship. This is where Karl Pearson’s
Coefficient of Correlation becomes useful.

Developed by the British mathematician Karl Pearson, this method gives a quantitative measure of how strongly two variables are related. It is
one of the most widely used techniques in business statistics.

Definition

Karl Pearson’s Coefficient of Correlation, denoted by r, measures the strength and direction of the linear relationship between two
continuous variables.

• The value of r always lies between –1 and +1.


o r = +1: Perfect positive correlation
o r = –1: Perfect negative correlation
o r = 0: No correlation

The closer the value of r is to +1 or –1, the stronger the relationship between the two variables.

Formula

The formula for Karl Pearson’s coefficient is:

$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}}$$


Where:

• $x$ and $y$ are the two variables
• $\bar{x}$ is the mean of x
• $\bar{y}$ is the mean of y
• $\sum$ represents summation over all data points

This formula can also be expressed in terms of covariance and standard deviations:

$$r = \frac{\text{Cov}(x, y)}{\sigma_x \sigma_y}$$
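
The two forms give the same value because the 1/n factors cancel. A short derivation, assuming the covariance and both standard deviations are computed with the same divisor n (the number of pairs):

```latex
\mathrm{Cov}(x, y) = \frac{1}{n}\sum (x - \bar{x})(y - \bar{y}), \qquad
\sigma_x = \sqrt{\frac{1}{n}\sum (x - \bar{x})^2}, \qquad
\sigma_y = \sqrt{\frac{1}{n}\sum (y - \bar{y})^2}

r = \frac{\mathrm{Cov}(x, y)}{\sigma_x \sigma_y}
  = \frac{\frac{1}{n}\sum (x - \bar{x})(y - \bar{y})}
         {\frac{1}{n}\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}}
  = \frac{\sum (x - \bar{x})(y - \bar{y})}
         {\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}}
```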

Steps to Calculate Karl Pearson’s r

1. Find the mean of x and y.


2. Calculate the deviations $(x - \bar{x})$ and $(y - \bar{y})$.
3. Multiply the deviations for each pair and sum them to get the numerator.
4. Square the deviations, sum the squares for x and for y separately, multiply the two sums, and take the square root to get the denominator.
5. Use the formula to get the value of r.
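
The five steps above can be sketched as a small function. This is a minimal illustration in plain Python; the function name `pearson_r_steps` and the experience and salary figures are hypothetical and are not taken from this note.

```python
import math

def pearson_r_steps(x, y):
    n = len(x)
    # Step 1: find the mean of x and y.
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Step 2: calculate the deviations from the means.
    dev_x = [xi - mean_x for xi in x]
    dev_y = [yi - mean_y for yi in y]
    # Step 3: multiply the deviations pair by pair and sum them (numerator).
    numerator = sum(dx * dy for dx, dy in zip(dev_x, dev_y))
    # Step 4: sum the squared deviations, multiply the sums, take the square root.
    denominator = math.sqrt(sum(dx ** 2 for dx in dev_x) * sum(dy ** 2 for dy in dev_y))
    # Step 5: apply the formula.
    return numerator / denominator

years_experience = [1, 2, 3, 4, 5]   # hypothetical data
salary_lakhs = [2, 4, 5, 4, 5]       # hypothetical data

r = pearson_r_steps(years_experience, salary_lakhs)
print(round(r, 4))  # 0.7746
```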

Interpretation of r

Value of r          Relationship Type
+0.91 to +1.00      Very strong positive
+0.70 to +0.90      Strong positive
+0.40 to +0.69      Moderate positive
+0.10 to +0.39      Weak positive
0                   No correlation
–0.10 to –0.39      Weak negative
–0.40 to –0.69      Moderate negative
–0.70 to –0.90      Strong negative
–0.91 to –1.00      Very strong negative

Real-Life Example: Advertising and Sales

Let’s say a company wants to analyze the relationship between its monthly advertising expenses and monthly sales. It collects the following
data:

Month    Advertising (₹ in ‘000)    Sales (₹ in ‘000)
Jan      10                         100
Feb      15                         130
Mar      20                         160
Apr      25                         190
May      30                         220

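Using the above data and the Pearson formula, we get r = +1, which indicates a perfect positive correlation. The value is exactly +1 because, in this data, sales rise by ₹30 thousand for every ₹5 thousand increase in advertising, so all the points lie on one straight line.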
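
The arithmetic behind this result can be written out from the table. The deviations are taken about the means of the five months shown:

```latex
\bar{x} = \frac{100}{5} = 20, \qquad \bar{y} = \frac{800}{5} = 160

\sum (x - \bar{x})(y - \bar{y}) = (-10)(-60) + (-5)(-30) + (0)(0) + (5)(30) + (10)(60) = 1500

\sum (x - \bar{x})^2 = 100 + 25 + 0 + 25 + 100 = 250

\sum (y - \bar{y})^2 = 3600 + 900 + 0 + 900 + 3600 = 9000

r = \frac{1500}{\sqrt{250 \times 9000}} = \frac{1500}{1500} = 1
```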

Business Insight:

The company can confidently conclude that increasing advertising is associated with increased sales, and may decide to allocate more budget to
marketing in future months.

Limitations of Karl Pearson’s r

• Only measures linear relationships.


• Sensitive to outliers (extreme values can distort the result).
• Does not imply causation — correlation alone does not mean one variable causes the other.
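
The first two limitations can be seen in a small experiment. The sketch below uses made-up numbers chosen only for illustration: a U-shaped relationship where y depends entirely on x, and a perfect straight line to which one extreme point is added.

```python
import math

def pearson_r(x, y):
    """Karl Pearson's coefficient computed from deviations about the means."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

# Curved relationship: y = x squared, so y is fully determined by x.
x_curve = [-2, -1, 0, 1, 2]
y_curve = [xi ** 2 for xi in x_curve]
r_curve = pearson_r(x_curve, y_curve)

# Perfect straight line, then the same data with one extreme point added.
x_line = [1, 2, 3, 4, 5]
y_line = [2, 4, 6, 8, 10]
r_line = pearson_r(x_line, y_line)
r_outlier = pearson_r(x_line + [6], y_line + [0])

print(round(r_curve, 4), round(r_line, 4), round(r_outlier, 4))
```

For the curved data r comes out as zero even though the variables are clearly related, and the single extreme point pulls r from 1 down to about 0.14.
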
Conclusion

Karl Pearson’s coefficient of correlation is a powerful tool to measure the direction and strength of a linear relationship between two variables.
It is widely used in business analysis, marketing research, financial forecasting, and operations planning.

Understanding and applying this concept enables students to make data-informed decisions in real-world business contexts. In the next class, we
will study another method of correlation analysis called Spearman’s Rank Correlation, which is suitable for ranked or ordinal data.
Explanatory Note: Module 2 – Class 4, Class 5
Topic: Spearman’s Rank Correlation
Course: Business Statistics | BBA – 1st Semester

Introduction

In the previous class, we studied Karl Pearson’s coefficient of correlation, which measures the strength and direction of a linear relationship
between two quantitative variables. However, in many business and social science situations, data may not be numerical or may not follow a
linear pattern. Instead, the data may be qualitative in nature, or involve rankings, such as customer satisfaction, preferences, or employee
performance.

To analyze relationships between such ranked or ordinal data, we use Spearman’s Rank Correlation, developed by British psychologist
Charles Spearman. It is a non-parametric (distribution-free) measure, meaning it does not assume any specific distribution of the data.

What is Spearman’s Rank Correlation?

Spearman’s Rank Correlation Coefficient, denoted by ρ (rho) or rₛ, is a statistical measure of the strength and direction of the monotonic
relationship between two ranked variables.

A monotonic relationship means that as the value of one variable increases, the value of the other variable either consistently increases or
consistently decreases, though not necessarily at a constant rate.

Key Features:

• Based on ranks, not actual values


• Suitable for ordinal or non-linear data
• Less affected by outliers than Pearson’s correlation
• Value of ρ lies between –1 and +1
Formula for Spearman’s Rank Correlation

If there are no tied ranks, the formula is:

$$\rho = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$$

Where:

• $d$ = difference between the ranks of each pair
• $d^2$ = square of the difference
• $n$ = number of observations (pairs of ranks)

If there are tied ranks, then average ranks are used, and a more complex formula is applied, but the basic idea remains the same.
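
Average ranks work as follows: if two observations tie for positions 2 and 3, each receives rank 2.5. The sketch below shows only this ranking step, with the highest value ranked 1; the function name `average_ranks` and the scores are hypothetical. One common way to finish the calculation with ties is to apply Pearson's formula to the two sets of average ranks.

```python
def average_ranks(values):
    """Rank values with the highest value as rank 1; ties share the average rank."""
    ordered = sorted(values, reverse=True)
    ranks = []
    for v in values:
        first = ordered.index(v) + 1             # first position held by this value
        count = ordered.count(v)                 # number of observations tied at this value
        ranks.append(first + (count - 1) / 2)    # average of the positions they occupy
    return ranks

scores = [50, 40, 40, 30]        # hypothetical scores with one tie
print(average_ranks(scores))     # [1.0, 2.5, 2.5, 4.0]
```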

Steps to Calculate Spearman’s Rank Correlation

1. Assign Ranks to the two sets of data. The highest value gets rank 1, the next highest gets rank 2, and so on.
2. Calculate the difference (d) between the ranks of each pair.
3. Square each difference to get $d^2$.
4. Sum all values of $d^2$.
5. Apply the formula to calculate ρ.
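
The steps above can be sketched in code, starting from raw scores rather than ready-made ranks. This is a minimal illustration that assumes there are no tied values; the function names and the interview and test scores are hypothetical.

```python
def ranks_desc(values):
    """Rank values so that the highest value gets rank 1 (assumes no ties)."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]

def spearman_rho(x, y):
    rank_x = ranks_desc(x)                              # Step 1: assign ranks
    rank_y = ranks_desc(y)
    d = [rx - ry for rx, ry in zip(rank_x, rank_y)]     # Step 2: rank differences
    sum_d2 = sum(di ** 2 for di in d)                   # Steps 3 and 4: square and sum
    n = len(x)
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))        # Step 5: apply the formula

interview_score = [86, 72, 91, 65, 78]   # hypothetical data
test_score = [80, 70, 88, 75, 60]        # hypothetical data

rho = spearman_rho(interview_score, test_score)
print(round(rho, 2))  # 0.6
```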

Interpretation of ρ

ρ Value             Interpretation
+1                  Perfect positive rank correlation
Between 0 and +1    Positive correlation (strong/weak)
0                   No correlation
Between 0 and –1    Negative correlation (strong/weak)
–1                  Perfect negative rank correlation
When to Use Spearman’s Rank Correlation

Use Spearman’s Rank Correlation when:

• Data is in ranked form (e.g., survey rankings, preference ratings).


• Variables are ordinal.
• The relationship is not linear.
• There are extreme values (outliers) in the dataset.
• You want a non-parametric alternative to Pearson’s method.

Real-Life Business Example: Customer Satisfaction & Loyalty

A company wants to study the relationship between customer satisfaction and brand loyalty. It surveys 8 customers and ranks their
satisfaction level (1 = highest) and loyalty level (1 = most loyal). The data is shown below:

Customer    Satisfaction Rank (X)    Loyalty Rank (Y)    d = X – Y    d²
A           1                        2                   –1           1
B           2                        1                   1            1
C           3                        4                   –1           1
D           4                        3                   1            1
E           5                        5                   0            0
F           6                        6                   0            0
G           7                        8                   –1           1
H           8                        7                   1            1
Total                                                                 ∑d² = 6

Now, we apply the formula:

$$\rho = 1 - \frac{6 \sum d^2}{n(n^2 - 1)} = 1 - \frac{6 \times 6}{8(8^2 - 1)} = 1 - \frac{36}{8 \times 63} = 1 - \frac{36}{504} = 1 - 0.0714 = 0.9286$$
Interpretation:

ρ ≈ +0.93, indicating a very strong positive rank correlation. This means that customers who are more satisfied tend to be more loyal to the
brand. The company can use this insight to prioritize customer satisfaction as a way to increase customer loyalty.
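
The calculation for the eight customers can be reproduced directly from the two rank columns. This is a short sketch in plain Python; the variable names are chosen here for illustration.

```python
satisfaction_rank = [1, 2, 3, 4, 5, 6, 7, 8]   # customers A to H
loyalty_rank = [2, 1, 4, 3, 5, 6, 8, 7]        # customers A to H

# Squared rank difference for each customer.
d_squared = [(x - y) ** 2 for x, y in zip(satisfaction_rank, loyalty_rank)]
sum_d2 = sum(d_squared)
n = len(satisfaction_rank)

# Spearman's formula for data without tied ranks.
rho = 1 - (6 * sum_d2) / (n * (n ** 2 - 1))

print(sum_d2)          # 6
print(round(rho, 4))   # 0.9286
```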

Advantages of Spearman’s Rank Correlation

• Easy to understand and calculate


• Works well with ordinal and ranked data
• Less affected by outliers than Pearson’s r, and does not require normally distributed data
• Useful for non-linear but monotonic relationships
• Can be used with small sample sizes

Limitations

• Not suitable for interval or ratio scale data when the actual values matter
• Less precise than Pearson’s r for linear data
• Ranking can sometimes lead to loss of information, especially in large datasets
• Tied ranks can make the calculation slightly more complex

Applications in Business

Spearman’s rank correlation is widely used in various fields, especially where data is ranked or ordinal in nature:

Business Area        Use of Rank Correlation
Marketing            Relating customer satisfaction and loyalty
HR Management        Comparing employee performance and supervisor ratings
Consumer Research    Linking product features and customer preference ranks
Finance              Studying relationship between analyst ratings and stock performance
Sales Analysis       Ranking products by demand and supply issues
Conclusion

Spearman’s Rank Correlation is a simple yet powerful statistical tool to analyze the relationship between ranked variables. It is particularly
useful in business and social science contexts where qualitative judgments or preferences are ranked instead of being measured numerically.
