MA262 23DIT010
MA262 Continuous Internal Evaluation
Problem Statement:
This practical aims to measure the strength and direction of the
relationship between two variables using correlation coefficients. The
implementation will include Karl Pearson’s correlation coefficient for
measuring linear relationships and Spearman’s rank correlation coefficient
for assessing monotonic relationships. Applying these methods gives insight
into the degree of association between the selected variables, aiding
data-driven decision-making.
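For reference, the two coefficients are commonly defined as below for n paired observations (x_i, y_i). The symbols r, ρ and d_i are notation assumed here for illustration and do not come from the original statement. The second formula is the shortcut form, which holds only when there are no tied ranks.

```latex
\[
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,
          \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
\]
\[
\rho = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)},
\qquad d_i = \operatorname{rank}(x_i) - \operatorname{rank}(y_i)
\]
```

When ranks are tied, ρ is instead obtained by applying the formula for r to the (average) ranks of x and y.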
The analysis will be conducted using at least two datasets, each representing
a real-world scenario where correlation analysis is beneficial.
• Variables: Study hours vs. Exam scores
• Purpose: Determine whether more study hours are associated with higher scores.
• Variables: Temperature vs. Electricity consumption
• Purpose: Evaluate how temperature fluctuations relate to power usage.
Code:
import numpy as np
import pandas as pd

# Generate 10 pairs of random integers in [0, 100) as sample data
data = pd.DataFrame(np.random.randint(0, 100, (10, 2)), columns=['x', 'y'])
print(data)
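import matplotlib.pyplot as plt
# Scatter plot to inspect the relationship between the two columns visually
plt.scatter(data['x'], data['y'])
plt.xlabel('x')
plt.ylabel('y')
plt.show()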
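from scipy.stats import spearmanr

# Pearson's r is the off-diagonal entry of the 2x2 correlation matrix
print("Karl Pearson’s Correlation Coefficient:", np.corrcoef(data['x'], data['y'])[0, 1])
# spearmanr returns the rank correlation and its p-value
correlation, p_value = spearmanr(data['x'], data['y'])
print("Spearman’s Rank Correlation Coefficient:", correlation)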
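As a cross-check on the library calls above, the following minimal sketch computes Pearson's coefficient directly from its definition and Spearman's coefficient as Pearson's coefficient of the ranks. The helper names `pearson_r` and `spearman_rho` and the fixed seed are assumptions introduced for illustration; they are not part of the original practical.

```python
import numpy as np
from scipy.stats import rankdata, spearmanr


def pearson_r(x, y):
    """Pearson's r computed directly from its definition (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))


def spearman_rho(x, y):
    """Spearman's rho as Pearson's r of the ranks; ties receive average ranks."""
    return pearson_r(rankdata(x), rankdata(y))


# Fixed seed so the run is repeatable (an assumption; the original uses no seed).
rng = np.random.default_rng(0)
x = rng.integers(0, 100, size=10)
y = rng.integers(0, 100, size=10)

scipy_rho, _ = spearmanr(x, y)
print("Pearson  (manual):", pearson_r(x, y))
print("Pearson  (NumPy): ", np.corrcoef(x, y)[0, 1])
print("Spearman (manual):", spearman_rho(x, y))
print("Spearman (SciPy): ", scipy_rho)
```

The manual and library values should agree up to floating-point rounding, which shows that Spearman's method is Pearson's method applied to ranks rather than raw values.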
Code Analysis:
The code generates two columns of random integers and checks whether
their values move together: both increasing, both decreasing, or one
increasing while the other decreases.
Karl Pearson’s and Spearman’s rank methods both quantify this pattern
and return a coefficient in [-1, 1]. A coefficient of 1 means the two
columns increase or decrease together; a coefficient of -1 means one
column increases as the other decreases; a coefficient near 0 means
there is little or no linear (Pearson) or monotonic (Spearman)
relationship.
These functions are used to find the relationship between two
columns. From the coefficient we can judge whether a linear or
monotonic relationship between the columns is present or not.
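The meaning of the extreme values, and the difference between the two methods, can be illustrated with a small sketch. The three relationships below are made-up examples chosen for illustration, not data from the practical.

```python
import numpy as np
from scipy.stats import spearmanr

x = np.arange(1, 11)  # 1, 2, ..., 10

# Illustrative relationships (assumed for this sketch only)
cases = {
    "y = 2x + 3 (both increase)": 2 * x + 3,
    "y = -x (one increases, the other decreases)": -x,
    "y = x**3 (monotonic but not linear)": x ** 3,
}

results = {}
for label, y in cases.items():
    pearson = np.corrcoef(x, y)[0, 1]
    spearman, _ = spearmanr(x, y)
    results[label] = (pearson, spearman)
    print(f"{label}: Pearson = {pearson:.3f}, Spearman = {spearman:.3f}")
```

In the third case Spearman's coefficient is 1 because the ranks agree perfectly, while Pearson's coefficient falls below 1 because the points do not lie on a straight line.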
Use cases:
Let’s take one example. Suppose we have a dataset of student CGPA
and placement package. A higher CGPA does not necessarily mean a
higher package. To check this, we compute Karl Pearson’s and
Spearman’s rank correlation coefficients, which show the strength and
direction of the relationship between CGPA and package.
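The CGPA vs. package example could be sketched as follows. The numbers, the column names `cgpa` and `package`, and the variable `students` are hypothetical values invented for illustration; they are not real placement data.

```python
import pandas as pd

# Hypothetical values for illustration only (package in arbitrary units)
students = pd.DataFrame({
    "cgpa":    [6.5, 7.0, 7.8, 8.1, 8.4, 8.9, 9.2, 9.5],
    "package": [4.0, 3.5, 5.0, 6.5, 5.5, 8.0, 7.5, 12.0],
})

# DataFrame.corr returns the full correlation matrix; pick the cgpa/package entry
pearson = students.corr(method="pearson").loc["cgpa", "package"]
spearman = students.corr(method="spearman").loc["cgpa", "package"]

print(f"Pearson:  {pearson:.3f}")
print(f"Spearman: {spearman:.3f}")
```

Both coefficients are positive for this made-up data, yet neither is exactly 1, because a few students with a lower CGPA have a higher package than students ranked just above them.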
In the same way, it can be applied to other examples, such as:
i. Student IQ vs Student CGPA.
ii. Laptop Battery Life vs Laptop Performance.
iii. City Pollution vs Number of Vehicles.
iv. Employee Experience vs Salary.
v. Temperature vs Ice Cream Sales.
vi. Age vs Blood Pressure.
vii. Rainfall vs Crop Yield.
viii. Social Media Usage vs Mental Productivity.
ix. Internet Speed vs Video Streaming Quality.
x. Airline Ticket Price vs Booking Time.
xi. Employee Training Hours vs Productivity.