Dsbda Unit2

The document explains key statistical concepts, including hypothesis testing, degree of freedom, skewness, kurtosis, and the Chi-square Goodness of Fit test. It outlines the steps for hypothesis testing, the significance of skewness and kurtosis in data analysis, and provides definitions and examples for various statistical measures and tests. Additionally, it differentiates between population and sample, and describes one-tailed and two-tailed t-tests with examples.

Uploaded by

surajjadhav3600


Q 1) How does hypothesis testing work? Explain the steps.

Hypothesis testing is a statistical method for deciding whether a sample provides enough evidence to infer that a certain condition is true for the entire population. In other words, it uses sample data to check a claim about the population and to decide whether to reject or fail to reject an assumption.
Steps of Hypothesis Testing:
1. State the Hypothesis
o Null Hypothesis (H₀): This is the default assumption (for example, no effect or no difference).
o Alternative Hypothesis (H₁ or Ha): This is the claim you are looking for evidence to support.
2. Set a Significance Level (α)
o This is the probability of rejecting H₀ when it is actually true.
o Common values are 0.05 (5%) or 0.01 (1%).
3. Collect & Analyze Data
o Gather sample data from experiments or observations and perform
the analysis on it.
4. Calculate the Test Statistic.
o Choose a statistical test (e.g., t-test, chi-square test) depending on
data type.
o The test statistic measures how much the sample data differs from
H₀.
5. Find the p-value & Make a Decision
o The p-value is the probability of getting a result at least as extreme as the observed one if H₀ were true.
o If p-value ≤ α, reject H₀ (the data support H₁).
o If p-value > α, fail to reject H₀ (not enough evidence for H₁).
6. Draw a Conclusion
o Based on the decision, conclude whether the claim is statistically
significant.
Example:
Imagine a company claims its new battery lasts 10 hours on average, so H₀: mean = 10 hours and H₁: mean < 10 hours. A test on a sample shows an average of 9.5 hours. If the resulting p-value is below 0.05, we reject H₀ and conclude that the battery lasts less than 10 hours on average.
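The battery example can be sketched in Python. This is only an illustration: the sample size (36) and the standard deviation (1.2 hours) are assumed values that do not come from the notes. To keep the sketch within the standard library it uses a z-test (population standard deviation treated as known) rather than a t-test.

```python
import math
from statistics import NormalDist

# Hypothetical numbers, chosen only for illustration.
claimed_mean = 10.0   # H0: mean battery life is 10 hours
sample_mean = 9.5     # sample average from the example
known_sd = 1.2        # ASSUMED population standard deviation (hours)
n = 36                # ASSUMED sample size
alpha = 0.05          # significance level

# Test statistic: how many standard errors the sample mean is from H0
standard_error = known_sd / math.sqrt(n)
z = (sample_mean - claimed_mean) / standard_error

# Left-tailed p-value, because H1 says the mean is LESS than 10 hours
p_value = NormalDist().cdf(z)

decision = "reject H0" if p_value <= alpha else "fail to reject H0"
print(f"z = {z:.2f}, p-value = {p_value:.4f}, decision: {decision}")
```

With these assumed numbers the sample mean lies 2.5 standard errors below the claimed mean, so the p-value is well under 0.05 and H₀ is rejected.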
Q 2) What is Degree of Freedom? Explain with an example.
1. Degree of Freedom (DOF) is the number of independent values in a statistical calculation that are free to vary while estimating a parameter.
2. In simple terms, it represents the number of choices or free variables available after certain constraints are applied.
3. It is widely used in hypothesis testing, especially in the Chi-square test and regression analysis.
Formula for Degree of Freedom:
DOF = n − k
Where:
• n = Total number of observations (sample size)
• k = Number of estimated parameters or constraints
Example:
• Imagine you have 5 numbers, and their average is 10.
• Let's say four of the numbers are 8, 9, 11, and 12.
• To keep the average 10, the fifth number must be 10 (since (8+9+11+12+X)/5 = 10).
• Here, you were free to choose 4 numbers, but the 5th number was fixed based on the average.
• So, the degree of freedom = Total numbers − 1 = 5 − 1 = 4.
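The same arithmetic can be checked with a few lines of Python. The variable names are illustrative only.

```python
# Four values chosen freely, plus one constraint: the mean of all 5 must be 10.
known_values = [8, 9, 11, 12]
n = 5
target_mean = 10

# The constraint fixes the total, so the last value is no longer free.
fifth_value = n * target_mean - sum(known_values)

# One constraint (the mean) was imposed, so k = 1.
k = 1
dof = n - k

print(f"fifth value = {fifth_value}, degrees of freedom = {dof}")
```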

Q 3) Explain skewness and kurtosis. What is the purpose of finding the skewness of data?

What is Skewness?

Skewness tells us how data is distributed: whether it is symmetrical or tilted more to one side.

• If skewness = 0, the data is symmetrical (like a bell curve).
• If skewness > 0 (positive skew), the data has a long tail on the right (higher values are more spread out).
• If skewness < 0 (negative skew), the data has a long tail on the left (lower values are more spread out).
Example of Skewness:

• If most students score 70-80 in an exam but a few score 100+, the data is positively skewed.
• If most students score 50-60, but a few score below 30, the data is negatively skewed.

What is Kurtosis?

Kurtosis measures how "sharp" or "flat" the peak of the data distribution is, and how heavy its tails are, compared to a normal distribution.

• High kurtosis (>3): Tall and sharp peak with heavy tails (many values are close to the mean, but extreme values are more likely than in a normal distribution).
• Low kurtosis (<3): Flatter peak with light tails (values are more evenly spread out, with fewer extreme values).
• Normal kurtosis (=3): Similar to a normal bell curve.

Example of Kurtosis:

• A class where most students score around 75, but a few score extremely high or extremely low, has high kurtosis (sharp peak, heavy tails).
• A class where scores are widely and evenly spread out with no clear peak has low kurtosis (flat).
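Both measures can be computed from central moments. The sketch below assumes the population (moment-based) definitions, skewness = m₃ / m₂^1.5 and kurtosis = m₄ / m₂², which match the "normal = 3" convention used above. The score lists are made-up illustration data.

```python
def central_moment(data, k):
    """k-th central moment: average of (x - mean)^k."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** k for x in data) / len(data)

def skewness(data):
    """Moment-based skewness: m3 / m2^1.5."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def kurtosis(data):
    """Moment-based kurtosis: m4 / m2^2 (a normal distribution gives 3)."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2

# Hypothetical data sets for illustration
symmetric = [1, 2, 3, 4, 5]
right_skewed = [70, 72, 75, 75, 78, 80, 100]   # one unusually high score
left_skewed = [25, 50, 52, 55, 55, 58, 60]     # one unusually low score

print("symmetric   :", skewness(symmetric), kurtosis(symmetric))
print("right-skewed:", skewness(right_skewed))
print("left-skewed :", skewness(left_skewed))
```

The symmetric list has skewness 0, the list with one high score has positive skewness, and the list with one low score has negative skewness.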

Why Find Skewness? (Purpose)

1. Understand Data Distribution – Helps to see if data is balanced or has extreme values.
2. Make Better Predictions – Helps in statistical models and machine
learning.
3. Choose the Right Test – Some statistical tests require normally distributed
data.
4. Detect Bias in Data – Helps in finance, business, and research to avoid
misleading conclusions.
Q 4) Describe Chi-square Goodness of Fit test.

The Chi-Square Goodness of Fit test checks whether observed data matches
what we expected based on some assumption.

It is a statistical test used to determine whether a sample data set fits a specific
theoretical distribution.

It helps in analyzing differences in categorical data frequencies.

Formula for Chi-Square Goodness of Fit Test:

$\chi^2 = \sum \frac{(O - E)^2}{E}$

Where:

• χ² = Chi-square test statistic
• O = Observed frequency
• E = Expected frequency
• ∑ = Summation over all categories

Steps for Chi-Square Goodness of Fit Test

1. Define Hypothesis
o Null Hypothesis (H₀): The observed data follows the expected distribution.
o Alternative Hypothesis (H₁): The observed data does not follow the expected distribution.
2. Collect and Organize Data
o Identify observed (O) and expected (E) frequencies for each category.
3. Apply the Chi-Square Formula
o Use the formula $\chi^2 = \sum (O - E)^2 / E$ to calculate the Chi-square test statistic.
4. Find the Critical Value
o Use the Chi-square table based on the significance level (α) and degrees of freedom (df = n − 1, where n is the number of categories).
5. Compare & Make Decision
o If χ² is greater than the critical value → Reject H₀ (data does not fit the expected distribution).
o If χ² is less than or equal to the critical value → Fail to reject H₀ (no evidence that the data departs from the expected distribution).
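The steps above can be sketched in Python for a simple case: testing whether a die is fair. The observed counts are hypothetical. The critical value 11.070 is the standard Chi-square table entry for α = 0.05 and df = 5, hard-coded here because the standard library has no Chi-square distribution.

```python
# Hypothetical data: a die rolled 60 times
observed = [8, 9, 12, 11, 6, 14]          # O: counts for faces 1..6
expected = [sum(observed) / 6] * 6        # E: a fair die gives 10 per face

# Step 3: chi-square statistic = sum of (O - E)^2 / E over all categories
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Step 4: degrees of freedom and critical value (from a chi-square table)
df = len(observed) - 1
critical_value = 11.070                   # alpha = 0.05, df = 5

# Step 5: compare and decide
decision = "reject H0" if chi_square > critical_value else "fail to reject H0"
print(f"chi-square = {chi_square:.2f}, df = {df}, decision: {decision}")
```

Here the statistic (4.2) is well below the critical value, so there is no evidence that the die is unfair.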
Q 5) List the measures of dispersion with their significance and mathematical formulae.
1. Absolute Measures of Dispersion
• These are expressed in the same unit as the data (variance is in squared units).
(i) Range (R)
• Definition: Difference between the maximum value and the minimum value in the dataset.
• Formula: $R = X_{\max} - X_{\min}$
• Significance:
1. Simple and easy to calculate.
2. Depends only on the two extreme values, so it ignores how the rest of the data is distributed.
(ii) Mean Deviation (MD)
• Definition: The average of the absolute deviations from a central value.
• Formula: $MD = \frac{1}{N} \sum_{i=1}^{N} |X_i - M|$
• Where:
o Xi = individual observations
o M = mean or median
o N = total number of observations
• Significance:
1. More useful than range as it considers every data point.
2. Uses absolute differences to avoid negative values canceling out.
(iii) Variance (σ²)
• Definition: Average of squared deviations from the mean.
• Formula: $\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (X_i - \mu)^2$, where μ is the mean.
• Significance:
o Provides a measure of spread around the mean.
o Squaring avoids negative deviations canceling positive ones.
(iv) Standard Deviation (σ)
• Definition: The square root of variance.
• Formula: $\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (X_i - \mu)^2}$
• Significance:
o Most commonly used measure of dispersion.
o Expressed in the same units as the data.
2. Relative Measure of Dispersion
(i) Coefficient of Variation (CV)
• Definition: Measures relative variability as a percentage.
• Formula: $CV = \frac{\sigma}{\mu} \times 100\%$
• Significance:
o Useful for comparing datasets with different units.
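A short Python sketch of these measures, using the standard `statistics` module. The data set is made up for illustration, and the population forms (dividing by N) are assumed for variance and standard deviation.

```python
from statistics import mean, pvariance, pstdev

# Hypothetical data set
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mu = mean(data)

# Range: maximum minus minimum
data_range = max(data) - min(data)

# Mean deviation about the mean: average absolute deviation
mean_deviation = sum(abs(x - mu) for x in data) / len(data)

# Population variance and standard deviation (divide by N)
variance = pvariance(data)
std_dev = pstdev(data)

# Coefficient of variation: standard deviation relative to the mean, in percent
cv = std_dev / mu * 100

print(f"range = {data_range}, MD = {mean_deviation}, "
      f"variance = {variance}, SD = {std_dev}, CV = {cv:.1f}%")
```

For a sample rather than a whole population, `statistics.variance` and `statistics.stdev` divide by N − 1 instead.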
Q 6) Write a short note on contingency table, explain with example.

Structure of a Contingency Table

A contingency table consists of:
• Rows: Represent one categorical variable.
• Columns: Represent another categorical variable.
• Cells: Show the frequency or count of occurrences for each combination of the two variables.
This table allows researchers to compute probabilities and to test for a relationship or dependence between the variables using statistical methods such as the Chi-square test.
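As an illustration, the sketch below builds a small contingency table in Python. The survey responses (gender and preferred drink) are entirely hypothetical and serve only to show the row, column, and cell structure.

```python
from collections import Counter

# Hypothetical survey: (gender, preferred drink) for 8 people
responses = [
    ("Male", "Tea"), ("Male", "Coffee"), ("Female", "Tea"), ("Female", "Tea"),
    ("Male", "Coffee"), ("Female", "Coffee"), ("Male", "Tea"), ("Female", "Tea"),
]

# Count how often each (row, column) combination occurs
counts = Counter(responses)
rows = sorted({gender for gender, _ in responses})
cols = sorted({drink for _, drink in responses})

# table[row][column] holds the cell frequency
table = {r: {c: counts[(r, c)] for c in cols} for r in rows}

# Print the table with row labels and column headers
print(f"{'':8}" + "".join(f"{c:>8}" for c in cols))
for r in rows:
    print(f"{r:8}" + "".join(f"{table[r][c]:>8}" for c in cols))
```

Each cell is the count for one combination, for example the number of female respondents who prefer tea.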
Q 7) With an example, explain Bayes' theorem. Also explain its key terms.

Key Terms in Bayes' Theorem

1. Prior Probability (P(B)) – The initial probability of an event before
considering new evidence.
2. Likelihood (P(A∣B)) – The probability of obtaining evidence given that an
event has occurred.
3. Posterior Probability (P(B∣A)) – The updated probability of an event after
considering new evidence.
4. Evidence (P(A)) – The total probability of the evidence occurring under all
possible conditions.
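With these terms, Bayes' theorem reads P(B∣A) = P(A∣B) · P(B) / P(A). A minimal worked example in Python follows. The scenario (a disease test) and all of its probabilities are assumed values for illustration, with B = "person has the disease" and A = "test is positive".

```python
# Hypothetical probabilities, chosen only for illustration
prior = 0.01                 # P(B): 1% of people have the disease
likelihood = 0.95            # P(A|B): test is positive if the person is ill
false_positive_rate = 0.05   # P(A|not B): test is positive if the person is healthy

# Evidence P(A): total probability of a positive test under all conditions
evidence = likelihood * prior + false_positive_rate * (1 - prior)

# Posterior P(B|A): updated probability of disease after a positive test
posterior = likelihood * prior / evidence

print(f"P(A) = {evidence:.4f}, P(B|A) = {posterior:.4f}")
```

Even with a fairly accurate test, the posterior here is only about 16%, because the prior probability of the disease is so low.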

Q 8) What is a population & how does it differ from a sample?

1. Population:
• A population is the entire group of individuals, items, or data that you want to study.
• It includes every possible member related to your study.
• Example: If you want to study the height of students in a school, then all students in the school form the population.
2. Sample:
• A sample is a smaller group selected from the population.
• It is used to study and draw conclusions about the entire population (because studying the whole population is difficult).
• Example: Instead of measuring all students in the school, you select 50 students randomly. This group of 50 students is a sample.
Key Differences:
Feature     | Population                     | Sample
----------- | ------------------------------ | -------------------------------------------
Definition  | The entire group being studied | A subset of the population
Size        | Large (can be millions)        | Smaller, manageable
Example     | All students in a school       | 50 students chosen from the school
Use         | Used for complete analysis     | Used to estimate population characteristics
Cost & Time | Expensive & time-consuming     | Cheaper & quicker
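The school example can be simulated in Python. The population here is synthetic: 1000 heights drawn from an assumed normal distribution (mean 160 cm, standard deviation 10 cm), purely to show how a sample mean estimates the population mean.

```python
import random
from statistics import mean

random.seed(42)  # fixed seed so the run is repeatable

# Population: heights (cm) of all 1000 students in a hypothetical school
population = [random.gauss(160, 10) for _ in range(1000)]

# Sample: 50 students chosen at random, without replacement
sample = random.sample(population, 50)

population_mean = mean(population)   # the true value (usually unknown)
sample_mean = mean(sample)           # the estimate from the sample

print(f"population mean = {population_mean:.1f} cm (N = {len(population)})")
print(f"sample mean     = {sample_mean:.1f} cm (n = {len(sample)})")
```

The sample mean will usually be close to, but not exactly equal to, the population mean. That gap is the sampling error.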

Q 9) With an example, explain one-tailed & two-tailed t-tests.

1. One-Tailed t-Test
• A one-tailed t-test is used when we want to check whether one group's mean is greater than the other, or whether it is smaller than the other, but not both.
• It tests in only one direction (greater or smaller), chosen before the data is analyzed.

Example:
A company claims that a new training program increases employee productivity. You perform a one-tailed t-test with:
• Null Hypothesis (H₀): The training does not increase productivity.
• Alternative Hypothesis (H₁): The training increases productivity.
If the test result is significant, the data support the claim that the training increased productivity. A one-tailed test is used because we are only checking for an increase, not a decrease.
2. Two-Tailed t-Test
• A two-tailed t-test checks both directions: whether one group's mean is greater or smaller than the other.
• It does not assume which direction the difference might be.
Example:
A teacher wants to know if boys and girls score differently in a math test. You perform a two-tailed t-test with:
• Null Hypothesis (H₀): Boys and girls score the same on average.
• Alternative Hypothesis (H₁): Boys and girls do not score the same on average (either group may be higher).
Since the test checks for both higher and lower scores, a two-tailed test is used.
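The difference between the two tests shows up in how the p-value is computed from the same test statistic. The sketch below uses a normal approximation instead of the t distribution (reasonable only for large samples), because the Python standard library has no t distribution. The test statistic 1.8 is an assumed value.

```python
from statistics import NormalDist

test_statistic = 1.8   # ASSUMED value of the test statistic
alpha = 0.05
standard_normal = NormalDist()

# One-tailed (H1: mean is greater): area in the right tail only
p_one_tailed = 1 - standard_normal.cdf(test_statistic)

# Two-tailed (H1: means differ): area in both tails
p_two_tailed = 2 * (1 - standard_normal.cdf(abs(test_statistic)))

for name, p in [("one-tailed", p_one_tailed), ("two-tailed", p_two_tailed)]:
    decision = "reject H0" if p <= alpha else "fail to reject H0"
    print(f"{name}: p-value = {p:.4f} -> {decision}")
```

With this statistic the one-tailed test rejects H₀ at α = 0.05 but the two-tailed test does not, which is why the direction of the test must be chosen before looking at the data.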
