Notes
1. Hypothesis Testing
Hypothesis testing is a statistical method used to make decisions or inferences about a population
based on sample data. It involves the following key steps:
1. Formulating Hypotheses:
o Null Hypothesis ($H_0$): A statement of no effect or no difference. It
assumes the population parameter equals a specific value.
o Alternative Hypothesis ($H_1$): A statement contradicting the null
hypothesis. It represents what the researcher aims to prove.
2. Setting the Significance Level ($\alpha$):
o Commonly used values are $0.05$ or $0.01$, representing the
probability of rejecting $H_0$ when it is true.
3. Calculating the Test Statistic:
o Depends on the type of test and sample data (e.g., $t$-test, $Z$-test).
4. Making a Decision:
o Compare the test statistic to a critical value, or compare the $p$-value to
$\alpha$, to reject or fail to reject $H_0$.
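The four steps above can be sketched in Python for one common case. This is a minimal illustration, assuming a two-sided one-sample $Z$-test with known population standard deviation and the standard statistic $Z = (\bar{x} - \mu_0) / (\sigma / \sqrt{n})$; the function name `z_test` and all numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-sided one-sample Z-test, assuming the population sigma is known."""
    # Step 3: test statistic.
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Step 4a: p-value approach (probability of a value at least this extreme).
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Step 4b: critical-value approach.
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    reject_h0 = p_value < alpha
    return z, p_value, critical, reject_h0


# Hypothetical data: H0 says the mean is 50; the sample of 25 has mean 52.
z, p_value, critical, reject_h0 = z_test(sample_mean=52.0, mu0=50.0, sigma=5.0, n=25)
print(f"z = {z:.3f}, p = {p_value:.4f}, critical = {critical:.3f}, reject H0: {reject_h0}")
```

Both approaches lead to the same decision: rejecting when the $p$-value is below $\alpha$ is equivalent to rejecting when $|Z|$ exceeds the critical value.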
c. $F$-Test
$Z$-Test:
Where:
$t$-Test:
Where:
$F$-Test:
$F = \frac{s_1^2}{s_2^2}$
Where:
Chi-Square Test:
Where:
Unit 4
1. Theory of Probability
Basic Terms
1. Experiment: A procedure that produces outcomes (e.g., rolling a die).
2. Sample Space ($S$): The set of all possible outcomes.
3. Event ($A$): A subset of the sample space.
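These three terms map directly onto Python sets. The sketch below assumes equally likely outcomes, so that the probability of an event is the number of favourable outcomes divided by the size of the sample space; the helper name `classical_probability` is hypothetical.

```python
from fractions import Fraction

# Experiment: rolling a fair six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}

# Event: the roll is even. An event is a subset of the sample space.
event_even = {outcome for outcome in sample_space if outcome % 2 == 0}


def classical_probability(event, space):
    """P(A) = |A| / |S|, valid only when all outcomes are equally likely."""
    if not event <= space:
        raise ValueError("event must be a subset of the sample space")
    return Fraction(len(event), len(space))


p_even = classical_probability(event_even, sample_space)
print(p_even)  # 1/2
```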
Where:
Axioms of Probability
Where:
Probability distributions describe how probabilities are distributed over values of a random
variable.
a. Binomial Distribution
Used when there are two possible outcomes (success or failure) in a fixed number of
independent trials.
Where:
b. Poisson Distribution
Used for modeling the number of events occurring in a fixed interval of time or space.
$P(X = k) = \frac{e^{-\lambda} \lambda^k}{k!}$
Where:
c. Normal Distribution
A continuous probability distribution used to model many natural phenomena. Its probability
density function (PDF) is given by:
Where:
Key Properties:
5. Applications
1. Binomial Distribution:
o Modeling success/failure experiments (e.g., coin flips, defect detection).
2. Poisson Distribution:
o Modeling rare events (e.g., number of phone calls per hour).
3. Normal Distribution:
o Modeling natural phenomena (e.g., heights, weights, test scores).
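One example of each application can be sketched with the standard library alone. The Poisson function follows the formula given earlier; the binomial and normal functions use the standard textbook forms, which is an assumption about the formulas these notes intend. All function names and inputs are hypothetical.

```python
from math import comb, exp, factorial, pi, sqrt


def binomial_pmf(k, n, p):
    """P(X = k) for k successes in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(X = k) when events occur at an average rate lam per interval."""
    return exp(-lam) * lam**k / factorial(k)


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * sqrt(2 * pi))


p_heads = binomial_pmf(3, 10, 0.5)   # exactly 3 heads in 10 fair coin flips
p_calls = poisson_pmf(2, 3.0)        # exactly 2 calls in an hour, average 3 per hour
density = normal_pdf(170, 170, 10)   # density at the mean for heights ~ N(170, 10^2)

print(round(p_heads, 4), round(p_calls, 4), round(density, 4))
```

Note that `normal_pdf` returns a density, not a probability; probabilities for a continuous variable come from areas under the curve.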
Unit 3
1. Correlation Analysis
Correlation measures the strength and direction of a linear relationship between two variables.
Used for ordinal data or when the relationship between variables is not linear. It measures the
degree of association between two ranked variables.
Where:
Key Features:
A widely used measure for linear correlation between two continuous variables.
Where:
Alternatively:
Properties:
1. $r$ lies between $-1$ and $1$.
2. $r > 0$: Positive correlation.
3. $r < 0$: Negative correlation.
4. $r = 0$: No linear correlation.
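These properties can be checked numerically. The sketch below assumes the usual deviation form of Pearson's coefficient, $r = \sum (x - \bar{x})(y - \bar{y}) / \sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}$; the function name `pearson_r` and the data are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


hours = [1, 2, 3, 4, 5]    # hypothetical study hours
scores = [2, 4, 5, 4, 5]   # hypothetical test scores

r = pearson_r(hours, scores)
r_perfect_negative = pearson_r([1, 2, 3], [3, 2, 1])
print(round(r, 4), r_perfect_negative)
```

The first pair gives a positive $r$ below 1, and the perfectly reversed pair gives $r = -1$, the lower bound.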
2. Regression Analysis
Regression analysis estimates the relationship between dependent and independent variables.
$Y = a + bX$
Where:
$X = c + dY$
1. The slope $b$ indicates the direction of the relationship and the change in $Y$ per unit change in $X$.
2. The intercept $a$ is the value of $Y$ when $X = 0$.
3. Goodness of fit can be evaluated using the coefficient of determination ($R^2$).
1. Correlation:
o Measures the strength and direction of the relationship.
o Does not differentiate between dependent and independent variables.
2. Regression:
o Provides a functional relationship between variables.
o Differentiates between dependent and independent variables.
3. If $r = 0$, there is no linear relationship, and the regression line will be horizontal
(slope = 0).
4. The coefficient of determination ($R^2$):
$R^2 = r^2$
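The identity between $R^2$ and $r^2$ can be verified numerically for a regression of $Y$ on a single $X$. In this sketch, $R^2$ is computed from the residuals of a least-squares line and compared with the squared correlation; the function names and data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for Y = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def r_squared(xs, ys, a, b):
    """R^2 = 1 - (residual sum of squares) / (total sum of squares)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def correlation_squared(xs, ys):
    """Square of Pearson's r, computed from sums of deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy**2 / (sxx * syy)


xs = [1, 2, 3, 4, 5]   # hypothetical independent variable
ys = [2, 4, 5, 4, 5]   # hypothetical dependent variable

a, b = fit_line(xs, ys)
r2_from_fit = r_squared(xs, ys, a, b)
r2_from_corr = correlation_squared(xs, ys)
print(round(a, 3), round(b, 3), round(r2_from_fit, 3), round(r2_from_corr, 3))
```

For this data both routes give 0.6, meaning the line explains 60% of the variation in $Y$.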
Summary Table
Unit 2
Concept
Time series analysis involves studying data points collected or recorded at specific time
intervals. It helps identify patterns, trends, and seasonality to make forecasts and informed
decisions.
1. Additive Model:
The observed value at any time (Y) is expressed as the sum of its components:
Where:
Trend Analysis
1. Linear Equation:
$Y = a + bX$
Where:
Steps:
Determine the values of $a$ and $b$ using the formulas:
$a = \frac{\sum Y - b \sum X}{n}, \quad b = \frac{n\sum(XY) - \sum X \sum Y}{n\sum X^2 - (\sum X)^2}$
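The two formulas translate directly into code. The following sketch fits a linear trend to a short series and extends it one period ahead; the function name `linear_trend`, the period coding 1 to 5, and the sales figures are hypothetical.

```python
def linear_trend(xs, ys):
    """Return (a, b) for the trend line Y = a + bX using the sum formulas."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # b must be computed first because the formula for a depends on it.
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x**2)
    a = (sum_y - b * sum_x) / n
    return a, b


periods = [1, 2, 3, 4, 5]        # time coded as consecutive integers
sales = [10, 12, 14, 16, 18]     # hypothetical observed values

a, b = linear_trend(periods, sales)
forecast = a + b * 6             # trend value for the next period
print(a, b, forecast)            # 8.0 2.0 20.0
```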
Index Numbers
Meaning
Index numbers measure relative changes in variables over time, helping compare different
periods or places.
Uses the previous year as the base year for each calculation.
Formula: $CI = \frac{\text{Price in Current Year}}{\text{Price in Previous Year}} \times 100$
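Applied to a price series, the formula produces one index value per year after the first, each measured against the year before it. A minimal sketch, with a hypothetical function name and hypothetical prices:

```python
def previous_year_indices(prices):
    """Index of each year's price relative to the preceding year (previous year = 100)."""
    return [
        current / previous * 100
        for previous, current in zip(prices, prices[1:])
    ]


prices = [100.0, 110.0, 121.0]   # hypothetical prices for three consecutive years
indices = previous_year_indices(prices)
print([round(value, 2) for value in indices])
```

Each value here is about 110, showing a 10% rise over the previous year in both periods even though the absolute price increase differs.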
Quantity Index:
1. Laspeyres and Paasche formulas can also calculate quantity indices by interchanging $P$
with $Q$.
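The interchange can be made concrete with a single helper that is called with prices and quantities swapped. This sketch assumes the standard definitions: Laspeyres uses base-period weights and Paasche uses current-period weights. The function name `weighted_index` and the data are hypothetical.

```python
def weighted_index(base, current, weights):
    """sum(current * weights) / sum(base * weights) * 100."""
    numerator = sum(c * w for c, w in zip(current, weights))
    denominator = sum(b * w for b, w in zip(base, weights))
    return numerator / denominator * 100


# Hypothetical data for two commodities.
p0, p1 = [10, 20], [12, 22]   # base-year and current-year prices
q0, q1 = [5, 3], [6, 4]       # base-year and current-year quantities

# Price indices: prices vary, quantities are the weights.
laspeyres_price = weighted_index(p0, p1, q0)      # base-year quantity weights
paasche_price = weighted_index(p0, p1, q1)        # current-year quantity weights

# Quantity indices: the same formulas with P and Q interchanged.
laspeyres_quantity = weighted_index(q0, q1, p0)   # base-year price weights
paasche_quantity = weighted_index(q0, q1, p1)     # current-year price weights

print(round(laspeyres_price, 2), round(paasche_price, 2))
print(round(laspeyres_quantity, 2), round(paasche_quantity, 2))
```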
Unit 1
Statistics
Meaning
Statistics is the study of collecting, organizing, analyzing, interpreting, and presenting numerical
data to make informed decisions.
Scope of Statistics
1. Descriptive Statistics: Summarizes and describes the main features of a dataset using
measures like mean, median, and standard deviation.
2. Inferential Statistics: Makes predictions or inferences about a population based on a
sample.
3. Applications:
o Business: Sales forecasting, quality control
o Economics: Inflation, GDP analysis
o Healthcare: Medical research, patient statistics
o Social Sciences: Survey analysis
Types of Statistics
1. Descriptive Statistics:
o Central tendency measures (mean, median, mode)
o Measures of dispersion (range, standard deviation)
2. Inferential Statistics:
o Hypothesis testing
o Regression and correlation
o Probability distributions
Functions of Statistics
Limitations of Statistics
2. Median
3. Mode
4. Quartiles
Measures of Dispersion
1. Range
2. Interquartile Range (IQR)
Difference between the third quartile (Q3) and first quartile (Q1).
$\text{IQR} = Q3 - Q1$
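A short sketch of the calculation using Python's `statistics.quantiles`. Textbooks differ on how quartiles are interpolated, so the `method="inclusive"` setting is an assumption; the data are hypothetical.

```python
from statistics import quantiles

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]   # hypothetical observations

# n=4 splits the data into quarters and returns the three cut points Q1, Q2, Q3.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

print(q1, q2, q3, iqr)  # 3.0 5.0 7.0 4.0
```

The IQR covers the middle half of the data, so it is not affected by extreme values at either end.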
3. Mean Deviation
5. Variance
1. Skewness
2. Kurtosis
Measures the "tailedness" or sharpness of the peak of the data distribution.
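Both shape measures can be sketched from central moments. The code below assumes the moment-based definitions, skewness $= m_3 / m_2^{3/2}$ and kurtosis $= m_4 / m_2^2$, where $m_k$ is the $k$-th central moment; other definitions (for example Pearson's or Bowley's skewness) exist and may be the ones intended. Function names and data are hypothetical.

```python
def central_moment(data, k):
    """k-th central moment: the mean of (x - mean)^k."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** k for x in data) / n


def skewness(data):
    """Moment coefficient of skewness: 0 for a symmetric distribution."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5


def kurtosis(data):
    """Moment coefficient of kurtosis: 3 for a normal distribution."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2


symmetric = [1, 2, 3, 4, 5]       # hypothetical symmetric data
right_skewed = [1, 2, 3, 4, 10]   # hypothetical data with a long right tail

print(skewness(symmetric), round(kurtosis(symmetric), 3))
print(round(skewness(right_skewed), 3))
```

The symmetric sample has zero skewness, while the sample with one large value has positive skewness.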