Descriptive Statistics
Statistics
MODULE 2
Data – descriptive facts and figures collected, analyzed, and summarized for presentation and
interpretation.
Data transformation – in SPSS Variable View (tab at the bottom left), set the Values and Measure for each variable.
Creating Distributions from Data
SPSS – Frequency Tables
Frequency Table for Quantitative/Numerical Data – What numerical data can be grouped into
bins/categories?
◦ 1. Determine the number of nonoverlapping bins.
◦ 2. Determine the width of each bin: (largest data value – smallest data value) / number of bins
◦ 3. Determine the bin limits (upper and lower limit)
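The three steps above can be sketched in Python. This is only an illustration: the function name `frequency_table` and the sample `ages` data are made up for this example, and in practice the bin width is often rounded to a convenient number.

```python
def frequency_table(values, num_bins):
    """Group numerical values into equal-width, nonoverlapping bins."""
    lowest, highest = min(values), max(values)
    if num_bins < 1 or lowest == highest:
        raise ValueError("need at least one bin and some spread in the data")
    # Step 2: bin width = (largest - smallest) / number of bins
    width = (highest - lowest) / num_bins
    counts = [0] * num_bins
    for v in values:
        # The largest value would fall just past the last bin, so clamp it.
        index = min(int((v - lowest) / width), num_bins - 1)
        counts[index] += 1
    # Step 3: lower and upper limit of each bin, with its frequency
    return [(lowest + i * width, lowest + (i + 1) * width, counts[i])
            for i in range(num_bins)]


# Hypothetical data, for illustration only
ages = [12, 14, 19, 18, 15, 15, 18, 17, 20, 27,
        22, 23, 22, 21, 33, 28, 14, 18, 16, 13]

table = frequency_table(ages, num_bins=3)  # Step 1: choose 3 bins
for lower, upper, count in table:
    print(f"{lower:5.1f} to {upper:5.1f}: {count}")
```

Each value is counted in exactly one bin; a value that sits on a boundary goes to the higher bin, except the maximum, which stays in the last bin.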
Variance (s²) – variability based on the deviations from the mean: s² = ∑(xᵢ – x̄)² / (n – 1)
(unbiased estimate of the population variance)
Standard Deviation (s) = √s² (square root of the variance); measured in the same units as the
original data.
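As a worked illustration of the sample variance and standard deviation, here is a minimal Python sketch using the n – 1 denominator. The function names and the `data` values are hypothetical, chosen so the arithmetic is easy to follow by hand.

```python
import math


def sample_variance(xs):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(xs)
    if n < 2:
        raise ValueError("sample variance needs at least two values")
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


def sample_std_dev(xs):
    """Square root of the sample variance (same units as the data)."""
    return math.sqrt(sample_variance(xs))


# Hypothetical data: mean = 44, squared deviations = 4, 100, 4, 4, 144
data = [46, 54, 42, 46, 32]
print(sample_variance(data))  # 256 / 4 = 64.0
print(sample_std_dev(data))   # 8.0
```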
•Who is my audience?
Bubble Chart – used to visualize three variables in a single plot. Two variables set the position of
each bubble; the third is represented by the bubble's size (magnitude).
Measures of Shape
Normal Distribution
Detection of outliers.
It simplifies interpretation and makes it easier to draw meaningful conclusions from statistical
analyses.
Skewness – measure of symmetry or, more precisely, the lack of symmetry. A dataset is symmetric if it
looks the same to the left and to the right of the center point.
Pearson’s (second) coefficient of skewness = 3 × (Mean – Median) / Standard Deviation
- Between -0.5 and +0.5 (nearly symmetrical)
- Between -1 and -0.5 (negatively skewed); between +0.5 and +1 (positively skewed)
- Lower than -1 or higher than +1 (extremely skewed)
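The skewness rule of thumb can be tried out with a short Python sketch. It assumes Pearson's second coefficient of skewness, 3 × (mean – median) / standard deviation, which is the usual textbook form with the factor of 3. The function names and both datasets are hypothetical.

```python
import statistics


def pearson_skewness(xs):
    """Pearson's second coefficient: 3 * (mean - median) / sample std dev."""
    sd = statistics.stdev(xs)
    if sd == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return 3 * (statistics.mean(xs) - statistics.median(xs)) / sd


def describe_skew(sk):
    """Apply the rule-of-thumb cutoffs listed above."""
    if sk < -1 or sk > 1:
        return "extremely skewed"
    if sk < -0.5:
        return "negatively skewed"
    if sk > 0.5:
        return "positively skewed"
    return "nearly symmetrical"


# Hypothetical data, for illustration only
symmetric = [2, 3, 3, 4, 4, 4, 5, 5, 6]   # mean = median = 4
right_tail = [1, 2, 2, 3, 3, 4, 20]        # one large value pulls the mean up

for name, xs in [("symmetric", symmetric), ("right_tail", right_tail)]:
    sk = pearson_skewness(xs)
    print(f"{name}: {sk:.2f} ({describe_skew(sk)})")
```

In the second dataset the mean (5) exceeds the median (3), so the coefficient is positive.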
Kurtosis – measure of whether the data are heavy-tailed or light-tailed relative to a normal distribution.
- Mesokurtic (tails similar to a normal distribution); Leptokurtic (heavier tails); Platykurtic (lighter tails)
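To make the three kurtosis types concrete, the sketch below computes excess kurtosis (kurtosis minus 3, so that a normal distribution scores about 0) with the simple moment-based formula. Statistical packages often apply a small-sample correction, so their reported values may differ. The function names, the `tol` cutoff, and both datasets are assumptions made for this illustration.

```python
def excess_kurtosis(xs):
    """Moment-based excess kurtosis: m4 / m2**2 - 3."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in xs) / n  # fourth central moment
    if m2 == 0:
        raise ValueError("kurtosis is undefined when all values are equal")
    return m4 / m2 ** 2 - 3


def kurtosis_type(k, tol=0.1):
    """Classify excess kurtosis; tol is an arbitrary illustrative cutoff."""
    if k > tol:
        return "leptokurtic"
    if k < -tol:
        return "platykurtic"
    return "mesokurtic"


# Hypothetical data, for illustration only
flat = [1, 2, 3, 4, 5]               # evenly spread, light tails
heavy_tails = [0] * 8 + [-10, 10]    # mostly central, two extreme values

for name, xs in [("flat", flat), ("heavy_tails", heavy_tails)]:
    k = excess_kurtosis(xs)
    print(f"{name}: {k:.2f} ({kurtosis_type(k)})")
```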
Measures of Associations
Nominal/Categorical Data
Crosstabulations/Contingency Tables
• The chi-square statistic is a measure of the difference between the observed and expected
frequencies. A larger chi-square value indicates a greater deviation from expected values.
• The null hypothesis assumes independence, and a significant chi-square test suggests that the
variables are not independent.
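The chi-square statistic for a crosstabulation can be computed by hand. In the sketch below, the expected frequency of each cell under independence is (row total × column total) / grand total, and the statistic sums (observed – expected)² / expected over all cells. The function name and both tables are hypothetical.

```python
def chi_square_statistic(observed):
    """Return (chi-square, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count if the two variables were independent
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (obs - expected) ** 2 / expected
    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, dof


# Hypothetical 2x2 tables (rows: group A/B, columns: yes/no)
related = [[20, 30],
           [30, 20]]
unrelated = [[10, 20],
             [20, 40]]

print(chi_square_statistic(related))    # every expected count is 25
print(chi_square_statistic(unrelated))  # observed equals expected
```

For the first table the statistic is 4.0 with 1 degree of freedom, which exceeds the 5% critical value of 3.841, so independence would be rejected at that level. The second table matches its expected counts exactly, giving a statistic of 0.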
Continuous/numerical variables
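- Covariance: the magnitude of the covariance is difficult to interpret because it depends on the units
of measurement. If the value is > 0, the variables are positively related; if < 0, they are negatively
related; if = 0, they are not linearly related.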
- Correlation coefficient (spreadsheet function): =CORREL(xdatarange;ydatarange)
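A short Python sketch shows why the magnitude of the covariance is hard to interpret while the correlation coefficient is not. The same relationship is measured twice, once with y rescaled into different units. The function names and data are hypothetical; the correlation here is Pearson's r, the covariance divided by the product of the two standard deviations.

```python
import math


def sample_covariance(xs, ys):
    """Sum of (x - mean_x) * (y - mean_y), divided by n - 1."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples of at least 2 values")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def pearson_correlation(xs, ys):
    """Covariance scaled by both standard deviations; always in [-1, 1]."""
    var_x = sample_covariance(xs, xs)
    var_y = sample_covariance(ys, ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return sample_covariance(xs, ys) / math.sqrt(var_x * var_y)


# Hypothetical data: y is measured first in one unit, then in a unit 100x smaller
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
y_rescaled = [v * 100 for v in y]

print(sample_covariance(x, y), pearson_correlation(x, y))
print(sample_covariance(x, y_rescaled), pearson_correlation(x, y_rescaled))
```

The covariance grows from 5 to 500 when the units change, but the correlation stays the same, which is why the correlation is the easier number to compare across variables.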
Summary Tables