DeMeasure of Central Tendency and Dispersion

Uploaded by

alyansandhu33

Measures of Central Tendency and Dispersion

Descriptive statistics summarize and describe the main features of a dataset. Key points include:

1. Measures of Central Tendency:

o Mean: Average value of the dataset.

o Median: Middle value when data is ordered.

o Mode: Most frequently occurring value.

2. Measures of Variability (Dispersion):

o Range: Difference between the maximum and minimum values.

o Variance: Measure of how data points differ from the mean.

o Standard Deviation: Square root of variance, showing average spread around the mean.

o Interquartile Range (IQR): Spread of the middle 50% of the data.

3. Measures of Distribution Shape:

o Skewness: Degree of asymmetry in data distribution.

o Kurtosis: Measure of the "tailedness" of the distribution.

4. Visual Representations:

o Histograms: Show frequency distribution.

o Boxplots: Display variability and detect outliers.

o Bar Charts and Pie Charts: Summarize categorical data.

5. Summarizing Categorical Data:

o Frequencies and proportions for each category.

6. Purpose:

o Provide an overview of the dataset.

o Highlight patterns, trends, and potential outliers.

o Serve as a foundation for inferential statistics.

Inferential Statistics

Inferential statistics use sample data to make generalizations, predictions, or decisions about a larger
population. Key points include:

Purpose: Draw conclusions beyond the immediate data.

Techniques:

Hypothesis Testing: Assess claims (e.g., t-tests, chi-square tests).

Confidence Intervals: Estimate population parameters.

Regression Analysis: Model relationships between variables.

Key Concept: Uses probability to account for uncertainty.

Comparison of Descriptive and Inferential Statistics

Aspect | Descriptive Statistics | Inferential Statistics
Definition | Summarizes and describes data characteristics. | Makes predictions or generalizations about a population from a sample.
Purpose | To provide a clear overview of the data. | To infer insights and test hypotheses about the larger population.
Scope | Limited to the data at hand. | Goes beyond the data to make broader conclusions.
Techniques | Mean, median, mode, variance, charts, and graphs. | Hypothesis testing, confidence intervals.
Data Requirement | Works with the entire dataset or sample. | Uses a sample to represent a population.
Uncertainty | No uncertainty involved; purely factual. | Involves uncertainty and probability estimates.
Examples | Calculating average sales for a month. | Predicting future sales based on

Data Distribution
Data distribution refers to how data values are spread across a range. Key points include:

Types of Distributions:

Normal Distribution: Symmetrical, bell-shaped curve.

Skewed Distribution: Asymmetrical; skewed left (negative) or right (positive).

Uniform Distribution: All values have equal frequency.

Bimodal/Multimodal: Two or more peaks in the data.

Key Features:

Center: Mean, median, mode.

Spread: Range, variance, standard deviation.

Shape: Symmetry, skewness, and kurtosis.

Importance:

Helps visualize patterns, trends, and outliers.

Aids in selecting appropriate statistical methods for analysis.

Frequency Distribution

Frequency distribution is a summary that shows how often each value or range of values occurs in a
dataset.

Key Points:

Components:

Class Intervals: Ranges of values.

Frequencies: Counts of occurrences in each range.

Types:

Tabular: Organized in a table.

Graphical: Represented as histograms, bar charts, or pie charts.

Purpose:

Simplifies large datasets.

Highlights patterns and trends.

Frequency Distribution In Intervals
Frequency distribution in intervals organizes data into non-overlapping ranges (intervals) and counts
the number of data points in each range.

Key Steps:

1. Determine the Range: Subtract the smallest value from the largest.

2. Choose the Number of Intervals: Typically 5-10, depending on data size.

3. Calculate Interval Width:

o Divide the range by the number of intervals.

o Adjust to a convenient number if needed.

4. Create Intervals: Ensure they are continuous and non-overlapping.

5. Count Frequencies: Tally how many data points fall within each interval.

Example:

For data: 5, 8, 12, 15, 18, 22, 25, 28, 30.

Interval Frequency

5–10 2

11–15 2

16–20 1

21–25 2

26–30 2
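
The counting step above can be sketched in a few lines of Python. This is only an illustration: the helper name `count_frequencies` is hypothetical, and the interval bounds are copied from the example table and treated as inclusive on both ends.

```python
# Data and inclusive interval bounds taken from the example table.
data = [5, 8, 12, 15, 18, 22, 25, 28, 30]
intervals = [(5, 10), (11, 15), (16, 20), (21, 25), (26, 30)]


def count_frequencies(values, bounds):
    """Count how many values fall in each inclusive (low, high) interval."""
    return {
        (low, high): sum(1 for v in values if low <= v <= high)
        for low, high in bounds
    }


frequencies = count_frequencies(data, intervals)
for (low, high), count in frequencies.items():
    print(f"{low}-{high}: {count}")
```

Because the intervals are continuous and non-overlapping, the frequencies should add up to the number of data points, which is a quick way to check a hand-built table.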
Central Tendency
Central tendency refers to the measure that identifies the center or typical value of a dataset. It
summarizes the data with a single value.

Key Measures:

1. Mean (Average):

o Sum of all values divided by the number of values.

o Sensitive to outliers.

2. Median:

o Middle value when data is sorted.

o If the number of values is even, take the average of the two middle ones.

o Resistant to outliers.

3. Mode:

o Most frequently occurring value(s) in the dataset.

o Useful for categorical data.

Importance:

• Provides a summary of the data's central point.

• Helps in understanding data distribution and comparison.

Mode
Mode is the value that occurs most frequently in a dataset.

Key Points:

1. Characteristics:

o There can be:

▪ No mode: If all values occur with equal frequency.

▪ Unimodal: One mode (single most frequent value).

▪ Bimodal: Two modes.

▪ Multimodal: More than two modes.

o Suitable for both quantitative and qualitative data.

2. Formula (Grouped Data):

$\text{Mode} = L + \left( \frac{f_m - f_1}{(f_m - f_1) + (f_m - f_2)} \right) \times h$

o $L$: Lower boundary of the modal class.

o $f_m$: Frequency of the modal class.

o $f_1$: Frequency of the class before the modal class.

o $f_2$: Frequency of the class after the modal class.

o $h$: Class width.

3. Advantages:

o Simple to calculate.

o Not influenced by extreme values.

4. Uses:

o Ideal for identifying the most common category or trend in data.
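
As a rough illustration of both cases, the sketch below uses the standard library's `statistics.multimode` for raw data and a hypothetical helper, `grouped_mode`, for the grouped-data formula. The class boundaries and frequencies are invented for the example, and the helper assumes equal class widths and treats a missing neighbouring class as having frequency 0.

```python
from statistics import multimode


def grouped_mode(lower_boundaries, frequencies, width):
    """Apply Mode = L + (f_m - f_1) / ((f_m - f_1) + (f_m - f_2)) * h."""
    # Index of the modal class (first class with the highest frequency).
    i = max(range(len(frequencies)), key=frequencies.__getitem__)
    f_m = frequencies[i]
    f_1 = frequencies[i - 1] if i > 0 else 0
    f_2 = frequencies[i + 1] if i < len(frequencies) - 1 else 0
    return lower_boundaries[i] + (f_m - f_1) / ((f_m - f_1) + (f_m - f_2)) * width


# Raw data: multimode returns every value tied for the highest count.
print(multimode([1, 2, 2, 3, 3]))  # bimodal data

# Hypothetical grouped data: classes 0-10, 10-20, 20-30, 30-40, 40-50.
lower = [0, 10, 20, 30, 40]
freq = [5, 8, 12, 7, 3]
estimate = grouped_mode(lower, freq, width=10)
print(round(estimate, 2))
```

Note that `multimode` returns all values when every value occurs equally often, whereas these notes describe that situation as "no mode"; which convention to follow is a matter of definition.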

Median
Median is the middle value of a dataset when the data is arranged in ascending or descending order.

Key Points:

1. How to Calculate:

o Odd Number of Values: The median is the middle value.

o Even Number of Values: The median is the average of the two middle values.

2. Steps:
1. Arrange data in order.

2. Identify the middle value(s).

3. Compute as needed.

3. Formula for Position:

$\text{Median Position} = \frac{n + 1}{2}$

o $n$: Total number of values.

4. Advantages:

o Not affected by outliers or extreme values.

o Represents the central location in skewed distributions.

5. Example:

o Dataset: 5, 8, 12, 15, 20

▪ Median = 12 (middle value).

o Dataset: 5, 8, 12, 15

▪ Median = $\frac{8 + 12}{2} = 10$.
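
The odd and even cases can be checked with a short Python sketch. The function name `median_of` is chosen here for illustration; the standard library's `statistics.median` does the same job.

```python
def median_of(values):
    """Return the middle value, or the average of the two middle values."""
    if not values:
        raise ValueError("median of an empty dataset is undefined")
    ordered = sorted(values)          # Step 1: arrange data in order
    n = len(ordered)
    mid = n // 2                      # Step 2: locate the middle
    if n % 2 == 1:
        return ordered[mid]           # odd count: single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average


print(median_of([5, 8, 12, 15, 20]))  # odd number of values
print(median_of([5, 8, 12, 15]))      # even number of values
```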

Mean
Mean is the arithmetic average of a dataset, representing the central value.

Key Points:

1. How to Calculate:

$\text{Mean} = \frac{\text{Sum of all values}}{\text{Number of values}}$

o Add all the data points.

o Divide the total by the number of data points.

2. Example:
o Data: 5, 10, 15.

o Mean: $\frac{5 + 10 + 15}{3} = 10$.

3. Advantages:

o Simple to calculate.

o Uses all data points, providing a comprehensive measure.

4. Disadvantages:

o Sensitive to outliers (extreme values).

5. Types of Mean:

o Arithmetic Mean: Standard average calculation.

o Weighted Mean: Adjusts for the importance (weights) of values.
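
A minimal sketch of the two types of mean follows. The data 5, 10, 15 comes from the example above; the weights are hypothetical and only show how giving one value more importance shifts the result.

```python
def arithmetic_mean(values):
    """Sum of all values divided by the number of values."""
    return sum(values) / len(values)


def weighted_mean(values, weights):
    """Sum of value * weight divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


data = [5, 10, 15]
weights = [1, 1, 2]  # hypothetical: the last value counts double

print(arithmetic_mean(data))
print(weighted_mean(data, weights))
```

With equal weights the weighted mean reduces to the arithmetic mean, so the arithmetic mean can be seen as a special case.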

Measures of Dispersion

Measures of dispersion quantify the spread or variability in a dataset. They indicate how much the data
points deviate from the central value (e.g., mean or median). Key measures include:

1. Range:

• Difference between the highest and lowest values.

• Formula: $\text{Range} = \text{Maximum value} - \text{Minimum value}$

• Advantage: Simple to calculate.

• Disadvantage: Sensitive to outliers.

2. Variance:

• Measure of how much each data point deviates from the mean.

• Formula: $\text{Variance} = \frac{\sum (x_i - \mu)^2}{N}$

o $x_i$: Each data point.

o $\mu$: Mean of the dataset.

o $N$: Number of data points.

• Advantage: Considers all data points.

• Disadvantage: Units are squared, which makes interpretation less intuitive.

3. Standard Deviation:

• Square root of the variance; represents the typical deviation from the mean.

• Formula: $\text{Standard Deviation} = \sqrt{\text{Variance}}$

• Advantage: Same units as the original data, easier to interpret.

• Disadvantage: Still sensitive to outliers.

4. Interquartile Range (IQR):

• Range between the first (Q1) and third (Q3) quartiles, representing the middle 50% of the data.

• Formula: $\text{IQR} = Q_3 - Q_1$

• Advantage: Resistant to outliers.

• Disadvantage: Does not capture all variability in the data.

Importance:

• Helps understand the spread and consistency of data.

• Larger values indicate more spread, while smaller values suggest more consistency.
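
Of the four measures, the IQR is the one whose value depends on a convention, because there are several accepted ways to locate quartiles in a small dataset. The sketch below uses Python's `statistics.quantiles` with its default "exclusive" method; other textbooks and tools may give slightly different quartiles for the same data. The dataset is the one used in the Range example in these notes.

```python
from statistics import quantiles

data = [5, 8, 12, 15, 20]

# n=4 splits the data into quarters and returns the three cut points.
q1, q2, q3 = quantiles(data, n=4)
iqr = q3 - q1

print(f"Q1 = {q1}, median = {q2}, Q3 = {q3}, IQR = {iqr}")
```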

Range

Range is a measure of dispersion that shows the difference between the maximum and minimum values
in a dataset.

Key Points:
1. Formula:

$\text{Range} = \text{Maximum value} - \text{Minimum value}$

2. Example:

o Data: 5, 8, 12, 15, 20

o Range = $20 - 5 = 15$

3. Advantages:

o Simple and quick to calculate.

o Provides a basic sense of the spread of the data.

4. Disadvantages:

o Sensitive to outliers: A single extreme value can dramatically affect the range, making it
less reliable for skewed data distributions.

5. Use:

o Provides a quick, basic indication of the variability in a dataset, though it offers less
detailed insight than other measures such as standard deviation or interquartile
range.
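
The sensitivity to outliers is easy to see numerically. In this sketch, `value_range` is an illustrative helper name and the extreme value 200 is hypothetical, added only to show the effect.

```python
def value_range(values):
    """Maximum value minus minimum value."""
    return max(values) - min(values)


data = [5, 8, 12, 15, 20]
with_outlier = data + [200]  # hypothetical extreme value

print(value_range(data))
print(value_range(with_outlier))
```

A single added point changes the range from 15 to 195, even though the other five values are unchanged.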

Variance
Variance measures the average squared deviation of each data point from the mean, indicating how
spread out the data is.

Key Points:

1. Formula:

o For a population: $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$

o For a sample: $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$

▪ $x_i$: Each data point

▪ $\mu$ or $\bar{x}$: Mean of the population or sample

▪ $N$ or $n$: Number of data points

2. Example:

o Dataset: 3, 7, 8, 12, 15.

o Mean: $\frac{3 + 7 + 8 + 12 + 15}{5} = 9$.

o Variance (population): $\frac{(3-9)^2 + (7-9)^2 + (8-9)^2 + (12-9)^2 + (15-9)^2}{5} = \frac{36 + 4 + 1 + 9 + 36}{5} = \frac{86}{5} = 17.2$.

3. Advantages:

o Uses all data points, providing a comprehensive measure of spread.

o Useful for statistical modeling and analysis.

4. Disadvantages:

o The unit of variance is the square of the original data units, which can be harder to
interpret.

o Sensitive to outliers (extreme values).

Importance:

• Variance helps understand the degree of variability in the dataset.

• A higher variance indicates that the data points are more spread out from the mean, while a
lower variance suggests the data is more tightly clustered.
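
To make the difference between the population and sample formulas concrete, the sketch below computes both by hand for the dataset 3, 7, 8, 12, 15 and cross-checks them against the standard library's `statistics.pvariance` and `statistics.variance`. The variable names are illustrative.

```python
from statistics import pvariance, variance

data = [3, 7, 8, 12, 15]
n = len(data)
mean = sum(data) / n

# Sum of squared deviations from the mean: 36 + 4 + 1 + 9 + 36 = 86.
squared_deviations = sum((x - mean) ** 2 for x in data)

population_variance = squared_deviations / n       # divide by N
sample_variance = squared_deviations / (n - 1)     # divide by n - 1

print(mean, squared_deviations)
print(population_variance, pvariance(data))
print(sample_variance, variance(data))
```

For this dataset the population variance is 17.2 and the sample variance is 21.5; the sample version is always the larger of the two because it divides by n - 1 instead of N.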

Standard Deviation
Standard deviation measures the typical distance between a data point and the mean of the
dataset. Formally, it is the square root of the average squared deviation from the mean.

Key Points:

• It is the square root of the variance.

• Represents how spread out the data is.

• A larger standard deviation indicates more variability, while a smaller one means the data is
more consistent.

Formula:

$\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}}$
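
As a small check of the formula, the following sketch computes the population standard deviation for the dataset 3, 7, 8, 12, 15 (the one used in the Variance section) and compares it with `statistics.pstdev`. Variable names are illustrative.

```python
import math
from statistics import pstdev

data = [3, 7, 8, 12, 15]
mean = sum(data) / len(data)

# Population variance first, then its square root.
population_variance = sum((x - mean) ** 2 for x in data) / len(data)
sigma = math.sqrt(population_variance)

print(f"variance = {population_variance}, sigma = {sigma:.3f}")
print(f"pstdev   = {pstdev(data):.3f}")
```

The result, about 4.147, is in the same units as the data, which is why it is usually easier to interpret than the variance of 17.2.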

The Normal Curve

The normal curve, also known as the normal distribution or Gaussian distribution, is a symmetric, bell-
shaped curve that represents the distribution of many types of data.

Key Points:

1. Characteristics:

o Symmetrical: The left and right sides are mirror images.

o Mean = Median = Mode: All measures of central tendency are equal.

o Bell-shaped: The curve is highest at the mean and tapers off as it moves away from the
center.

2. Properties:

o The total area under the curve equals 1 (or 100%).

o About 68% of data falls within one standard deviation from the mean.

o About 95% of data falls within two standard deviations.

o About 99.7% of data falls within three standard deviations (empirical rule).

3. Uses:

o Describes many natural phenomena (e.g., height, test scores).

o Basis for statistical inference, including hypothesis testing and confidence intervals.

4. Shape:

o Controlled by the mean (center) and standard deviation (spread).

o A larger standard deviation results in a wider curve, while a smaller one makes it
narrower.

Example:

• In a dataset of exam scores, most students' scores cluster around the average, with fewer
students scoring much higher or lower, forming a normal distribution.
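
The 68%, 95% and 99.7% figures of the empirical rule can be reproduced from the normal distribution itself. For a normal variable, the probability of falling within k standard deviations of the mean equals erf(k / sqrt(2)), where erf is the error function. The sketch below evaluates this with the standard library; the function name `fraction_within` is chosen here for illustration.

```python
import math


def fraction_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {fraction_within(k):.4f}")
```

The computed shares are roughly 0.6827, 0.9545 and 0.9973, which the empirical rule rounds to 68%, 95% and 99.7%.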

Skewness
Skewness refers to the measure of asymmetry or distortion in the distribution of data. It indicates
whether data is skewed to the left (negative skew) or to the right (positive skew).

Key Points:

1. Types of Skewness:

o Positive Skew (Right Skew): The right tail (higher values) is longer or fatter than the left.
The mean is greater than the median.

o Negative Skew (Left Skew): The left tail (lower values) is longer or fatter than the right.
The mean is less than the median.

o Zero Skew: Symmetrical distribution, like the normal distribution, where the mean,
median, and mode are all the same.

2. Formula: Skewness can be calculated using the formula:

$\text{Skewness} = \frac{n}{(n-1)(n-2)} \sum \left( \frac{x_i - \bar{x}}{s} \right)^3$

o $x_i$: Data point, $\bar{x}$: Mean, $s$: Standard deviation, $n$: Number of data
points.

3. Interpretation:

o Positive Skew: Tail on the right, with most data on the left.

o Negative Skew: Tail on the left, with most data on the right.

o Skewness ≈ 0: Data is approximately symmetric.

4. Impact:

o Skewness affects the mean and median. In skewed data, the mean is pulled in the
direction
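
The skewness formula in item 2 can be sketched directly in Python. One assumption is made that the notes do not state: $s$ is taken to be the sample standard deviation (dividing by n - 1), which is the usual convention for this adjusted formula. The function name and the example datasets are illustrative.

```python
import math


def sample_skewness(values):
    """Skewness = n / ((n-1)(n-2)) * sum(((x - mean) / s) ** 3)."""
    n = len(values)
    if n < 3:
        raise ValueError("at least three values are needed")
    mean = sum(values) / n
    # Assumption: s is the sample standard deviation (n - 1 denominator).
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    if s == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in values)


symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 2, 3, 4, 20]     # one large value stretches the right tail
left_tailed = [-20, -4, -3, -2, -1]  # mirror image of the dataset above

print(sample_skewness(symmetric))
print(sample_skewness(right_tailed))
print(sample_skewness(left_tailed))
```

The symmetric dataset gives a skewness of zero, the right-tailed one a positive value, and its mirror image the same magnitude with a negative sign, matching the interpretation in item 3.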
