Descriptive Statistics

Descriptive statistics is a branch of statistics focused on summarizing and describing data sets through measures of central tendency (mean, median, mode) and measures of dispersion (range, variance, standard deviation). Each measure provides insights into the characteristics and variability of the data, with the mean being sensitive to outliers while the median is more robust. Data visualization techniques, such as histograms and pie charts, are often used alongside descriptive statistics to enhance understanding of the data.
Ed neil Maratas

Descriptive Statistics
Descriptive statistics is a branch of statistics that deals with
summarizing and describing data. It provides a way to organize
and understand data sets by presenting them in a meaningful way.
Descriptive statistics can be used to describe the characteristics of
a population or sample.
Measures of Central Tendency
Measures of central tendency are used to describe the center or average of a
data set. They provide a single value that represents the typical or central value
of the data. There are three main measures of central tendency: the mean, the
median, and the mode.

1 Mean
The mean is the most commonly used measure of central tendency and is
calculated by adding up all the values in a data set and dividing by the
number of values. It is sensitive to outliers, meaning extreme values can
significantly affect the mean.

2 Median
The median is the middle value in a data set when the values are arranged
in order from least to greatest. It is much less affected by outliers than
the mean, making it a more robust measure of central tendency.

3 Mode
The mode is the value that occurs most frequently in a data set.
It is useful for describing the most common or typical value in a
data set.

Mean
The mean is the average of a set of numbers. It is calculated by summing all the values in a data set and
dividing by the total number of values. The mean is a useful measure of central tendency, but it is sensitive
to outliers, which can skew the results.

Calculation
Sum of all values divided by the number of values.

Example
The mean of the numbers 2, 4, 6, and 8 is (2 + 4 + 6 + 8) / 4 = 5.

Applications
The mean is used in many different fields, including statistics, finance, and
economics.
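
The mean calculation can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides; the function name `mean` and the second data set (the first one plus an extreme value of 100) are assumptions chosen to show the sensitivity to outliers.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    if not values:
        raise ValueError("mean requires at least one value")
    return sum(values) / len(values)


data = [2, 4, 6, 8]
print(mean(data))  # 5.0

# Illustrative assumption: add one extreme value to the same data.
# The single outlier pulls the mean far away from the bulk of the values.
with_outlier = [2, 4, 6, 8, 100]
print(mean(with_outlier))  # 24.0
```

Adding one value of 100 moves the mean from 5 to 24, even though four of the five values are 8 or less.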
Median
The median is the middle value in a sorted data set. It divides the data set in half, with half of the values
at or below the median and half of the values at or above it. It is a robust measure of
central tendency, meaning extreme values have little effect on it.
Finding the Median
To find the median, first, sort the data set from least to greatest. If the
number of values is odd, the median is the middle value. If the number of
values is even, the median is the average of the two middle values.

Example
The median of the numbers 2, 4, 6, and 8 is (4 + 6) / 2 = 5.

Applications
The median is often used in fields such as economics and finance to describe
the typical value of a data set that may contain outliers.
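
The sort-then-pick-the-middle procedure can be sketched as follows. This is an illustrative sketch rather than the author's code; the function name `median` and the extra data sets are assumptions.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if the count is even."""
    if not values:
        raise ValueError("median requires at least one value")
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single middle value exists.
        return ordered[mid]
    # Even count: average the two values on either side of the middle.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([2, 4, 6, 8]))       # 5.0
print(median([8, 2, 6]))          # 6
print(median([2, 4, 6, 8, 100]))  # 6
```

In the last call, the extreme value 100 leaves the median at 6, which is why the median is described as robust.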
Mode
The mode is the value that occurs most often in a data set. It is a useful
measure of central tendency, but it can be difficult to
interpret if there are multiple modes in the data set.
Finding the Mode
To find the mode, count the number of times each value occurs in the data
set and pick the value with the highest count.

Example
The mode of the numbers 2, 4, 6, 6, and 8 is 6, because
it occurs twice, more than any other value.
Applications
The mode is used in a variety of fields, including
marketing, retail, and manufacturing.
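
Counting occurrences to find the mode can be sketched with `collections.Counter` from the Python standard library. The function name `modes` is hypothetical, and returning every tied value as a list is a design choice made here so that multi-modal data sets are visible; the second data set is an illustrative assumption.

```python
from collections import Counter


def modes(values):
    """Return every value that shares the highest count, in first-seen order."""
    counts = Counter(values)
    if not counts:
        raise ValueError("modes requires at least one value")
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]


print(modes([2, 4, 6, 6, 8]))  # [6]
print(modes([1, 1, 2, 2, 3]))  # [1, 2]
```

The second call shows the case the text warns about: two values tie for the highest count, so there is no single mode.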
Measures of Dispersion
Measures of dispersion describe the spread or
variability of a data set. They tell us how much the data points are
spread out from the center of the data set. The most common
measures of dispersion are range, variance, and standard deviation.

Range
The range is the difference between the highest and
lowest values in a data set. It is a simple measure of
dispersion, but it is highly sensitive to outliers.

Variance
The variance is a measure of how spread out the data points are
from the mean. It is calculated by squaring the deviations of each
data point from the mean and then averaging those squared
deviations.

Standard Deviation
The standard deviation is the square root of the
variance. It is a measure of how spread out the data points are
from the mean. It is expressed in the same units as the original
data, making it easier to interpret.
Range
The range is the simplest measure of dispersion. It is calculated
by subtracting the smallest value from the largest value in a data
set. The range is sensitive to outliers, meaning extreme values
can greatly affect the range.

Data Set           Smallest Value   Largest Value   Range
2, 4, 6, 8         2                8               6
1, 3, 5, 7, 100    1                100             99
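
The two rows of the table can be reproduced with a short sketch. The function name `value_range` is an assumption, chosen to avoid shadowing Python's built-in `range`.

```python
def value_range(values):
    """Range: the largest value minus the smallest value."""
    if not values:
        raise ValueError("value_range requires at least one value")
    return max(values) - min(values)


print(value_range([2, 4, 6, 8]))       # 6
print(value_range([1, 3, 5, 7, 100]))  # 99
```

A single extreme value (100) is enough to push the range from single digits to 99.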

Variance
Variance is a measure of how spread out the data points are from the
mean. It is calculated by squaring the deviations of each data point from
the mean and then averaging those squared deviations. Variance is not
expressed in the same units as the original data (it is in squared units),
making it harder to interpret.

Step 1: Find the Mean
Calculate the average of the data set.

Step 2: Calculate Deviations
Subtract the mean from each data point.

Step 3: Square Deviations
Square each of the deviations.

Step 4: Average Squared Deviations
Add up the squared deviations
and divide by the number of values in the data set.
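
The four steps map directly onto code. The sketch below assumes the population form of the variance, which divides by the number of values as the steps describe; the sample variance, which divides by one less than the number of values, is a different quantity. The function name `population_variance` and the data set 2, 4, 6, 8 are illustrative choices.

```python
def population_variance(values):
    """Population variance, following the four steps one by one."""
    n = len(values)
    if n == 0:
        raise ValueError("population_variance requires at least one value")
    mean = sum(values) / n                   # Step 1: find the mean
    deviations = [x - mean for x in values]  # Step 2: calculate deviations
    squared = [d ** 2 for d in deviations]   # Step 3: square deviations
    return sum(squared) / n                  # Step 4: average squared deviations


# Mean is 5, deviations are -3, -1, 1, 3, squares are 9, 1, 1, 9.
print(population_variance([2, 4, 6, 8]))  # 5.0
```

Python's standard library offers `statistics.pvariance` for this population form and `statistics.variance` for the sample form.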
Standard Deviation
Standard deviation is a measure of how spread out the data points are from the mean. It is calculated by
taking the square root of the variance. Standard deviation is expressed in the same units as the original
data, making it easier to interpret.
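
As a minimal sketch, the standard deviation is one extra step on top of the variance. The example assumes the population form (dividing by the number of values); the function name `population_std_dev` and the data set are illustrative.

```python
import math


def population_std_dev(values):
    """Square root of the population variance."""
    n = len(values)
    if n == 0:
        raise ValueError("population_std_dev requires at least one value")
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / n
    return math.sqrt(variance)


# The variance of 2, 4, 6, 8 is 5, so the result is the square root of 5,
# approximately 2.236, in the same units as the data.
print(population_std_dev([2, 4, 6, 8]))
```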
Visual Representation
The standard deviation is often represented visually on a histogram as the
spread of the data points around the mean.

Normal Distribution
In a normal distribution, about 68% of the data falls within one standard
deviation of the mean, about 95% falls within two standard deviations, and
about 99.7% falls within three standard deviations.
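
The 68%, 95%, and 99.7% figures can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library (available from Python 3.8); the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    # Prints roughly 68.3, 95.4, and 99.7 percent.
    print(k, round(share_within(k) * 100, 1))
```

Because every normal distribution is a shifted and scaled copy of the standard one, the same shares hold for any mean and standard deviation.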

Data Visualization
Data visualization is the process of creating visual representations of data to
make it easier to understand and interpret. Descriptive statistics are often
used in conjunction with data visualization to present a clear and
comprehensive picture of the data.
1 Histograms
Histograms are used to display the distribution of numerical data.

2 Pie Charts
Pie charts are used to show the proportions of different categories
within a data set.

3 Box Plots
Box plots are used to display the distribution of numerical data, including the median,
quartiles, and outliers.
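
The numbers behind a box plot can be computed without drawing anything. The sketch below rests on two assumptions not stated in the slides: quartiles are computed with `statistics.quantiles` using `method="inclusive"` (Python 3.8 or later), and outliers are flagged with the common convention of 1.5 times the interquartile range beyond the quartiles. Plotting libraries may use different quartile definitions, so their boxes can differ slightly. The function name `box_plot_summary` is hypothetical.

```python
import statistics


def box_plot_summary(values):
    """Quartiles, median, and outliers under the 1.5 * IQR convention (an assumption)."""
    ordered = sorted(values)
    # n=4 returns the three cut points: first quartile, median, third quartile.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [x for x in ordered if x < lower_fence or x > upper_fence]
    return {"q1": q1, "median": median, "q3": q3, "outliers": outliers}


summary = box_plot_summary([1, 3, 5, 7, 100])
print(summary)  # quartiles 3.0 and 7.0, median 5.0, outliers [100]
```

For this data the interquartile range is 4, so any value above 13 or below -3 is flagged, and 100 is the only outlier.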
