Presentation 4

Uploaded by

mujtaba

Statistics

Basic terminologies and types of statistics

What is Statistics?
• Statistics is the study and manipulation of data: it deals with the
analysis and computation of numerical data. Definitions of statistics given
by different authors follow.
• According to the Merriam-Webster dictionary, statistics is defined as “classified
facts representing the conditions of a people in a state – especially the facts
that can be stated in numbers or any other tabular or classified arrangement”.
• According to statistician Sir Arthur Lyon Bowley, statistics is defined as
“Numerical statements of facts in any department of inquiry placed in relation
to each other”.
Basics of Statistics

• The basics of statistics include measures of central tendency and
measures of dispersion. The measures of central tendency are the mean,
median and mode; the measures of dispersion include the variance and
standard deviation.
• Mean is the average of the observations.
• Median is the central value when observations are arranged in order.
• Mode is the most frequent observation in a data set.
• Variance is a measure of how spread out a collection of data is.
• Standard deviation is a measure of the dispersion of data from the
mean. The square of the standard deviation is equal to the variance.
Mathematical Statistics

• Mathematical statistics is the application of mathematics to statistics,
which was initially conceived as the science of the state: the
collection and analysis of facts about a country, such as its economy,
military, population, and so forth.

• Mathematical techniques used for different analytics include
mathematical analysis, linear algebra, stochastic analysis, differential
equations and measure-theoretic probability theory.
Types of Statistics

• Basically, there are two types of statistics.

• Descriptive Statistics
• Inferential Statistics

• Descriptive statistics describe a collection of data in summary form.
Inferential statistics are used to explain and interpret those
descriptive results. Both types are widely used.
Descriptive Statistics

• Descriptive statistics summarise and explain the data. The
summary is computed from a sample of the population using measures
such as the mean and standard deviation. Descriptive statistics is a
way of organising, representing, and explaining a set of data using
charts, graphs, and summary measures. Histograms, pie charts, bar
charts, and scatter plots are common ways to summarise data and present
it graphically. Descriptive statistics are just that: descriptive. They
make no claims beyond the data collected.
Inferential Statistics

• Inferential statistics are used to interpret the meaning of descriptive
statistics: once the data has been collected, evaluated, and summarised,
inferential statistics convey what it means. Probability theory is used
to determine whether patterns found in a study sample can be
extrapolated to the wider population from which the sample was
drawn. Inferential statistics are used to test hypotheses, study
correlations between variables, and estimate population parameters.
In short, they derive conclusions and inferences from samples, i.e.
they generalise from the sample to the population.
Measure of Central Tendency
• Measures of central tendency are statistical measures that describe the
center or average of a set of data points. They provide a single value that
represents the central or typical value of a dataset. The three main
measures of central tendency are the mean, median, and mode.
1. Mean:
• The mean, also known as the average, is calculated by summing up all the
values in a dataset and then dividing by the number of values.
• Formula: Mean = Sum of all values / Number of values.
• Example: For the dataset {10, 15, 20, 25, 30}, the mean is
(10 + 15 + 20 + 25 + 30) / 5 = 100 / 5 = 20.
Measure of Central Tendency Cont..
2. Median:
• The median is the middle value in a dataset when it is arranged in
ascending or descending order.
• If there is an even number of values, the median is the average of the
two middle values.
• Example: For the dataset {10, 15, 20, 25, 30}, the median is 20. For
{10, 15, 20, 25}, the median is (15 + 20) / 2 = 17.5.
Measure of Central Tendency Cont..
3. Mode:
• The mode is the value that appears most frequently in a dataset.
• A dataset may have no mode, one mode (unimodal), or more than one mode
(multimodal).
• Example: For the dataset {10, 15, 20, 20, 25, 30}, the mode is 20.
• These measures of central tendency provide different insights into the central
value of a dataset, and the choice of which one to use depends on the nature
of the data and the specific goals of the analysis.
• It's important to note that each measure has its strengths and limitations. The
mean is sensitive to extreme values (outliers), while the median is more
robust in the presence of outliers. The mode is especially useful for
categorical data or when identifying the most common category is essential.
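The three measures can be checked with a short sketch using Python's standard `statistics` module. The datasets are the ones from the examples above; the outlier value 1000 is a hypothetical addition, included only to illustrate how the mean and median react to an extreme value.

```python
from statistics import mean, median, multimode

# Datasets taken from the examples above.
odd_data = [10, 15, 20, 25, 30]
even_data = [10, 15, 20, 25]
mode_data = [10, 15, 20, 20, 25, 30]

mean_value = mean(odd_data)        # sum of values divided by number of values
median_odd = median(odd_data)      # middle value of the sorted data
median_even = median(even_data)    # average of the two middle values, 15 and 20
modes = multimode(mode_data)       # list of all most frequent values

# Hypothetical outlier (not from the slides) to show sensitivity of the mean.
with_outlier = odd_data + [1000]
mean_with_outlier = mean(with_outlier)
median_with_outlier = median(with_outlier)

print("mean:", mean_value, "median:", median_odd, median_even, "modes:", modes)
print("with outlier -> mean:", mean_with_outlier, "median:", median_with_outlier)
```

With the outlier added, the mean moves far away from the bulk of the data while the median shifts only slightly, which is the robustness property described above.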
Measure of Dispersion
• Measures of dispersion are descriptive statistics that describe how
similar the scores in a set are to each other.
• The more similar the scores are to each other, the lower the measure of
dispersion will be.
• The less similar the scores are to each other, the higher the measure of
dispersion will be.
• In general, the more spread out a distribution is, the larger the measure of
dispersion will be.
Measure of Dispersion Cont..
• Which of the distributions of scores has the larger dispersion?
[Figure: two bar charts of scores; x-axis 1 to 10, y-axis 0 to 125]
• The upper distribution has more dispersion because the scores are more
spread out
• That is, they are less similar to each other
Measure of Dispersion Cont..
• There are three main measures of dispersion:
• The range
• The semi-interquartile range (SIR)
• Variance / standard deviation
• The Range.
• The range is defined as the difference between the largest score in the set of
data and the smallest score in the set of data, XL - XS
• What is the range of the following data:
4 8 1 6 6 2 9 3 6 9
• The largest score (XL) is 9; the smallest score (XS) is 1; the range is
XL - XS = 9 - 1 = 8
When To Use the Range
• The range is used when
• you have ordinal data or
• you are presenting your results to people with little or no knowledge of
statistics
• The range is rarely used in scientific work as it is fairly insensitive
• It depends on only two scores in the set of data, XL and XS
• Two very different sets of data can have the same range:
1 1 1 1 9 vs 1 3 5 7 9
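A minimal sketch of the range calculation, using the data from the slides. The helper name `value_range` is hypothetical, and the standard deviation is computed only to show that the two data sets with the same range still differ in spread.

```python
from statistics import pstdev

def value_range(scores):
    """Return the largest score minus the smallest score (XL - XS)."""
    return max(scores) - min(scores)

example = [4, 8, 1, 6, 6, 2, 9, 3, 6, 9]
clustered = [1, 1, 1, 1, 9]
evenly_spread = [1, 3, 5, 7, 9]

print(value_range(example))        # range of the worked example
print(value_range(clustered), value_range(evenly_spread))

# Same range, but the population standard deviations are different.
print(pstdev(clustered), pstdev(evenly_spread))
```

Because only the two extreme scores enter the calculation, the range cannot distinguish between the last two data sets.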
The Semi-Interquartile Range
• The semi-interquartile range (or SIR) is defined as the difference between
the third and first quartiles, divided by two
• The first quartile is the 25th percentile
• The third quartile is the 75th percentile
• SIR = (Q3 - Q1) / 2
The Semi-Interquartile Range Example
• What is the SIR for the following data?
2 4 6 8 10 12 14 20 30 60
• 25 % of the scores are below 5
• 5 is the first quartile (the 25th percentile)
• 25 % of the scores are above 25
• 25 is the third quartile (the 75th percentile)
• SIR = (Q3 - Q1) / 2 = (25 - 5) / 2 = 10
When To Use the SIR
• The SIR is often used with skewed data as it is insensitive to the
extreme scores.
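The SIR can be sketched in Python as follows. This is an illustration under an assumption: quartiles can be computed by several conventions, and `statistics.quantiles` (default "exclusive" method) interpolates differently from the slide, which places the quartiles at 5 and 25. The function name `semi_interquartile_range` is hypothetical.

```python
from statistics import quantiles

def semi_interquartile_range(scores):
    """Return (Q3 - Q1) / 2 using the default quartile method of statistics.quantiles."""
    q1, _median, q3 = quantiles(scores, n=4)
    return (q3 - q1) / 2

data = [2, 4, 6, 8, 10, 12, 14, 20, 30, 60]

# SIR using the quartiles read off the slide (Q1 = 5, Q3 = 25).
slide_sir = (25 - 5) / 2

# SIR using the library's interpolated quartiles; the value differs
# because the quartile convention differs, not because either is wrong.
library_sir = semi_interquartile_range(data)

print(slide_sir, library_sir)
```

Either way, the extreme score 60 has little influence on the result, which is why the SIR suits skewed data.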
Variance
• Variance is defined as the average of the squared deviations:
$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$
What Does the Variance Formula Mean?
• First, it says to subtract the mean from each of the scores
• This difference is called a deviate or a deviation score
• The deviate tells us how far a given score is from the typical, or average, score
• Thus, the deviate is a measure of dispersion for a given score
Standard Deviation
• When the deviate scores are squared in variance, their unit of
measure is squared as well
• E.g. if people’s weights are measured in pounds, then the variance of the
weights would be expressed in pounds² (or squared pounds)
• Since squared units of measure are often awkward to deal with, the
square root of variance is often used instead
• The standard deviation is the square root of variance

• Standard deviation = variance

• Variance = standard deviation2
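The relationship between the two quantities and their units can be illustrated with a short sketch. The weights below are hypothetical values chosen for easy arithmetic; they do not come from the slides.

```python
from math import sqrt

# Hypothetical weights in pounds (illustrative values, not from the slides).
weights = [150, 160, 170, 180, 190]

n = len(weights)
mean_weight = sum(weights) / n                    # pounds
deviations = [w - mean_weight for w in weights]   # pounds
variance = sum(d ** 2 for d in deviations) / n    # pounds squared
std_dev = sqrt(variance)                          # back in pounds

print(f"variance = {variance} pounds^2")
print(f"standard deviation = {std_dev:.2f} pounds")
```

Taking the square root returns the measure of spread to the original unit, so it can be compared directly with the scores themselves.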
Computational Formula

• When calculating variance, it is often easier to use a computational
formula which is algebraically equivalent to the definitional formula:
$\sigma^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{N}}{N} = \frac{\sum (X - \mu)^2}{N}$
• $\sigma^2$ is the population variance, $X$ is a score, $\mu$ is the population mean,
and $N$ is the number of scores
Computational Formula Example
X        X²        X - μ     (X - μ)²
9        81         2         4
8        64         1         1
6        36        -1         1
5        25        -2         4
8        64         1         1
6        36        -1         1
Σ = 42   Σ = 306   Σ = 0     Σ = 12
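Computational Formula Example Cont..
• Computational formula:
$\sigma^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{N}}{N} = \frac{306 - \frac{42^2}{6}}{6} = \frac{306 - 294}{6} = \frac{12}{6} = 2$
• Definitional formula:
$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{12}{6} = 2$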
Variance of a Sample
• Because the sample mean is not a perfect estimate of the population
mean, the formula for the variance of a sample is slightly different
from the formula for the variance of a population:

$s^2 = \frac{\sum (X - \bar{X})^2}{N - 1}$
• $s^2$ is the sample variance, $X$ is a score, $\bar{X}$ is the sample mean, and $N$ is
the number of scores
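The difference between the two divisors can be seen with Python's standard `statistics` module, where `pvariance` divides by N and `variance` divides by N - 1. Treating the six scores from the computational formula example as a sample is an assumption made here purely for illustration.

```python
from statistics import pvariance, variance

scores = [9, 8, 6, 5, 8, 6]

population_variance = pvariance(scores)  # sum of squared deviations / N
sample_variance = variance(scores)       # sum of squared deviations / (N - 1)

print(population_variance, sample_variance)
```

The sample variance is always the larger of the two, and the gap shrinks as the number of scores grows.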
Measure of Skew
• Skew is a measure of symmetry in the distribution of scores
[Figure: three distributions labelled Normal (skew = 0), Positive Skew, and Negative Skew]
Measure of Skew Cont..
• The following formula can be used to determine skew:
$s^3 = \frac{\frac{\sum (X - \bar{X})^3}{N}}{\left(\frac{\sum (X - \bar{X})^2}{N}\right)^{3/2}}$
Measure of Skew Cont..
• If $s^3 < 0$, then the distribution has a negative skew
• If $s^3 > 0$, then the distribution has a positive skew
• If $s^3 = 0$, then the distribution is symmetrical
• The more $s^3$ differs from 0, the greater the skew in the
distribution
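The sign rules can be illustrated with a small sketch. It assumes the standard moment coefficient of skewness (third central moment divided by the second central moment raised to the power 3/2), which may differ in scaling from the formula on the slide but has the same sign. The function name and the three data sets are hypothetical.

```python
def moment_skewness(scores):
    """Moment coefficient of skewness: m3 / m2 ** 1.5 (assumed definition)."""
    n = len(scores)
    mean = sum(scores) / n
    m2 = sum((x - mean) ** 2 for x in scores) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in scores) / n  # third central moment
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]       # mirror image around its mean
right_tailed = [1, 1, 2, 2, 10]   # long tail towards high scores
left_tailed = [-10, -2, -2, -1, -1]  # long tail towards low scores

print(moment_skewness(symmetric))
print(moment_skewness(right_tailed))
print(moment_skewness(left_tailed))
```

The symmetric data give zero, the right-tailed data a positive value, and the left-tailed data a negative value, matching the rules listed above.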
Statistical data and representation of data
• Statistical data refers to the information collected through various methods, such
as surveys, experiments, or observations. It can be numerical or categorical and
is often used to analyze and make inferences about a population or a
phenomenon. Representing data visually is a crucial aspect of statistical analysis
as it helps in better understanding and communication. Here are some common
types of statistical data and methods of representation:
• Types of Statistical Data:
• Numerical Data (Quantitative): Consists of numerical values and can be
further classified as discrete or continuous. Examples include age, height,
income, and temperature.
• Categorical Data (Qualitative): Represents categories or labels. Examples
include gender, color, and types of cars.
Statistical data and representation of data Cont..
• Methods of Representation:
• 1. Tables:
• Simple and effective way to organize and present data.
• Useful for small datasets and presenting categorical data.
• 2. Charts and Graphs:
• Bar Charts: Suitable for representing categorical data. Bars are used to represent the
frequency or proportion of each category.
• Histograms: Similar to bar charts but used for displaying the distribution of continuous data.
The bars are contiguous.
• Pie Charts: Represents parts of a whole. Useful for displaying the composition of a
categorical variable.
• 3. Line Charts:
• Useful for showing trends and patterns over time. Often used with time-series data.
Statistical data and representation of data Cont..
• 4. Scatter Plots:
• Used to visualize the relationship between two numerical variables. Each point represents an
observation.
• 5. Box Plots (Box-and-Whisker Plots):
• Displays the distribution of a dataset and highlights the central tendency, spread, and outliers.
• 6. Frequency Distributions:
• Tables or graphs that show the frequency of different values or ranges in a dataset.
• 7. Measures of Central Tendency:
• Mean: Average value of a dataset.
• Median: Middle value when the data is arranged in ascending order.
• Mode: Most frequently occurring value.
• 8. Measures of Dispersion:
• Range: The difference between the maximum and minimum values.
• Variance and Standard Deviation: Indicate the spread or dispersion of data around the mean.
Statistical data and representation of data Cont..
• 9. Correlation Coefficient:
• Measures the strength and direction of the linear relationship between two numerical
variables.
• 10. Regression Analysis:
• Examines the relationship between one dependent variable and one or more independent
variables.
• These methods of representation help in summarizing, analyzing, and
interpreting data for better decision-making and communication.
Choosing the appropriate method depends on the nature of the data
and the insights you want to convey.
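As an illustration of the correlation coefficient mentioned above, here is a minimal sketch of the Pearson correlation written out by hand. The function name `pearson_r` and the data (hours studied and test scores) are hypothetical and chosen so that the relationship is perfectly linear.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation: co-deviation of x and y scaled by their spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return co_deviation / (spread_x * spread_y)

# Hypothetical data: scores rise by exactly 2 for each extra hour.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 6, 8, 10]
reversed_scores = list(reversed(scores))

print(pearson_r(hours, scores))            # close to +1: perfect positive relationship
print(pearson_r(hours, reversed_scores))   # close to -1: perfect negative relationship
```

Values near +1 or -1 indicate a strong linear relationship and the sign gives its direction; values near 0 indicate little or no linear relationship.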
