
BM The Basic Statistical Data

The document discusses basic statistical data concepts including data collection methods, frequency distributions, measures of central tendency, and measures of variability. It provides examples of calculating the mean, median, mode, variance, and standard deviation of data sets.

Uploaded by

Kievs Gts
Copyright
© © All Rights Reserved

The Basic Statistical Data

 Anything that contains information


 Expressions are not data they are vowels
 It could be categorical (pertaining to colors, sizes, etc.) or numerical (amount of…).
 Statistics involves collecting, organizing, summarizing and analyzing data
 Ways to collect data: surveys, conducting research, interviews, browsing books and testing
 Gathering information
 When data are collected, they are called raw data
 Not yet finished, processed or analyzed (fresh from the primary source)

 Frequency Distribution – shows the amount of data per category


 In order to organize the data we tally it in a table

The data show the way 30 employees get to work each day
A = Automobile, B = Bus, W = Walk, T = Train

A A B A W T
B A A A B A
T A B B A W
A T W B A A
W A B W T A

*First thing to do is to tally

CLASS            TALLY                  FREQUENCY
AUTOMOBILE (A)   IIIII - IIIII - IIII   14
BUS (B)          IIIII - II             7
WALK (W)         IIIII                  5
TRAIN (T)        IIII                   4
TOTAL                                   30
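
The tallying step can also be checked with a short script. This is only a sketch: the variable names (`responses`, `frequency`) are illustrative and not part of the notes, and it uses Python's standard `collections.Counter` to do the counting.

```python
from collections import Counter

# Commuting data from the notes: A = Automobile, B = Bus, W = Walk, T = Train
responses = """
A A B A W T
B A A A B A
T A B B A W
A T W B A A
W A B W T A
""".split()

# Counter tallies how many times each class appears
frequency = Counter(responses)

for label in ["A", "B", "W", "T"]:
    # One "I" per data value, followed by the frequency
    print(label, "I" * frequency[label], frequency[label])
print("TOTAL", sum(frequency.values()))
```

The counts should agree with the FREQUENCY column of the table, and the total should equal the number of employees surveyed.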
 When the range of data values is too large, the data can be organized into classes that consist of more than one number (range); for example, 10 - 14, 15 - 19, etc.

37 32 22 45 65
54 46 48 24 31
25 37 28 26 39
47 48 42 36 40
53 55 60 45 32
28 50 36 34 47
39 25 37 33 39
30 40 58 24 24
33 31 29 54 38

In this situation, there are guidelines that can be used when setting up the classes. They are:
1. Use 5 – 15 classes
2. Keep each class the same width
3. Do not leave out any class, even if the frequency of the class is zero
4. Make sure that there are enough classes for all the data
5. Do not overlap the classes

Step 1: Determine the class width. Get the difference between the largest and smallest values, then divide it by the number of classes you want (here, 6) and round up
65 – 22 = 43
43 / 6 = 7.17, rounded up to 8
Step 2: Get the lowest data value
Lowest: 22
Step 3: Add 8 (the class width) repeatedly to get the lower limit of each class, and continue until you have 6 classes
22 + 8 = 30
30 + 8 = 38
38 + 8 = 46
46 + 8 = 54
54 + 8 = 62
62 + 8 = 70
Step 4: Subtract 1 from each value to get the upper limit of each class
30 – 1 = 29
38 – 1 = 37
46 – 1 = 45
54 – 1 = 53
62 – 1 = 61
70 – 1 = 69
CLASS TALLY FREQUENCY
22 – 29 IIIII – IIIII 10
30 – 37 IIIII – IIIII – III 13
38 – 45 IIIII – IIII 9
46 – 53 IIIII – II 7
54 – 61 IIIII 5
62 – 69 I 1
Total 45
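
The four steps can be sketched in code as a check on the table. This is a minimal sketch, not the only way to build classes: the names `ages`, `num_classes` and `classes` are illustrative (the data are called ages here only because the histogram axis is labeled "Ages"), and rounding the width up with `math.ceil` mirrors Step 1.

```python
import math

ages = [
    37, 32, 22, 45, 65, 54, 46, 48, 24, 31, 25, 37, 28, 26, 39,
    47, 48, 42, 36, 40, 53, 55, 60, 45, 32, 28, 50, 36, 34, 47,
    39, 25, 37, 33, 39, 30, 40, 58, 24, 24, 33, 31, 29, 54, 38,
]

num_classes = 6
# Step 1: range divided by the number of classes, rounded up
width = math.ceil((max(ages) - min(ages)) / num_classes)
# Step 2: the lowest data value starts the first class
lowest = min(ages)

classes = []
for i in range(num_classes):
    lower = lowest + i * width      # Step 3: keep adding the width
    upper = lower + width - 1       # Step 4: subtract 1 for the upper limit
    count = sum(lower <= age <= upper for age in ages)
    classes.append((lower, upper, count))

for lower, upper, count in classes:
    print(f"{lower} - {upper}: {count}")
```

Each printed line corresponds to one row of the CLASS / FREQUENCY table.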

 Histogram – a diagram consisting of rectangles whose area is proportional to the frequency of a variable and whose width is equal to the class interval (lexical definition)

[Figure: histogram of the frequency distribution; x-axis: Ages (22 - 29, 30 - 37, 38 - 45, 46 - 53, 54 - 61, 62 - 69), y-axis: Frequency]

A survey of 42 students shows how many cellphones they own. Make a frequency distribution and a histogram for the data
1 2 5 4 1 1
2 4 2 2 3 4
5 5 4 3 4 5
2 3 1 4 5 1
3 4 5 4 3 3
4 2 4 3 1 1
2 3 3 4 3 2
CLASS TALLY FREQUENCY
1 IIIII - II 7
2 IIIII – III 8
3 IIIII – IIIII 10
4 IIIII – IIIII – I 11
5 IIIII - I 6
Total 42

[Figure: histogram of the cellphone data; x-axis: Number of Cellphones (1 to 5), y-axis: Frequency]
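
As a rough stand-in for the drawn histogram, the frequencies can be printed as rows of symbols. This is only a text sketch with horizontal bars rather than the vertical rectangles of a real histogram; the function name `text_histogram` is hypothetical, and the frequencies are copied from the cellphone table.

```python
# Frequencies taken from the cellphone table in the notes
frequency = {1: 7, 2: 8, 3: 10, 4: 11, 5: 6}

def text_histogram(freq):
    """Return one line per class, with one '#' per data value."""
    return [f"{label} | {'#' * count} ({count})" for label, count in freq.items()]

lines = text_histogram(frequency)
print("\n".join(lines))
```

The longest row marks the class with the highest frequency, which is the same information the tallest rectangle gives in the drawn version.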

Measures of Average

 Common
 Something in the middle

 Mean – add all values and divide by the number of data values

If the quarterly grades of a student for the school year are 84, 88, 87 and 89, find the mean.
(84 + 88 + 87 + 89) / 4 = 348 / 4 = 87

 Weighted average / mean – this is computed by multiplying the data values by their corresponding weights, adding all the products and dividing the result by the sum of the weights
 Used to compute scores, grades and salaries

Find the weighted average / mean

Subjects         Units   Grade
Chemistry        3       87
Philosophy       2       88
Business Math    5       90
English          3       91
Total            13      89

87 x 3 = 261
88 x 2 = 176
90 x 5 = 450
91 x 3 = 273
(261 + 176 + 450 + 273) / 13 = 1160 / 13 = 89.23 ≈ 89
Find the weighted mean
Subjects Units Grade
Chemistry 3 89
Philosophy 2 95
Business Math 5 92
English 3 90
Total 13 ??

89 x 3 = 267
95 x 2 = 190
92 x 5 = 460
90 x 3 = 270
(267 + 190 + 460 + 270) / 13 = 1187 / 13 = 91.3
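
The two weighted-mean problems can be checked with a small helper. This is a sketch under the definition given in the notes (sum of grade x units, divided by total units); the function name `weighted_mean` and the variable names are illustrative.

```python
def weighted_mean(values, weights):
    """Sum of value * weight, divided by the sum of the weights."""
    total = sum(v * w for v, w in zip(values, weights))
    return total / sum(weights)

units = [3, 2, 5, 3]   # Chemistry, Philosophy, Business Math, English

first = weighted_mean([87, 88, 90, 91], units)    # first grade table
second = weighted_mean([89, 95, 92, 90], units)   # second grade table
print(round(first, 2), round(second, 2))
```

Because Business Math carries 5 units, its grade pulls the result more strongly than the 2-unit Philosophy grade does.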

 Median – the middle of the data when it is arranged in order

Arrange the data in order and get the middle data value. Find the median of 8, 10, 6, 10, 12, 15, 5
Arranged in order: 5, 6, 8, 10, 10, 12, 15
Median: 10
*If there is an even number of data values, get the middle 2 numbers and get their average
 Mode – the value that occurs most often

Find the mode of 12, 18, 15, 16, 15, 14, and 6
15

The number of movies a video store rented during a 7-day period is shown. Find the mean, median and mode for the data:
156, 182, 147, 159, 165, 171, 159
Arranged in order: 147, 156, 159, 159, 165, 171, 182
Mean – 162.71
Median – 159
Mode – 159
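
The three measures for the video store data can be verified with Python's standard `statistics` module. This is just a quick check, with `rentals` as an illustrative variable name; the module sorts the data internally for the median, so the list can be given in its original order.

```python
import statistics

# Movies rented per day over the 7-day period
rentals = [156, 182, 147, 159, 165, 171, 159]

mean = statistics.mean(rentals)       # sum of values / number of values
median = statistics.median(rentals)   # middle value once sorted
mode = statistics.mode(rentals)       # most frequent value
print(round(mean, 2), median, mode)
```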

Two data sets can have the same mean and still be different. Consider the two data sets:
Set A: 5, 10, 15, 20, 25
Set B: 13, 14, 15, 16, 17
Both sets have a mean of 15.
Different: Range, standard deviation
 For this reason, statisticians also use three common measures of variability to describe the data. They are the range, variance, and standard deviation. Measures of variability are also called measures of dispersion.

 Range is the difference between the largest and the smallest data value
 Range is only a rough indication of variability, which is why statisticians also use variance and standard deviation
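
A short comparison of Set A and Set B shows why the mean alone is not enough. This is a minimal sketch; the helper names `mean` and `data_range` are illustrative.

```python
set_a = [5, 10, 15, 20, 25]
set_b = [13, 14, 15, 16, 17]

def mean(data):
    """Add all values and divide by the number of values."""
    return sum(data) / len(data)

def data_range(data):
    """Largest value minus smallest value."""
    return max(data) - min(data)

# Both sets share the same mean, but Set A is spread much wider
print(mean(set_a), data_range(set_a))
print(mean(set_b), data_range(set_b))
```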

 Variance – the average of the squared differences from the mean

 Standard deviation – the square root of the variance


Find the variance and standard deviation for the data below.
5, 10, 15, 20, 25
Step 1: Find the mean
(5 + 10 + 15 + 20 + 25) / 5 = 15
Step 2: Subtract the mean from each data value
5 – 15 = -10
10 – 15 = -5
15 – 15 = 0
20 – 15 = 5
25 – 15 = 10
Step 3: Square the answers and find the average
[(-10)^2 + (-5)^2 + (0)^2 + (5)^2 + (10)^2] / 5 = 250 / 5 = 50
Variance: 50
Standard Deviation: √50 = 7.07
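
The three steps translate directly into code. This sketch follows the notes in dividing by the number of data values (often called the population variance); many calculators and libraries default to dividing by n - 1 instead, so results may differ there. The function names are illustrative.

```python
import math

def variance(data):
    """Average of the squared differences from the mean (divides by n)."""
    mean = sum(data) / len(data)                      # Step 1
    squared_diffs = [(x - mean) ** 2 for x in data]   # Steps 2 and 3
    return sum(squared_diffs) / len(data)

def standard_deviation(data):
    """Square root of the variance."""
    return math.sqrt(variance(data))

data = [5, 10, 15, 20, 25]
print(variance(data), round(standard_deviation(data), 2))
```

The same two functions can be reused to check the practice problems that follow by passing in each data list.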

Find the variance and standard deviation for the data listed below
1) 10, 11, 12, 13, 16
Mean: (10 + 11 + 12 + 13 + 16) / 5 = 12.4
10 – 12.4 = -2.4
11 – 12.4 = -1.4
12 – 12.4 = -0.4
13 – 12.4 = 0.6
16 – 12.4 = 3.6
Variance: [(-2.4)^2 + (-1.4)^2 + (-0.4)^2 + (0.6)^2 + (3.6)^2] / 5 = 4.24
Standard Deviation: √4.24 = 2.06

2) 16, 19, 18, 36, 50
Mean: (16 + 19 + 18 + 36 + 50) / 5 = 27.8
16 – 27.8 = -11.8
19 – 27.8 = -8.8
18 – 27.8 = -9.8
36 – 27.8 = 8.2
50 – 27.8 = 22.2
Variance: [(-11.8)^2 + (-8.8)^2 + (-9.8)^2 + (8.2)^2 + (22.2)^2] / 5 = 174.56
Standard Deviation: √174.56 = 13.21

3) 6, 22, 30, 40
Mean: (6 + 22 + 30 + 40) / 4 = 24.5
6 – 24.5 = -18.5
22 – 24.5 = -2.5
30 – 24.5 = 5.5
40 – 24.5 = 15.5
Variance: [(-18.5)^2 + (-2.5)^2 + (5.5)^2 + (15.5)^2] / 4 = 154.75
Standard Deviation: √154.75 = 12.44


Get the mean of the data 15, 34.4, 56, 35
(15 + 34.4 + 56 + 35) / 4 = 35.1

 What do we measure in Statistics? Data
 What is data? Anything that contains information
 What is not data? Emotions (things that do not contain any information)
 Is love data? No, because love is not measurable
 How do we get the average? Add all the values, then divide by the number of values
 How do we get the mean? Add all the values, then divide by the number of data values
 How do we get the median? First arrange the data in order, then get the middle value; if the number of data values is even, get the 2 middle values and divide their sum by 2
 How do we get the mode? Get the value that occurs most often
 What is variability in Statistics? Dispersion / how widely spread the data are
 What does range measure? It measures the dispersion (spread) of the data
 How do we solve for the variance? Get the average of the squared differences from the mean

Solve for the variance of the data: 9, 10, 13, 15
Mean: (9 + 10 + 13 + 15) / 4 = 11.75
9 – 11.75 = -2.75
10 – 11.75 = -1.75
13 – 11.75 = 1.25
15 – 11.75 = 3.25
Variance: [(-2.75)^2 + (-1.75)^2 + (1.25)^2 + (3.25)^2] / 4 = 5.6875 ≈ 5.69
What is the Standard Deviation? √5.69 = 2.39

Solve for the variance of the data: 13, 60, 63, 65
Mean: (13 + 60 + 63 + 65) / 4 = 50.25
13 – 50.25 = -37.25
60 – 50.25 = 9.75
63 – 50.25 = 12.75
65 – 50.25 = 14.75
Variance: [(-37.25)^2 + (9.75)^2 + (12.75)^2 + (14.75)^2] / 4 = 465.6875 ≈ 465.69
What is the Standard Deviation? √465.69 = 21.58


Business Data Presentation
 Written Reports
 Verbal Presentations
 Advertisements
 Budgets
 Issues
 Growth
 Results and analytics

 Bar graphs
 Pareto graph
 Pie graph
 Time series graph
 Scatter Diagram

 Horizontal bar graph
 Vertical bar graph
 Pareto graph

[Figure: bar graph; x-axis: Colors (Black, Blue, Red, Pink), y-axis: People (0 to 10)]

 It consists of bars
 Bars of the same width
 There are spaces between them
 Use a scale of 0 – 9 units

 Pie graph – uses a circle
 Divide it into sections proportional to the data (representing parts of a whole)
 Use a protractor (if manual)
 Get the percentage of each category / class from the whole, then multiply by 360 to get the angle
 Label properly and make it presentable
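
The angle calculation can be sketched as follows. The notes do not give the numbers behind their pie graph, so this example assumes the commuting frequencies from the first frequency table purely for illustration; the variable names are also illustrative.

```python
# Assumed data for illustration: commuting frequencies from the first table
frequency = {"Automobile": 14, "Bus": 7, "Walk": 5, "Train": 4}
total = sum(frequency.values())

# Share of the whole, multiplied by 360 degrees, gives each section's angle
angles = {name: count / total * 360 for name, count in frequency.items()}

for name, angle in angles.items():
    print(f"{name}: {angle:.0f} degrees")
```

The angles should add up to a full circle of 360 degrees, which is a useful check before drawing with a protractor.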

[Figure: pie graph "Restaurants preferred by Students"; categories: KFC, Mcdo, Jollibee, Shakeys]

 When data are collected over a period of time (hours, days, weeks, etc.),
they can be analyzed using a time series graph
 The scale along x – axis represents the time
 Y – axis represents data values
 Data values are connected with broken line segments

The data show the number of sports-talk radio stations in the United States over the last several years. Draw a time series graph and suggest any trends that might appear
Year      1995   1997   1999   2001   2003
Number    146    224    258    342    427

[Figure: time series graph "Number of Sports - Talk Radio Stations in the United States over the last several years"; x-axis: Years (1995 to 2003), y-axis: Number of Sports - Talk Radio Stations (0 to 400)]
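
One simple way to look for a trend, besides reading the graph, is to compute the change between consecutive observations. This is only a sketch of that idea, not a method stated in the notes; the variable names are illustrative.

```python
years = [1995, 1997, 1999, 2001, 2003]
stations = [146, 224, 258, 342, 427]

# Change between each pair of consecutive observations
changes = [later - earlier for earlier, later in zip(stations, stations[1:])]

for year, change in zip(years[1:], changes):
    print(f"{year}: {change:+d} stations compared with two years earlier")

# True when every observation is higher than the one before it
steadily_increasing = all(change > 0 for change in changes)
print(steadily_increasing)
```

If every change is positive, the line segments on the graph all slope upward, which points to an increasing trend over the period.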

 Correlation (indicates the relationship between two sets of data)
 Use a scatter diagram to compare data
 Positive correlation: as one variable increases, the other also increases
 The more scattered the points, the weaker the correlation
 A scatter plot / diagram uses Cartesian coordinates to display the values of two variables
 Purpose: This diagram is used to identify the type of relationship (if any) between two quantitative variables
 The data are displayed as a collection of points, each having the value of one variable determining the position on the X – axis and the value of the other variable determining the position on the Y – axis

The data show the number of concert shows of some musical groups and the gross incomes, in millions of dollars, the groups earned from these tours. Construct and analyze a scatter plot for the data:

Number          63    54    88    125   96    72
Gross Income    134   83    76    118   108   106

[Figure: scatter plot "Concert Shows and Income"; x-axis: Number (0 to 150), y-axis: Gross Income (0 to 150)]
Weak positive relationship
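
The notes judge the relationship by eye from the scatter plot. As a supplementary check that is not part of the notes, the Pearson correlation coefficient r puts a number on it: values near +1 or -1 indicate a strong relationship and values near 0 a weak one. The function name `pearson_r` is illustrative.

```python
import math

shows = [63, 54, 88, 125, 96, 72]
income = [134, 83, 76, 118, 108, 106]   # gross income, millions of dollars

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of the products of the paired differences from each mean
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

r = pearson_r(shows, income)
print(round(r, 2))
```

A small positive r is consistent with the "weak positive relationship" read from the plot.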
The data show the tuition in thousands of pesos and the number of full-time faculty for eight selected colleges in the United States. Draw a scatter diagram and determine the type of relationship.

Tuition           12k   23k   16k   8k    22k   19k   14k   22k
No. of Faculty    14    188   177   85    141   92    58    206

[Figure: scatter plot "Colleges in the United States and Tuition"; x-axis: Number of Faculty (0 to 250), y-axis: Price of Tuition (0 to 25000)]
Weak Positive Relationship

 Pareto graph – a type of chart consisting of a vertical bar graph and a line graph, where the data values are represented in descending order by bars, and the cumulative total percentage is represented by a line

Purpose of Pareto Graph


 The purpose of the Pareto chart / graph is to highlight the most important
among a set of factors
 This graph is mostly used in quality control, for it could show the most
common sources of defects, the most influential or significant factor in a
phenomenon and the gravity / weight of one factor to the whole

Steps to make a Pareto Graph


 Draw the axes and scale the same way as is done for the vertical bar
graph, BUT! Take note:
 The left – side vertical axis of the pareto chart is labeled
Frequency (the number of counts for each category)
 The right – side vertical axis of the pareto chart is the cumulative
percentage
 The horizontal axis of the pareto chart is labeled with the group
names of your response variables
 The bars should start with the largest data value and descend to the
smallest data value
 The bars should touch each other
 Calculate the percentage for each category
 Place a dot above the first bar indicating the percentage of the first
category
 Add the subtotals for the first and second categories, and place a dot
above the second bar indicating that sum
 To that sum add the subtotal for the third category, and place a dot
above the third bar for that new sum
 Continue the process for all the bars
 Connect the dots, starting at the top of the first bar. The last dot should
reach 100 percent on the right scale

The following data shows the number of tons of trash recycled in a certain city
for a given week. Draw a pareto chart for the data:

Type Amount
Paper 635
Aluminum 423
Glass 187
Plastic 98

[Figure: Pareto chart of the recycling data; x-axis: Types of Trash, y-axis: Amount of Trash (0 to 700)]
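
The numbers needed to draw this Pareto chart (bar order and the cumulative percentage dots) can be worked out with a short script. This is a sketch of the steps listed earlier, using the recycling data; the variable names are illustrative.

```python
trash = {"Paper": 635, "Aluminum": 423, "Glass": 187, "Plastic": 98}   # tons

# Bars go from the largest data value down to the smallest
ordered = sorted(trash.items(), key=lambda item: item[1], reverse=True)
total = sum(trash.values())

cumulative = []
running = 0
for name, amount in ordered:
    running += amount   # add this category's subtotal to the running sum
    cumulative.append((name, amount, running / total * 100))

for name, amount, cum_pct in cumulative:
    print(f"{name}: {amount} tons, cumulative {cum_pct:.1f}%")
```

Each cumulative percentage is where the dot goes above the corresponding bar, and the last one should reach 100 percent on the right-side scale.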
The following data show the number of registered motorcycles in certain
municipalities for a specific year. Draw a pareto chart for the data

Municipality Number
West Irwin 54
Cedar Creek 32
Keystone 41
Mount Newton 36
South Penn. 18

[Figure: Pareto chart of the motorcycle data; x-axis: Municipalities, y-axis: Number of Motorcycles (0 to 60)]
