Week 1,2 Instructor

1. The number of bad loans is largest for borrowers aged 42-45 years (216 loans), but that bucket's bad loan rate is only 2.4%, slightly below the overall rate of 2.5%. 2. Outside the sparsely populated fringe buckets, the bad loan rate generally falls with age, from 4.5% for ages 21-24 to 1.1% for ages 57-60. 3. The <21 and >60 buckets hold too few loans (9 and 6) to support conclusions about their bad loan rates.

Uploaded by

kins


BUS 232

Data and Decisions |

(Business Statistics)
Instructor: Negar Ganjouhaghighi
What is Statistics?

A science dealing with the collection, analysis, interpretation, and presentation of numerical data

Collect Data → Analyze Data → Interpret Data → Present Findings
Data in Business Disciplines: Examples

• Revenue
• Bond rates
• Salaries
• Forecast sales
• CPU time
• # of “hits” on the net
• Time spent on the net per day
• # of productions/day
• Amount of inventory
• Quantity of raw materials
• Storage capacity
• Age of employees
• Foreign exchange rate
• Market share
• Taxes
Population vs Sample

Population: a collection of persons, objects, or items of interest
• All automobiles
• All Ford Escape crossover vehicles produced in 2021

Sample: a portion of the whole, and if properly taken, representative of the whole
• Random sample of 4,000 automobiles
• Random sample of 100 Ford Escape crossover vehicles produced in 2021

Census: when analysts gather data from the whole population for a given measurement of interest. Example: Canadian Census
2 main branches of statistics

Descriptive Statistics
• using data gathered on a group to
describe or reach conclusions about
that same group

Inferential Statistics
• gathers data from a sample and uses
the statistics generated to reach
conclusions about the population
from which the sample was taken
Descriptive vs Inferential Statistics: an example

Descriptive statistics (DS):
• Heights of a random sample of 10 male basketball players are measured
• Mean/Median/Mode/Variance/Range…

Inferential statistics (IS):
• Test and compare the mean height of the above sample with the mean height of the whole population to see if basketball players are taller than the average male in the population
Inferential Statistics

• Aka. Inductive Statistics
• Widely used in pharmaceutical research
• Starts from a hypothesis and makes a statement about the population
• Allows studying a wide range of phenomena without having to conduct a census
Parameter vs Statistic

Parameter
• Descriptive measure of the population
• Greek letters
• Population mean ($\mu$)
• Population SD ($\sigma$)

Statistic
• Descriptive measure of a sample
• Roman letters
• Sample mean ($\bar{x}$)
• Sample variance ($s^2$)
Parameter vs Statistic

Example: determine the average height of the students in this class
• The population?
• The parameter?
• Randomly select 10 students
• Inferences about parameters are made under uncertainty
Statistical Inference is inference about a … ?

LEFT RIGHT

Population Sample
Variables, Data, and Data Measurement

Most business statistics studies contain Variables, Measurement, and Data.

Variable
• a characteristic of any entity being studied that is capable of taking on different values
• Example: labour productivity

Measurement
• a standard process used to assign numbers to particular attributes or characteristics of a variable
• Example: products produced per hour
Data Case Studies with Big Results

GOOGLE:
Working with the U.S. Centers for Disease Control, tracks when users are inputting search terms related to flu topics, to help predict which regions may experience outbreaks.

Netflix:
Collects data from its users, including viewing time, platform searches for keywords, and metadata related to content abandonment, such as content pause time, rewinds, and rewatches. Using the data, they predict what a viewer is likely to watch and give a personalized watchlist to a user.
Levels of Data

Nonmetric/Qualitative Data
• Nominal: categories. Examples: sex, religion, student ID
• Ordinal: ranks. Example: patient CTAS in ED

Metric/Quantitative Data
• Interval: equal intervals. Example: % change in employment
• Ratio: known, meaningful zero. Examples: height, mass, time
A researcher collects demographic data from her
participants. She asks participants for their city of
birth.
Which level of measurement is this?
LEFT RIGHT

Nominal Interval
She then asks participants to report the number of
hours they spent exercising in the past week.
Which level of measurement is this?
LEFT RIGHT

Interval Ratio
Big Data

Big data: a collection of

large and complex datasets
from different sources that
are difficult to process using
traditional data management
and processing applications.
So:
BUSINESS ANALYTICS
Business Analytics

Application of
processes and
techniques that
transform raw data into
meaningful
information to improve
decision making

Source
Canadian Occupational Projectio
Business Analytics

Descriptive Analytics
• Simplest and most commonly used
• Describe what is happening in business
• Data mining, data visualisation, statistics, …

Predictive Analytics
• Finds relationships in the data that are not found in the first step
• Make predictions about the future
• Regression, time-series, forecasting, simulation, ML, …

Prescriptive Analytics
• Still in early stages of development
• Takes uncertainty into account
• Recommend ways to mitigate risk
• Aims to optimize the performance of a system
Case Study 1

Assume that you are the chief risk officer for a bank that has disbursed 60816 auto loans in the quarter between April-June 2021.

According to data, you have had a total of 1524 bad loans, or a bad rate of 2.5%.

You want to analyze the bad rate across several individual variables. Based on your experience, the borrower’s age is a critical factor.

Age      Total # of loans   # of bad loans   # of good loans
<21                     9                2                 7
21-24                 310               14               296
24-27                 511               20               491
27-30                4000              172              3828
30-33                4568              169              4399
33-36                5698              188              5510
36-39                8209              197              8012
39-42                8117              211              7906
42-45                9000              216              8784
45-48                7600              152              7448
48-51                6000               84              5916
51-54                4000               64              3936
54-57                2000               26              1974
57-60                 788                9               779
>60                     6                0                 6
Case Study 1

1. The distribution of loans across age groups is a reasonably smooth normally distributed curve
2. The max bad loans are in the age bucket 42-45 years (doesn’t necessarily mean the risk is also higher)
3. Not enough data for the fringe buckets (<21 and >60 years)
Case Study 1

Normalized Plot

Age      Total # of loans   # of bad loans   # of good loans   % Bad loans   % Good loans
<21                     9                2                 7         22.2%          77.8%
21-24                 310               14               296          4.5%          95.5%
24-27                 511               20               491          3.9%          96.1%
27-30                4000              172              3828          4.3%          95.7%
30-33                4568              169              4399          3.7%          96.3%
33-36                5698              188              5510          3.3%          96.7%
36-39                8209              197              8012          2.4%          97.6%
39-42                8117              211              7906          2.6%          97.4%
42-45                9000              216              8784          2.4%          97.6%
45-48                7600              152              7448          2.0%          98.0%
48-51                6000               84              5916          1.4%          98.6%
51-54                4000               64              3936          1.6%          98.4%
54-57                2000               26              1974          1.3%          98.7%
57-60                 788                9               779          1.1%          98.9%
>60                     6                0                 6          0.0%         100.0%

Conclusion: As the borrowers are getting older, they are less likely to default on their loans
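
The normalization step can be sketched in a few lines of Python. This is only an illustration built from the counts in the table; the names `loans`, `bad_rate`, and `rates` are assumptions made for the sketch, not part of the case study.

```python
# (age bucket, total loans, bad loans), copied from the Case Study 1 table
loans = [
    ("<21", 9, 2), ("21-24", 310, 14), ("24-27", 511, 20),
    ("27-30", 4000, 172), ("30-33", 4568, 169), ("33-36", 5698, 188),
    ("36-39", 8209, 197), ("39-42", 8117, 211), ("42-45", 9000, 216),
    ("45-48", 7600, 152), ("48-51", 6000, 84), ("51-54", 4000, 64),
    ("54-57", 2000, 26), ("57-60", 788, 9), (">60", 6, 0),
]


def bad_rate(total, bad):
    """Percentage of loans that went bad."""
    return 100 * bad / total


# Bad rate per bucket, rounded to one decimal like the slide
rates = {age: round(bad_rate(total, bad), 1) for age, total, bad in loans}

# Portfolio-level totals and overall bad rate
total_loans = sum(total for _, total, _ in loans)
total_bad = sum(bad for _, _, bad in loans)
overall_rate = round(bad_rate(total_loans, total_bad), 1)

for age, rate in rates.items():
    print(f"{age:>6}: {rate:4.1f}% bad")
print(f"overall: {overall_rate}% bad")
```

Dividing by the bucket size is what separates "most bad loans" (42-45 years) from "highest risk": the raw counts follow the number of loans, while the rates fall with age.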
Visualizing Data With Charts and Graphs

Data visualization is useful for:
• Exploring data structure
• Detecting outliers
• Data cleaning
• Spotting local patterns
• Identifying trends and clusters
• Presenting results
Frequency Distributions

Ungrouped Data
• Raw data, have not been summarized in any way

Grouped Data
• Data that have been organized into a frequency
distribution
Raw data refers to which type of data?

LEFT RIGHT

Grouped Ungrouped
Let’s solve Problem 2.1…
Let’s solve Problem 2.3…
Quantitative Data Graphs

Histograms
• Useful tool for differentiating the frequencies of class intervals

Frequency Polygons
• Similar to a histogram, but each class frequency is plotted as a dot at the class midpoint

Ogive
• Cumulative frequency polygon
• Running totals

Stem and leaf
• Separating the digits for each number of the data into a stem and a leaf
• Finding the outliers
Let’s solve Problem 2.10…
Qualitative Data Graphs

Pie Charts
• Shows the relative magnitude of parts to a whole
• Less accurate

Bar Charts
• Easier to see the difference between similar categories

Pareto Charts
• A vertical bar chart that displays the most common types of defects, ranked in order of occurrence
Scatter plot data

• A 2-dimensional graph plot of pairs of points from 2 numerical variables
• Often used to examine possible relationships between 2 variables

Temperature (°C)   Ice Cream Sales
14.2               $215
16.4               $325
11.9               $185
15.2               $332
18.5               $406
22.1               $522
19.4               $412
25.1               $614
23.4               $544
18.1               $421
22.6               $445
17.2               $408
Visualizing Time-Series Data

• Time series data: data gathered on a particular characteristic over a period of time at regular intervals (hours/weeks/years…)
• Visualizing with a line chart: see the trend

[Figure: time-series data visualized with a line chart; intervals: one year]
Descriptive Statistics

• Data visualization: general observations about the shape and spread of the data
• Statistics: more complete understanding of the data
• Measures of central tendency
• Measures of variability
• Measures of shape
Measures of Central Tendency

• Yield information about the centre, or middle part, of a group of numbers.
• Yield such information as the average, the middle point, and the most frequently occurring point
• Do not focus on the span of the data
• The common measures of central tendency: Mean, Median, Mode, Percentile, Quartile

Mean

• The arithmetic mean: the average of a group of numbers

Population Mean: $\mu = \frac{\sum x_i}{N}$
Sample Mean: $\bar{x} = \frac{\sum x_i}{n}$

• Example: Salaries of data analysts in top 6 companies in Vancouver, BC:

Company      Annual Salary
Si Systems   $131,508
DISYS        $107,808
UBC          $106,272
TransLink    $99,181
CRD          $97,160
MSi Corp     $94,944

• Mean salary of a data analyst: (131,508 + 107,808 + 106,272 + 99,181 + 97,160 + 94,944)/6 = $106,145.50
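
As a quick check of the mean formula, the six salaries from the table can be averaged in Python. The variable names are assumptions chosen for this sketch.

```python
from statistics import mean

# Annual salaries (CAD) of data analysts, from the slide's table
salaries = {
    "Si Systems": 131_508,
    "DISYS": 107_808,
    "UBC": 106_272,
    "TransLink": 99_181,
    "CRD": 97_160,
    "MSi Corp": 94_944,
}

# Mean = sum of the values divided by how many values there are
mean_salary = sum(salaries.values()) / len(salaries)

# The standard library gives the same answer
library_mean = mean(salaries.values())

print(f"Mean salary: ${mean_salary:,.2f}")
```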

Median

• The median: the middle value in an ordered array of numbers
• Sort the numbers from smallest to largest
• Odd number of terms: find the middle number
• Even number of terms: find the average of the middle 2 terms
Median - Example

• Find the median temperature of the day August 1st over the last 10 years.

Date         Mean Daily Temperature (°C)
1 Aug 2022   20.6
1 Aug 2021   22
1 Aug 2020   19.2
1 Aug 2019   20.5
1 Aug 2018   18.8
1 Aug 2017   19.9
1 Aug 2016   17.8
1 Aug 2015   20.6
1 Aug 2014   20.2
1 Aug 2013   17.9

• Sort the data: 17.8, 17.9, 18.8, 19.2, 19.9, 20.2, 20.5, 20.6, 20.6, 22
• The number of terms: 10 (even number)
• The median will be the average of the 2 middle terms: 19.9 and 20.2
• (19.9+20.2)/2=20.05
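
The sort-then-pick-the-middle procedure can be written out directly. The sketch below is one possible implementation; `median_by_hand` is a name made up for this illustration, and `statistics.median` from the standard library is shown only as a cross-check.

```python
from statistics import median

# Mean daily temperature (°C) on 1 August, 2022 back to 2013
temps = [20.6, 22, 19.2, 20.5, 18.8, 19.9, 17.8, 20.6, 20.2, 17.9]


def median_by_hand(values):
    """Sort, then take the middle term (odd n) or average the middle two (even n)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


print(round(median_by_hand(temps), 2))
print(round(median(temps), 2))  # library cross-check
```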
Mode

• The mode: the most frequently occurring value in a set of data

• Sorting the data helps to locate the mode
• Example: Inflation rate, Canada, 1995-2022
1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008
2.1 1.6 1.6 1 1.7 2.7 2.5 2.3 2.8 1.9 2.2 2.2 2.1 2.4
2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022
0.3 1.8 2.9 1.5 0.9 1.9 1.1 1.4 1.6 2.3 2.0 0.7 3.4 5.6
• Sorted: 0.3 0.7 0.9 1 1.1 1.4 1.5 1.6 1.6 1.6 1.7 1.8 1.9 1.9 2 2.1 2.1 2.2 2.2 2.3 2.3 2.4 2.5 2.7 2.8 2.9 3.4 5.6

• The mode: 1.6 %
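
Counting occurrences is enough to find the mode. A minimal sketch using the inflation rates above (the names `counts`, `highest`, and `modes` are assumptions for the illustration):

```python
from collections import Counter

# Canadian inflation rate (%), 1995-2022, in year order
inflation = [
    2.1, 1.6, 1.6, 1.0, 1.7, 2.7, 2.5, 2.3, 2.8, 1.9, 2.2, 2.2, 2.1, 2.4,
    0.3, 1.8, 2.9, 1.5, 0.9, 1.9, 1.1, 1.4, 1.6, 2.3, 2.0, 0.7, 3.4, 5.6,
]

# Count how often each value occurs
counts = Counter(inflation)
highest = max(counts.values())

# Keep every value that reaches the highest count (a data set can have several modes)
modes = [value for value, count in counts.items() if count == highest]

print(modes, "occurs", highest, "times")
```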

What is the median of the following dataset?
5 3 6 0 7
LEFT RIGHT

5.5 5
Percentiles

• Measures of central tendency that divide a group of data into 100 parts.
• There are 99 percentiles (and not 100), as it takes 99 dividers to separate a group of data into 100 parts
• Example: the 87th percentile value:
• at least 87% of the data are below the value and
• no more than 13% of the data are above the value
• Widely used in reporting test results
Percentiles

• Steps in determining the location of a percentile:
• Sort the numbers in ascending order
• Calculate the percentile location (i) by:

$i = \frac{P}{100} \cdot N$

• P=the percentile of interest
• i=percentile location
• N=number in the data set
• a) if i is a whole number: the Pth percentile is the average of the values at the ith and (i+1)th locations
• b) if i is not a whole number: the Pth percentile value is located at the whole number part of i+1
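
The two-case location rule can be sketched as a small function. This follows the rule stated above rather than the interpolation methods that spreadsheet or library functions may use, so results can differ from those tools; the function name `percentile` and the restriction to whole-number P are assumptions of the sketch.

```python
def percentile(data, p):
    """Pth percentile (p a whole number from 1 to 99) by the location rule i = (P/100) * N."""
    ordered = sorted(data)
    n = len(ordered)
    if (p * n) % 100 == 0:
        # Case a: i is a whole number, so average the ith and (i+1)th values
        i = p * n // 100
        return (ordered[i - 1] + ordered[i]) / 2
    # Case b: i is not whole, so use the value at the whole-number part of i, plus 1
    i = p * n // 100 + 1
    return ordered[i - 1]


print(percentile([14, 12, 19, 23, 5, 13, 28, 17, 2], 40))  # location 3.6 -> 4th value
print(percentile([14, 12, 19, 23, 5, 13, 28, 17], 50))     # location 4 -> average of 4th and 5th
```

Working with `p * n` as integers avoids floating-point surprises when deciding whether the location is a whole number.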
Percentiles - example

• Determine the 40th percentile of the following 9 numbers:

14,12,19,23,5,13,28,17,2

• Step1: Sort the data from smallest to largest
• 2,5,12,13,14,17,19,23,28
• Step2: calculate the percentile location: 40% * 9 =3.6
• Since the location is not a whole number, we take the next integer greater than 3.6, which is 4. So the 40th percentile is located at the 4th value: 13.
• So, 13 is the 40th percentile.
Percentiles - example

• Determine the 50th percentile of the following 8 numbers:

14,12,19,23,5,13,28,17

• Step1: Sort the data from smallest to largest

• 5,12,13,14,17,19,23,28
• Step2: calculate the percentile location: 50% * 8 =4
• Since the location is a whole number, we find the average value of the
4th and 5th number: (14+17)/2=15.5
• So, 15.5 is the 50th percentile.
Quartiles

• Measures of central tendency that divide a group of data into 4 subgroups or parts
Quartiles - Example

• Suppose we want to determine the values of Q1, Q2, and Q3 for the following numbers: 2, 5, 6, 7, 10, 22, 13, 14, 16, 65, 45, 12.
• Step 1: Put the numbers in order: 2, 5, 6, 7, 10, 12, 13, 14, 16, 22, 45, 65
• Count how many numbers there are in your set: 12
• 12/4=3: 3 numbers in each quartile:
• 2, 5, 6, | 7, 10, 12 | 13, 14, 16, | 22, 45, 65
• Q1: (6+7)/2=6.5, so the first quartile value is 6.5
• Q2: (12+13)/2=12.5
• Q3: (16+22)/2=19
Let’s solve Problem 3.7…
Measures of Variability

• Describe the spread or the dispersion of a set of data
• Used with measures of central tendency, they provide more accurate information about the data
• 7 main measures of variability:
• Range
• Interquartile range
• Mean absolute deviation
• Variance
• Standard deviation
• z scores
• Coefficient of variation
Range and Interquartile Range

• Range: the difference between the largest and the smallest values of a
data set.
• Advantage: ease of computation
• Disadvantage: affected by extreme values
• Interquartile Range (IQR): the range of values between the first and third quartiles.
• It is the range of the middle 50% of the data
• Determined by IQR = Q3 - Q1
Interquartile Range - Example

• Canada Imports - Top Categories:

Category                  Value (Billion USD)
Cars                      28
Car parts                 20
Trucks                    15
Crude oil                 14
Processed petroleum oil   14
Phones                    11
Computers                 9
Medications               8
Turbo jets                6
Gold                      6

• Q1 is the 3rd value from the bottom: 8
• Q3 is the 8th value from the bottom: 15
• The IQR is Q3 - Q1 = 15 - 8 = 7
• The middle 50% of the top 10 imports to Canada spans a range of $7 billion (USD)
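
Quartiles are the 25th, 50th, and 75th percentiles, so the IQR can be computed with the percentile location rule from the Percentiles slides. The sketch below is one way to do that; the helper name `percentile` and the variable names are assumptions, and library quartile functions may use other conventions and return slightly different values.

```python
def percentile(data, p):
    """Pth percentile (whole-number p) using the location rule i = (P/100) * N."""
    ordered = sorted(data)
    n = len(ordered)
    if (p * n) % 100 == 0:               # whole-number location: average two neighbours
        i = p * n // 100
        return (ordered[i - 1] + ordered[i]) / 2
    return ordered[p * n // 100]         # otherwise: value at whole part of i, plus 1


# Top 10 import categories for Canada, value in billion USD
imports = [28, 20, 15, 14, 14, 11, 9, 8, 6, 6]

q1 = percentile(imports, 25)   # location 2.5 -> 3rd value from the bottom
q3 = percentile(imports, 75)   # location 7.5 -> 8th value from the bottom
iqr = q3 - q1

# The 12 numbers from the Quartiles example, where every location is whole
numbers = [2, 5, 6, 7, 10, 22, 13, 14, 16, 65, 45, 12]
quartiles = [percentile(numbers, p) for p in (25, 50, 75)]

print(q1, q3, iqr)
print(quartiles)
```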
Deviation from the mean

• The difference between the mean and each data value (computed below as mean minus value)
• Sum of the deviations from the arithmetic mean is always zero

Apple annual revenue 2015-2021 ($bn), Mean=263.4

YEAR   Revenue   Deviation from the mean
2015   233.6     263.4-233.6=29.8
2016   215.4     263.4-215.4=48.0
2017   229.0     263.4-229.0=34.4
2018   265.4     263.4-265.4=-2.0
2019   260.1     263.4-260.1=3.3
2020   274.3     263.4-274.3=-10.9
2021   365.8     263.4-365.8=-102.4

• The rounded deviations above add up to 0.2 only because the mean was rounded to 263.4; with the unrounded mean the sum is exactly zero
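
The zero-sum property is easy to confirm numerically. In this sketch the mean of the seven revenue figures works out to about 263.4, and the deviations are taken as mean minus value to match the table's sign convention; the variable names are assumptions.

```python
# Apple annual revenue 2015-2021 ($bn), from the slide
revenue = {2015: 233.6, 2016: 215.4, 2017: 229.0, 2018: 265.4,
           2019: 260.1, 2020: 274.3, 2021: 365.8}

mean_revenue = sum(revenue.values()) / len(revenue)

# Deviation of each year from the mean (mean minus value, as in the table)
deviations = {year: mean_revenue - value for year, value in revenue.items()}

for year, dev in deviations.items():
    print(year, round(dev, 1))

# Positive and negative deviations cancel out (up to floating-point error)
print("sum of deviations:", round(sum(deviations.values()), 6))
```

Because the deviations always cancel, their plain average says nothing about spread, which is why the next measures use absolute values or squares.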
Mean Absolute Deviation

• The average of the absolute values of the deviations around the mean for a set of numbers

MAD = $\frac{\sum |x_i - \mu|}{N}$

• Less useful in statistics than other measures of variability
• Occasionally used in the field of forecasting as a measure of error

Apple annual revenue 2015-2021 ($bn)

YEAR   Revenue   Deviation from the mean   Absolute Deviation
2015   233.6     29.8                      29.8
2016   215.4     48.0                      48.0
2017   229.0     34.4                      34.4
2018   265.4     -2.0                      2.0
2019   260.1     3.3                       3.3
2020   274.3     -10.9                     10.9
2021   365.8     -102.4                    102.4
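
A minimal sketch of the MAD calculation on the same revenue figures, treating the seven years as the whole population (the names `mu` and `mad` are assumptions for the illustration):

```python
# Apple annual revenue 2015-2021 ($bn)
revenue = [233.6, 215.4, 229.0, 265.4, 260.1, 274.3, 365.8]

mu = sum(revenue) / len(revenue)

# Absolute deviations ignore the sign, so they no longer cancel out
absolute_deviations = [abs(x - mu) for x in revenue]

# MAD = average of the absolute deviations
mad = sum(absolute_deviations) / len(revenue)

print(f"MAD = {mad:.1f} ($bn)")
```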
Variance

• The average of the squared deviations about the arithmetic mean for a set of numbers

Population Variance: $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$

Apple annual revenue 2015-2021 ($bn)

YEAR   Revenue   Deviation from the mean   Squared Deviation
2015   233.6     29.8                      886.3
2016   215.4     48.0                      2301.3
2017   229.0     34.4                      1181.4
2018   265.4     -2.0                      4.1
2019   260.1     3.3                       10.7
2020   274.3     -10.9                     119.4
2021   365.8     -102.4                    10491.6
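
The population variance for the table can be sketched as follows; taking the square root then gives the standard deviation in the original units ($bn). Variable names are assumptions made for this sketch.

```python
import math

# Apple annual revenue 2015-2021 ($bn)
revenue = [233.6, 215.4, 229.0, 265.4, 260.1, 274.3, 365.8]

mu = sum(revenue) / len(revenue)

# Squared deviations match the last column of the table
squared_deviations = [(x - mu) ** 2 for x in revenue]

# Population variance: average of the squared deviations (divide by N)
variance = sum(squared_deviations) / len(revenue)

# Standard deviation: square root of the variance
std_dev = math.sqrt(variance)

print([round(d, 1) for d in squared_deviations])
print(f"variance = {variance:.1f}, standard deviation = {std_dev:.1f}")
```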
Standard Deviation

• Square root of variance: $\sigma = \sqrt{\sigma^2}$
• Popular measure of variability
• Advantage over Variance: SD is expressed in the same units as the raw data
Meaning of Standard Deviation

• Empirical Rule
• Chebyshev’s theorem

Population vs Sample Variance and SD
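
The difference between the two versions is only the divisor: the population formulas divide by N, while the sample formulas divide by n - 1. Python's standard `statistics` module provides both. The sketch below reuses the Apple revenue figures purely as an illustration; treating them as a sample is an assumption made here, not something the slides state.

```python
from statistics import pstdev, pvariance, stdev, variance

# Apple annual revenue 2015-2021 ($bn)
revenue = [233.6, 215.4, 229.0, 265.4, 260.1, 274.3, 365.8]

# Population versions: divide the sum of squared deviations by N
pop_var = pvariance(revenue)
pop_sd = pstdev(revenue)

# Sample versions: divide by n - 1, which makes the result slightly larger
sample_var = variance(revenue)
sample_sd = stdev(revenue)

print(f"population: variance {pop_var:.1f}, SD {pop_sd:.2f}")
print(f"sample:     variance {sample_var:.1f}, SD {sample_sd:.2f}")
```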
z Scores

• Represents the number of standard deviations a value (x) is above or below the mean of a set of numbers
• Only for normally distributed data
• Allows the distance of a raw value from the mean to be translated into SDs

$z = \frac{x - \mu}{\sigma}$

• If the z score is
• Positive: the raw value (x) is above the mean
• Negative: the raw value (x) is below the mean
z Scores - Example

• A normally distributed data set:
• Mean=50
• SD=10
• What is the z score for a value of 70?
• z = (70-50)/10 = 2, so the value lies 2 standard deviations above the mean
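
The z score calculation is a one-line function. The name `z_score` is an assumption for this sketch; the second call uses a made-up value of 40 only to show a negative result.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


print(z_score(70, mean=50, sd=10))  # value from the example: above the mean
print(z_score(40, mean=50, sd=10))  # hypothetical value below the mean
```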
Coefficient of Variation

• The ratio of the standard deviation to the mean, expressed as a percentage

Coefficient of Variation: $CV = \frac{\sigma}{\mu}(100)\%$

• Useful in comparing the SDs that have been computed for data with different means. Assume the following 2 data sets:
• Data A: Mean=1000, SD=5: $CV_A = \frac{5}{1000}(100) = 0.5\%$
• Data B: Mean=10, SD=5: $CV_B = \frac{5}{10}(100) = 50\%$
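
The comparison of the two data sets can be reproduced with a short function (the name `coefficient_of_variation` is an assumption for this sketch):

```python
def coefficient_of_variation(sd, mean):
    """Standard deviation as a percentage of the mean."""
    return sd / mean * 100


cv_a = coefficient_of_variation(sd=5, mean=1000)
cv_b = coefficient_of_variation(sd=5, mean=10)

# Same SD, very different relative spread
print(f"CV_A = {cv_a:.1f}%, CV_B = {cv_b:.1f}%")
```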
Let’s solve Problem 3.20…
The more dispersed the data are, the larger the
range, the interquartile range, the variance, and
the standard deviation will be.
LEFT RIGHT

True False
Measures of Shape

• Tools that can be used to describe the shape of a distribution of data
• 2 important measures: Skewness and Kurtosis
• Box-and-whisker plots: great visualization tool

Skewness

• Symmetrical: the right half is a mirror of the left half
• Distribution skewed left: negatively skewed
• Distribution skewed right: positively skewed
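Skewness

• Pearsonian Coefficient of Skewness (named after Karl Pearson): compares the mean and median in light of the magnitude of the standard deviation

$S_k = \frac{3(\mu - M_d)}{\sigma}$, where $M_d$ = Median

• Suppose:
• Mean=29, Median=26, SD=12.3
• Then Sk = 3(29-26)/12.3 = +0.73; the positive sign indicates a distribution skewed to the right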
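
A minimal sketch of the Pearsonian coefficient, using the standard form 3 × (mean - median) / SD and the values from the example. The function name is an assumption.

```python
def pearson_skewness(mean, median, sd):
    """Pearsonian coefficient of skewness: 3 * (mean - median) / SD."""
    return 3 * (mean - median) / sd


sk = pearson_skewness(mean=29, median=26, sd=12.3)

# Positive: skewed right; negative: skewed left; zero: symmetrical
print(round(sk, 2))
```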
Should the empirical rule be used for data sets
that are highly skewed?
LEFT RIGHT

YES NO
Kurtosis

• Describes the amount of peakedness of a distribution

Box and Whisker Plot

• A.k.a Box Plot

• Determined from 5 specific numbers:
• The median (Q2)
• The lower quartile (Q1)
• The upper quartile (Q3)
• The minimum
• The maximum
Box and Whisker Plot - Example

• Suppose you have the math test results for a class of 15 students: 91 95 54 69 80 85 88 73 71 70 66 90 86 84 73
• First sort them:
• 54 66 69 70 71 73 73 80 84 85 86 88 90 91 95
• Min = 54, Lower quartile = 70, Median = 80, Upper quartile = 88, Max = 95
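
The five numbers behind the box plot can be computed as sketched below. Note the assumption: this version finds the quartiles as the medians of the lower and upper halves of the sorted data (leaving out the middle value when the count is odd), which is a common classroom method and not necessarily the one used on the slide; for these 15 scores it gives the same quartiles as the percentile location rule. The function name is hypothetical.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, lower quartile, median, upper quartile, max) via medians of halves."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    lower_half = ordered[:mid]
    # Skip the middle value itself when n is odd
    upper_half = ordered[mid + 1:] if n % 2 == 1 else ordered[mid:]
    return (ordered[0], median(lower_half), median(ordered),
            median(upper_half), ordered[-1])


scores = [91, 95, 54, 69, 80, 85, 88, 73, 71, 70, 66, 90, 86, 84, 73]
summary = five_number_summary(scores)
print(summary)
```

The box spans the lower to upper quartile with a line at the median, and the whiskers extend toward the minimum and maximum.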
Let’s solve Problem 3.31…

Construct a box-and-whisker plot for the following data. Do the data contain any outliers? Is the distribution of data skewed?
Let’s solve Problem 3.48…

The Globe and Mail compiled a list of the top 100 public companies in Canada according to profit. Leading the list is the Toronto-Dominion Bank, followed by the Bank of Nova Scotia. The following Excel descriptive statistics output describes the profits for these 100 companies. Study the output and describe in your own words what you can learn about the profits (shown in $ thousands) of these top 100 Canadian public companies.
Thank you all, We did it!

• We just finished the first 3 chapters of the book:

• Please read the book
• solve the end of chapter questions
• review the cases
• answer the concept check questions
• Don’t worry about the formulas
• all the formulas listed at the end of the chapter will be
provided to you for the midterm and final exam.
• We will use the concepts learned so far in the next session to
solve different problems using Excel.
