Business Analytics Module 2

The document discusses how Amazon uses sampling to estimate inventory accuracy in its warehouses. It samples inventory continuously throughout the year instead of doing annual full inventories, which are costly. Sampling allows estimating accuracy without checking all inventory, helping Amazon improve quality and lower costs.

Uploaded by

Sayantan Das

Business Analytics

SAMPLING AND ESTIMATION

Amazon Case Study

 Amazon uses sampling to answer an important managerial question. The company
decided long ago that its mission is to be Earth's most customer-centric company,
and it is obsessed with the customer experience. So when Amazon has an opportunity
to improve the customer experience through analytics, it usually focuses on whatever
is likely to have the highest and broadest positive impact, because it is a global
company. It looks for things like low prices, huge selection, and improvements in
delivery and convenience that are likely to apply everywhere in the world for a long
period of time. When Amazon ships items to customers, they come from a warehouse
where inventory is stored. Inventory is processed as follows. A truck arrives
carrying books, consumer electronics, toys, kitchen goods, sporting goods, clothes,
and shoes. Amazon receives the items, which basically means opening each carton,
taking out the items, and making sure that they are in good shape. The items are
then stowed on a shelf, waiting for the customer orders that will eventually ship
them out. The places where errors can occur in this process include misidentifying
the item at receiving.
Amazon Case Study

 For example, Amazon might have black shoes that someone has mistakenly
identified as blue shoes. A worker could place the item into the wrong bin or
pull the wrong item from the shelf, and there are a couple of other, smaller
ways to make mistakes. Amazon is trying to minimize defects to customers,
meaning minimize the chance that a customer receives the wrong item or
experiences a delay because the last item in stock is in the wrong place. At
the same time, it is trying to reduce the cost of dealing with those kinds of
defects, so the goal is to improve quality and lower costs. The best way to do
that is to have as few defects as possible in the inventory. Years ago, long
before Amazon.com, retailers learned that inventory accuracy matters in stores
and in warehouses, and they became accustomed to annual inventory counts.
Often, stores would close for a day or two, or sometimes a week. Warehouses
would do the same: people would go out into the warehouse, count every item,
and confirm what was where, and then the operation would reopen and start
selling again.
Amazon Case Study

 That is a very expensive process, because the operation has to shut down
for as long as the count takes. It also gives no way of knowing whether the
inventory stays accurate throughout the rest of the year. There is basically
one sample. It is a complete sample, but it is one sample, once a year, and
after that the retailer can only hope that its processes are good enough the
rest of the year. What Amazon has learned to do is to sample the accuracy of
its inventory continuously, to make sure that the inventory is as accurate as
it can afford. The idea behind sampling is that it might not be possible to
learn the true value of a statistic of interest in the population. Amazon has
many warehouses that house its inventory, and going through all of it would be
very time consuming. In that situation, sampling means picking a subset of the
items in inventory at random and checking whether they have those defects. It
is a lower-cost way to learn the rate at which the statistic of interest
occurs in the population.
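
The sampling idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the inventory is simulated, the 2% defect rate is a made-up figure, and the function name estimate_defect_rate is hypothetical. It is not a description of Amazon's actual process.

```python
import random


def estimate_defect_rate(inventory, sample_size, rng):
    """Estimate the share of defective items from a simple random sample."""
    sample = rng.sample(inventory, sample_size)  # draw without replacement
    return sum(sample) / sample_size


rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical inventory: True marks an item with a defect (e.g. wrong bin).
ASSUMED_DEFECT_RATE = 0.02  # made-up figure, for illustration only
inventory = [rng.random() < ASSUMED_DEFECT_RATE for _ in range(100_000)]

# A full count looks at every item; the sample looks at 2,000 of 100,000.
population_rate = sum(inventory) / len(inventory)
estimated_rate = estimate_defect_rate(inventory, 2_000, rng)

print(f"full count:      {population_rate:.4f}")
print(f"sample estimate: {estimated_rate:.4f}")
```

The estimate comes from checking only 2% of the items, yet it should land close to the rate a full count would give.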
Sample vs. Population

 In the previous module, we learned about descriptive statistics. The
numerical properties of a population are called parameters and those of a
sample are called statistics. A statistic is an estimate of the true value of
a parameter. If a sample is sufficiently large and is representative of the
population, the sample statistics should be reasonably good estimates of the
population parameters.
Sample vs. Population

 To differentiate between population and sample measures, we use the Greek
alphabet for population parameters, and the Latin alphabet for sample
statistics. The symbols for the mean and standard deviation are summarized in
the table below.

Measure               Population parameter    Sample statistic
Mean                  μ                       x̄
Standard deviation    σ                       s
Important Pointers

 What happens to the sample mean and standard deviation as you take
new samples of equal size?

 Since each sample is randomly selected, the mean and standard

deviation vary from one sample to the next. However, since the sample
size is fairly large, each sample’s mean and standard deviation are fairly
close to the population mean and standard deviation. We’ll learn more
about how to select a good sample later.
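
A small simulation can make this concrete. The sketch below assumes a made-up population (100,000 values with mean about 50 and standard deviation about 10) and draws five samples of 1,000. None of these numbers come from the slides.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical population, for illustration only.
population = [rng.gauss(50, 10) for _ in range(100_000)]
pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)

# Draw five samples of equal size and record each mean and standard deviation.
results = []
for _ in range(5):
    sample = rng.sample(population, 1_000)
    results.append((statistics.fmean(sample), statistics.stdev(sample)))

print(f"population: mean={pop_mean:.2f} sd={pop_sd:.2f}")
for i, (mean, sd) in enumerate(results, start=1):
    print(f"sample {i}:   mean={mean:.2f} sd={sd:.2f}")
```

Each run of the loop gives slightly different numbers, but all of them should sit near the population values.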
Selection of Random Sample

 In some cases, selecting a random sample is quite straightforward. If we
have a list of all members of a population in a database, we can use a
computer to assign a random number to each member and draw a sample from the
list. This process makes sure that each member (that is, each element of the
population) has an equal likelihood of being selected, which guards against
selection bias and makes the sample likely to be representative of the
population.
 In Excel, the RAND function returns a random number between 0 and 1. We
can scale and shift it to generate random numbers between any two specified
values. For example, if we wanted to generate random numbers between 0 and 10
we would multiply the function by 10 and enter =RAND()*10. If we wanted
numbers between 5 and 15, we would enter =5+RAND()*10.
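
The same two steps can be sketched in Python, where random() plays the role of Excel's RAND. The member list below is hypothetical, and sorting by a random key is just one simple way to draw the sample.

```python
import random

rng = random.Random(2024)  # fixed seed so the sketch is repeatable

# Scaling and shifting a 0-to-1 random number, as with RAND in Excel.
between_0_and_10 = rng.random() * 10       # like =RAND()*10
between_5_and_15 = 5 + rng.random() * 10   # like =5+RAND()*10

# Hypothetical population list; in practice this would come from a database.
members = [f"member_{i}" for i in range(1, 21)]

# Assign a random number to each member, sort by it, and keep the first five.
keyed = [(rng.random(), member) for member in members]
keyed.sort()
sample = [member for _, member in keyed[:5]]

print(between_0_and_10, between_5_and_15)
print(sample)
```

Because every member receives its random number in the same way, each one has the same chance of ending up among the first five.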
Case Study

 Suppose a college has asked you to conduct a survey to determine the
percentage of 8:00 AM classes that were full on a given morning. The college
has three classroom buildings, each containing two lecture halls. Each lecture
hall has a capacity of 100 students. You randomly choose one of the three
buildings, and stand outside the entrance when classes let out. You ask the
first 60 students leaving the building how full their class was. However, you
soon realize that this sample is not random because you went to only one of
the buildings and the classes at that building may not be representative of
all 8:00 AM classes. Moreover, since the students you surveyed were the first
to exit the building, it's also quite possible that they all came from the
same class!
 Realizing that your survey approach would not produce a random and
representative sample, you gather some friends to help sample. You place one
surveyor outside each building. You each randomly select 20 students leaving
the buildings that morning and tally the results: 5 people decline to
participate, 35 tell you that their class was full, and 20 tell you that their
class was not full. Is this sample representative of all 8:00 AM classes?
Explanation

 This question is a bit tricky. This sample still may not be representative of all
classes because there is a bias in the approach. When you sample
students leaving each of the buildings, you will, on average, select more
people from full classes, simply because there were more people in those
classes. Imagine that of the 6 classes that took place that morning, 4 were
full (each having 100 students) and 2 had only 40 students each. In this
case, most of the students, 400 of the total 480, were in full classes. Your
sample would include more students from the full classes and therefore is
not representative of all classes that took place that morning.
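
The arithmetic behind this bias can be checked with a short Python sketch. It uses the class sizes imagined above (four classes of 100 and two of 40). The variable names are illustrative.

```python
from fractions import Fraction

CAPACITY = 100
class_sizes = [100, 100, 100, 100, 40, 40]  # the six classes imagined above

# Share of CLASSES that were full: this is what the college asked for.
full_classes = sum(1 for size in class_sizes if size == CAPACITY)
share_of_classes_full = Fraction(full_classes, len(class_sizes))

# Share of STUDENTS who sat in a full class: this is what the survey measures.
students_in_full = sum(size for size in class_sizes if size == CAPACITY)
share_of_students_in_full = Fraction(students_in_full, sum(class_sizes))

print(f"classes that were full:   {float(share_of_classes_full):.1%}")
print(f"students in full classes: {float(share_of_students_in_full):.1%}")
```

Four of six classes were full (about 67%), but 400 of 480 students (about 83%) came from a full class, so sampling students overstates the share of full classes.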
Sample Size

 In addition to deciding how to select a random sample, we also must

determine how large the sample should be. The appropriate sample size
depends on how accurate we want our estimates of the population
parameters to be. Suppose we want to sample from two populations—the
first population comprises 5,000 observations and the second population
comprises 5 million.
 If we take a sample of size 1,000 from the first population, how many times
larger does the sample need to be from the second population to ensure
the same level of accuracy?
Explanation

 We might expect that for a larger population, a larger sample size is

needed to achieve a given level of accuracy, but this is not necessarily
true. A sample of 1,000 is often a satisfactory representation of a
population numbering in the millions, as long as the sample is randomly
selected and representative of the entire population.
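
One way to see why population size matters so little is to compare standard errors. The sketch below is not from the slides. It assumes simple random sampling without replacement, a proportion of 0.5, and the textbook standard error of a proportion with the finite population correction.

```python
import math


def standard_error(p, n, population_size):
    """Standard error of a sample proportion, with finite population correction."""
    fpc = math.sqrt((population_size - n) / (population_size - 1))
    return math.sqrt(p * (1 - p) / n) * fpc


# Same sample size (1,000) drawn from the two populations in the question.
se_small = standard_error(0.5, 1_000, 5_000)
se_large = standard_error(0.5, 1_000, 5_000_000)

print(f"population of 5,000:     {se_small:.4f}")
print(f"population of 5,000,000: {se_large:.4f}")
```

Under these assumptions both standard errors are around 1.5 percentage points, even though the second population is a thousand times larger.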
Sample Size

 The graphic below suggests the general relationship

between accuracy and sample size. Later in this module,
we will learn how to calculate the minimum required
sample size to ensure a specified level of accuracy.
 Although we don’t necessarily have to increase the
sample size for larger populations, we may need a larger
sample size when we are trying to detect something very
rare. For example, if we are trying to estimate the
incidence of a rare disease, we may need a larger
sample simply to ensure that some people afflicted with
the disease are included in the sample.
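
The rare-event point can be illustrated with a short calculation. The sketch assumes each sampled person independently has the disease with the same probability, and the prevalence of 1 in 1,000 is a made-up figure.

```python
def chance_of_at_least_one_case(prevalence, sample_size):
    """Probability that a random sample contains at least one affected person."""
    return 1 - (1 - prevalence) ** sample_size


ASSUMED_PREVALENCE = 0.001  # hypothetical: 1 person in 1,000

small_sample = chance_of_at_least_one_case(ASSUMED_PREVALENCE, 100)
large_sample = chance_of_at_least_one_case(ASSUMED_PREVALENCE, 5_000)

print(f"sample of 100:   {small_sample:.3f}")
print(f"sample of 5,000: {large_sample:.3f}")
```

With these numbers, a sample of 100 usually contains no cases at all, while a sample of 5,000 almost always contains at least one.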
Avoiding Bias

 A common way to collect information about a population is to conduct a

survey. That is, a researcher asks questions of a randomly selected sample
from the population. Conducting a survey raises problems that can be
surprisingly tricky to resolve. Consider how we phrase our questions. Is there
bias in the phrasing that might lead participants to answer the questions in
a certain way? Are any questions worded ambiguously? If some of the
people in the sample interpret a question one way, and others interpret it
differently, the survey’s results will be meaningless.
Questions

 Suppose you are an aspiring politician thinking about running for local
office. You decide to conduct a survey to get a sense of whether you
actually have a chance of winning. Which method would you use?

 In-person
 Mail
 Phone
 E-mail
 Text
 Social Media
Avoiding Bias

 Surveyors wish to get as high a response rate as possible. Low response
rates can introduce bias if the non-respondents' answers would have
differed from those who responded—that is, if the non-respondents and
the respondents represent different segments of the population. If we do
not represent a segment of the population, then our sample is not
representative of the population. If resources are limited, it is often better
to take a small sample and relentlessly pursue a high response rate than to
take a larger sample and settle for a low response rate. If we have a low
response rate, we should contact non-respondents and try to either
increase the response rate or demonstrate that the non-respondents’
answers do not differ from the respondents’ answers.
Normal Distribution

 After we obtain a sample, we will analyze the sample to draw inferences
about the greater population. To understand how to do this, it is helpful to
understand the basic characteristics of a common probability distribution
known as the Normal Distribution.
 Because the normal distribution is a continuous probability distribution, the
probability that a normally distributed variable equals any particular value
is zero (this is why we only assess the probability of a range for a
continuous distribution). Because of this, we can use the terms “less than”
and “less than or equal to” interchangeably when calculating probabilities for
continuous distributions.
Excel Formulae

 To find a cumulative probability, the probability of being less than a specified

value on a normal curve, we use Excel’s NORM.DIST function.
 =NORM.DIST(x, mean, standard_dev, cumulative)
 x is the value at which you want to evaluate the distribution function.
 mean is the mean of the distribution.
 standard_dev is the standard deviation of the distribution.
 cumulative is an argument that specifies the type of probability we wish to
calculate. We insert “TRUE” to indicate that we wish to find the cumulative
probability, that is, the probability of being less than or equal to the x-value.
(Inserting the value “FALSE” provides the height of the normal distribution at the
value x, which we will not cover in this course.)
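
For readers working outside Excel, a rough Python equivalent is the NormalDist class in the standard statistics module. The mean of 100 and standard deviation of 15 below are example values chosen for illustration.

```python
from statistics import NormalDist

# Example distribution; the mean and standard deviation are assumptions.
dist = NormalDist(mu=100, sigma=15)

# Like =NORM.DIST(115, 100, 15, TRUE): probability of a value <= 115.
cumulative = dist.cdf(115)

# Like =NORM.DIST(115, 100, 15, FALSE): height of the curve at 115.
height = dist.pdf(115)

print(f"P(X <= 115) = {cumulative:.4f}")
print(f"curve height at 115 = {height:.5f}")
```

Since 115 is one standard deviation above the mean, the cumulative probability comes out at about 0.84.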
Excel Formulae

 For a normal distribution, we can use Excel’s NORM.INV function to

calculate a given percentile. The “INV” indicates that this function
calculates the inverse of the cumulative probability.
 =NORM.INV(probability, mean, standard_dev)
 probability is the cumulative probability for which we want to know the
corresponding x-value on a normal distribution.
 mean is the mean of the distribution.
 standard_dev is the standard deviation of the distribution.
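
The inverse calculation can be sketched the same way with NormalDist.inv_cdf from Python's statistics module. Again, the mean of 100 and standard deviation of 15 are example values only.

```python
from statistics import NormalDist

# Example distribution; the mean and standard deviation are assumptions.
dist = NormalDist(mu=100, sigma=15)

# Like =NORM.INV(0.90, 100, 15): the x-value at the 90th percentile.
p90 = dist.inv_cdf(0.90)

# Feeding the result back into the cumulative function recovers 0.90.
round_trip = dist.cdf(p90)

print(f"90th percentile = {p90:.2f}")
print(f"P(X <= {p90:.2f}) = {round_trip:.2f}")
```

The round trip shows that the percentile function and the cumulative probability function undo each other.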
Central Limit Theorem

 The Central Limit Theorem is one of the most subtle statistical concepts,
and is worth understanding because it gives us much deeper insight into
how sampling actually works. The Central Limit Theorem says that if we
take many random samples from a population and plot the means of
each sample, then assuming the samples we take are sufficiently large,
the resulting plot of the sample means will look normally distributed.
Furthermore, if we take enough of these samples, the average of the
sample means will be very close to the true mean of the population. Thus, we
show this graph, called the distribution of sample means, as a normal curve
centered at the true population mean.
Central Limit Theorem

 How does the distribution of sample means differ from the distribution of the
population? The most important difference is that the two distributions have
different standard deviations. The width of the distribution of sample means
depends on the sample size: its standard deviation is the population standard
deviation divided by the square root of the sample size. Larger samples
therefore result in a narrower distribution of sample means. This should
reinforce our intuition, because we know that the larger the sample size, the
more accurately the sample mean approximates the population mean. Thus, for
larger samples, the resulting distribution of sample means will be more closely
clustered around the population mean. One of the most amazing findings about
the Central Limit Theorem is that no matter what type of distribution the
population has, uniform, skewed, bimodal, or completely bizarre, if we take
enough sufficiently large samples, then the means of those samples will form
an approximately normal distribution centered at the true population mean.
Let's walk through this step by step.
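
The effect of sample size on the width of the distribution of sample means can be checked with a simulation. The sketch assumes a made-up, strongly skewed population and compares samples of size 10 and 100. The helper name sample_means is illustrative.

```python
import random
import statistics


def sample_means(population, sample_size, num_samples, rng):
    """Means of many random samples of the same size."""
    return [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(num_samples)
    ]


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical skewed population (exponential), for illustration only.
population = [rng.expovariate(1.0) for _ in range(50_000)]
sigma = statistics.pstdev(population)

# Spread of the sample means for small and large samples.
spread_small = statistics.stdev(sample_means(population, 10, 2_000, rng))
spread_large = statistics.stdev(sample_means(population, 100, 2_000, rng))

print(f"n=10:  spread={spread_small:.3f}  sigma/sqrt(n)={sigma / 10 ** 0.5:.3f}")
print(f"n=100: spread={spread_large:.3f}  sigma/sqrt(n)={sigma / 100 ** 0.5:.3f}")
```

In this simulation the sample means for n=100 cluster far more tightly than those for n=10, and both spreads track the population standard deviation divided by the square root of the sample size.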
Central Limit Theorem

 If we have a population, any population, we can take a random sample from that
population. That sample has a mean. We can plot that mean on a graph. Then we can
take another sample. That sample also has a mean, which we also plot on the graph.
Now, if we plot a lot of sample means in this way, they will start to form a normal
distribution around the population mean. The more samples we take, the more the graph
of the sample means will look like a normal distribution. Eventually we would form the
distribution of sample means. Now, no one would actually take a lot of samples,
calculate all the sample means, and then construct a normal distribution with them. In the
real world, we take a single sample and squeeze it for all the information it's worth. The
Central Limit Theorem is a powerful tool for sampling and estimation, because it allows us
to ignore the underlying distribution of the population we want to learn about. We now
know that the mean of a sample is part of a normal distribution. Specifically, we know
that the sample mean falls somewhere in a normal distribution that is centered at the true
population mean. Because of this, we can completely disregard the underlying
distribution of the population and focus only on the sample.
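
The walk-through above can be simulated in Python. The sketch assumes a made-up skewed population and takes 5,000 samples of size 50, something we would never do in practice but which lets us check the claim. If the sample means are roughly normal, about 95% of them should fall within 1.96 standard errors of the population mean.

```python
import math
import random
import statistics

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical skewed population (cubed uniform values), for illustration only.
population = [rng.uniform(0, 1) ** 3 for _ in range(50_000)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

n = 50
standard_error = sigma / math.sqrt(n)

# Take a sample, record its mean, and repeat many times.
means = [statistics.fmean(rng.sample(population, n)) for _ in range(5_000)]

mean_of_means = statistics.fmean(means)
within = sum(abs(m - mu) <= 1.96 * standard_error for m in means) / len(means)

print(f"population mean:      {mu:.4f}")
print(f"mean of sample means: {mean_of_means:.4f}")
print(f"share within 1.96 SE: {within:.3f}")
```

Even though the population itself is far from normal, the sample means center on the population mean and behave like draws from a normal curve.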
Questions ???
