System Simulation
Dr. Dessouky
Description
Simulation is a very powerful and widely used management
science technique for the analysis and study of complex systems.
Simulation may be defined as a technique that imitates the
operation of a real-world system as it evolves over time. This is
normally done by developing a simulation model. A simulation
model usually takes the form of a set of assumptions about the
operation of the system, expressed as mathematical or logical
relations between the objects of interest in the system.
Simulation has its advantages and disadvantages. We will focus
our attention on simulation models and the simulation technique.
Simulation
 What is simulation:
The process of designing a mathematical
or logical model of a real system and then
conducting computer-based experiments
with the model to describe, explain, and
predict the behavior of the real system.
Simulation
 Where simulation fits in
Programming
Simulation
Analysis
Modeling
Probability &
Statistics
Basic Terminology
 In most simulation studies, we are concerned with
the simulation of some system.
 Thus, in order to model a system, we must
understand the concept of a system.
 Definition: A system is a collection of entities that
act and interact toward the accomplishment of some
logical end.
 Systems generally tend to be dynamic: their status
changes over time. To describe this status, we use
the concept of the state of a system.
Example Simulation Model
Ford - # of panels per day (throughput)
Emergency Room - (beds, doctors, nurses), (minor, moderate, major, critical)
TRW - Ballistic Missile Survivability against Soviet Threat
Paramount Farms - Pistachio
Miami University Parking
HMT - Disks Throughput
Christopher Ranch - Garlic Capacity
Power Integration - Semiconductor Capacity and random machine down times
Value of Simulation
 Empirical method versus mathematical
model
 Allows you to estimate the extreme values,
not just the expected value
Simulation
 What is simulation
 Simulation is the actual running of the
model system to gain insight into its
performance.
Simulation
 Why use simulation
 Simulation is used to better understand the
expected performance of the real system
and to test the effectiveness of the system
design.
Simulation
 Why use simulation
 Without building them
 experimental system
 new concepts
 Without disturbing them
 costly experimentation
 unsafe experimentation.
 Without destroying them
 Determine limits of stress
Queuing systems
 Performance measures (output)
 Data requirements (input)
 Uses of model
 Kendall's notation
Queuing systems
 System performance measures (outputs)
Expected number of customers in system
Expected number of customers in queue
Expected time in system
Expected time in queue
Server utilization
Probability of n customers in system
Throughput
Queuing systems
 Data requirements (inputs)
Interarrival time distribution
Service time distribution
Number of servers
Queue discipline
System capacity
Size of input population
Kendall's notation (M/M/s/FCFS/K/M)
Alternatives to simulation
Simulation
Analytic models
Physical experimentation
Visit other sites
Simulation vs. analytic modeling
 Advantage:
various performance measures
greater realism
easier to understand
model the steady-state as well as the transient
behavior.
 Disadvantage:
 May not provide you with the optimal solution
 time to construct model will be longer.
Simulation vs. Physical
 Advantage:
High Speed
Not disruptive
Replication easy
Control variations
Generally less costly
 Disadvantage:
 Realism
 Validity
Simulation vs. Alternatives
[Figure: realism and cost compared for simulation and its alternatives]
Representing system
 System:
 a collection of mutually interacting objects
designed to accomplish a goal (machines
repair system)
 Entities:
 denote the elements/objects within the boundary of
the system (machines, operators, repairman)
 Entity - the object on which work is performed
 Resource - the object performing the work
Representing system
 Attribute:
 Characteristic or property of an entity
(machine ID, type of breakdown, time that
machine went down)
 Activity:
 transforms the state of an object usually over
some time (repairman service time, machine
run time)
Representing system
 State of the system:
 Numeric values that contain all the
information necessary to describe the system
at any time.
 Delays:
 Processes that take a conditional length of
time in the system
Representing system
 Events:
 Change the state of the system(end of service
of machine,machine breaks down)
 Queue:
 A set used to model waiting
Ex. Elevator systems
 Entities
 Elevators, people
 Sets
 People waiting at each floor
 Attributes
 Elevators - capacity, speed, destination,
current location of each elevator
 People - inter-arrival time at each floor,
destination of each person
Ex. Elevator systems
 State of system:
 # of people on each elevator
 # of people waiting on each floor
 Activities
 Load/Unloading passenger
 Travel to next floor (speed and distance)
 Persons travel to elevator
Ex. Elevator systems
 Delays:
 Persons waiting for elevator
 Events:
Elevator arrival
End unloading
End Loading
Person Arrival
Static Simulation vs. Dynamic
Simulation
 There are two types of simulation models,
static and dynamic.
 Definition: A static simulation model is a
representation of a system at a particular
point in time.
 We usually refer to a static simulation as a
Monte Carlo simulation.
Static Simulation vs. Dynamic
Simulation
 Definition: A dynamic simulation is a
representation of a system as it evolves
over time.
 Within these two classifications, a
simulation may be deterministic or
stochastic.
 A deterministic simulation model is one
that contains no random variables; a
stochastic simulation model contains one
or more random variables.
Discrete Event vs. Continuous
Event Simulation
 Discrete event:
 state of system changes only at discrete points
in time(events)
 ex. Machine repair problem
 Programming
 Look at system only when events occur; time is
advanced from event to event.
Discrete Event vs. Continuous
Event Simulation
 Continuous event:
 state of system changes continuously over
time
 Ex. Level of fluid in tank
 Programming:
 Advances time in small intervals. Use differential
equations to represent flows.
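As a rough illustration of the fixed-step approach for a continuous model, the sketch below advances a tank level with Euler steps. The flow equation (constant inflow, outflow proportional to level) and all parameter names are assumptions chosen for the example, not part of the lecture.

```python
def simulate_tank(level0, inflow, k, dt, t_end):
    """Euler integration of d(level)/dt = inflow - k * level (assumed model)."""
    level = level0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        # Advance time by one small interval dt and update the state.
        level += dt * (inflow - k * level)
    return level


# With these hypothetical values the level settles near inflow / k.
final_level = simulate_tank(level0=0.0, inflow=2.0, k=0.5, dt=0.01, t_end=50.0)
print(round(final_level, 4))
```

A smaller dt gives a closer approximation to the differential equation at the cost of more steps.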
An Example of a Discrete-Event
Simulation
 To simulate a queuing system, we first have to
describe it.
 We assume arrivals are drawn from an infinite
calling population.
 There is unlimited waiting room capacity, and
customers will be served in the order of their arrival
(FCFS).
 Arrivals occur one at a time in a random fashion.
 All arrivals are eventually served with the
distribution of service times as shown in the book.
 Service times are also assumed to be random.
After service, all customers return to the calling
population.
 For this example, we use the following variables
to define the state of the system: (1) the number of
customers in the system; (2) the status of the
server, that is, whether the server is busy or idle;
and (3) the time of the next arrival.
 An event is defined as a situation that causes the
state of the system to change instantaneously.
 All the information about them is maintained in a
list called the event list.
 Time in a simulation is maintained using a variable
called the clock time.
 We begin this simulation with an empty system and
arbitrarily assume that our first event, an arrival,
takes place at clock time 0.
 Next we schedule the departure time of the first
customer.
Departure time = clock time now + generated service time
 Also, we now schedule the next arrival into the system
by randomly generating an interarrival time from the
interarrival time distribution and setting the arrival time
as
Arrival time = clock time now + generated interarrival time
 Both these events and their scheduled times are
maintained on the event list.
 This approach of simulation is called the next-event
time-advance mechanism, because of the way the
clock time is updated. We advance the simulation clock
to the time of the most imminent event.
 As we move from event to event, we carry out the
appropriate actions for each event, including any
scheduling of future events.
 The jump to the next event in the next-event mechanism
may be a large one or a small one; that is, the jumps in
this method are variable in size.
 We contrast this approach with the fixed-increment
time-advance method.
 With this method, we advance the simulation clock in
increments of $\Delta t$ time units, where $\Delta t$ is some
appropriate time unit, usually 1 time unit.
 For most models, however, the next event
mechanism tends to be more efficient
computationally.
 Consequently, we use only the next-event
approach for the development of the models for
the rest of the chapter.
 To demonstrate the simulation model, we need to
define several variables:
 TM = clock time of the simulation
 AT = scheduled time of the next arrival
DT = scheduled time of the next departure
SS = status of the server (1=busy, 0=idle)
WL = length of the waiting line
MX = length (in time units) of a simulation run
 We now begin the simulation by initializing
all the variables. This simple example
illustrates some of the basic concepts in
simulation and the way in which simulation
can be used to analyze a particular problem.
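The next-event time-advance mechanism described above can be sketched as follows, reusing the variable names TM, AT, DT, SS, WL, and MX. The exponential interarrival and service times are an assumption made for illustration (the lecture takes its distributions from the book), and the function name and returned statistics are hypothetical.

```python
import random


def simulate_single_server(mx, mean_interarrival, mean_service, seed=1):
    """Next-event simulation of a single-server FCFS queue (illustrative sketch)."""
    rng = random.Random(seed)
    tm = 0.0             # TM: clock time of the simulation
    at = 0.0             # AT: next arrival; the first arrival occurs at time 0
    dt = float("inf")    # DT: next departure; none scheduled while idle
    ss = 0               # SS: server status (1 = busy, 0 = idle)
    wl = 0               # WL: length of the waiting line
    served = 0
    area_wl = 0.0        # time integral of WL, for the time-average queue length

    while min(at, dt) <= mx:             # MX: length of the simulation run
        next_time = min(at, dt)          # most imminent event on the event list
        area_wl += wl * (next_time - tm)
        tm = next_time                   # advance the clock to that event
        if at <= dt:                     # arrival event
            if ss == 0:
                ss = 1
                dt = tm + rng.expovariate(1.0 / mean_service)
            else:
                wl += 1
            at = tm + rng.expovariate(1.0 / mean_interarrival)
        else:                            # departure event
            served += 1
            if wl > 0:
                wl -= 1
                dt = tm + rng.expovariate(1.0 / mean_service)
            else:
                ss = 0
                dt = float("inf")

    area_wl += wl * (mx - tm)            # account for the time after the last event
    return {"served": served, "avg_waiting_line": area_wl / mx, "SS": ss, "WL": wl}


print(simulate_single_server(mx=1000.0, mean_interarrival=1.0, mean_service=0.5))
```

Because the clock jumps directly from event to event, the number of loop iterations depends on the number of events, not on the length of the run.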
World View - The structure, concepts, and views
that guide the development of the simulation
model
 Event Orientation - defines the changes in state that
occur at each event time
 Process Orientation - describes the process through
which the entities in the system flow
 Activity Scanning Orientation - describes the activities in
which the entities in the system engage
Discrete Event Simulation
 Event scheduling
 Write modules that describe changes in the
state of the system at each event
 Main program advances time
 One subprogram for each event
 General purpose programming language
Discrete Event Simulation
 Process interaction
 Write modules that describe the progress of
entities through the system
 As entities move, the system changes state
 Entities are held to represent activities and
delays
 Promodel programming language
Event scheduling
 Time is advanced from event to event
 Future events list - ordered list of
upcoming events
 As events are scheduled, they are added to the
list
 As events occur they are removed from list
 Activities in event ( one / event type)
Event scheduling
 List is required to keep track of entities in
a set
 Statistics - two types
 Sample statistics - average of some values
(W)
 $W = (W_1 + W_2 + \dots + W_n)/n$ = Total wait / # of waits
 Time-average statistics - time weighted (L)
 $L = (0(t_1) + 1(t_2 - t_1) + 2(t_3 - t_2) + 1(t_4 - t_3)) / t_4$
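The difference between the two kinds of statistic can be seen in a small sketch. The function names and the numeric change times below are made up for illustration; the levels 0, 1, 2, 1 mirror the pattern in the time-weighted formula above.

```python
def sample_average(values):
    """Sample statistic: plain average of the observed values (e.g. waits W_i)."""
    return sum(values) / len(values)


def time_average(change_times, levels):
    """Time-average statistic: levels[i] holds from change_times[i] to change_times[i + 1]."""
    weighted = sum(
        level * (t_next - t_prev)
        for level, t_prev, t_next in zip(levels, change_times, change_times[1:])
    )
    return weighted / (change_times[-1] - change_times[0])


# Queue length is 0 on [0, 2), 1 on [2, 5), 2 on [5, 6), 1 on [6, 10].
print(time_average([0, 2, 5, 6, 10], [0, 1, 2, 1]))  # weights each level by its duration
print(sample_average([0, 1, 2, 1]))                   # ignores how long each level lasted
```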
Activity scanning
 Activity scanning
 Time is modeled in fixed time increments to
check if activity occurred
 Small time increments are inefficient
 Large time increments may miss activity
 describes the activities in which the entities in
the system engage.
Process Oriented
 Process oriented:
 Many simulation models include elements
which occur in defined patterns
 The logic associated with such a system or
events can be generalized and defined by a
single statement
 A simulation language could then translate
such statement into the appropriate sequence
of events
 describes the processes through which the
entities in the system flow.
Process Oriented
 Process oriented:
 These statements define a sequence of events
which are automatically executed by the
simulation language as the entities move
through the process
 Create arrival entities every t time units
 However, since we are normally restricted to a
set of standardized statements provided by the
simulation language, our model flexibility is
not as great as with the event orientation
Feature provided by a language
 Conceptual framework(entities, attributes,
resource, queues)
 Maintenance of event list
 Random variable generation
 Animation
 Debugging function
 Output analysis
 Input analysis
 Report generation
Simulation Languages
 One of the most important aspects of a simulation
study is the computer programming.
 Several special-purpose computer simulation
languages have been developed to simplify
programming.
 The best known and most readily available
simulation languages include GPSS, GASP IV,
and SLAM.
 Most simulation languages use one of two different
modeling approaches or orientations; event
scheduling or process interaction.
 GPSS uses the process-interaction approach.
 SLAM allows the modeler to use either approach
or even a mixture of the two, whichever is the
most appropriate for the model being analyzed.
 Of the general-purpose languages, FORTRAN is
the most commonly used in simulation.
 In fact, several simulation languages, including
GASP IV and SLAM, use a FORTRAN base.
 To use GASP IV we must provide a main program,
an initialization routine, and the event routines.
 For the rest of the program, we use the GASP
routines.
 Because of these prewritten routines, GASP IV
provides a great deal of programming flexibility.
 GPSS, in contrast to GASP, is a highly structured
special-purpose language.
 GPSS does not require writing a program in the
usual sense.
 Building a GPSS model then consists of
combining sets of blocks into a flow
diagram so that it represents the path an
entity takes as it passes through the system.
 SLAM was developed by Pritsker and
Pegden (1979). It allows us to develop
simulation models as network models,
discrete-event models, continuous models, or
any combination of these.
 The decision of which language to use is one of
the most important that a modeler or an analyst
must make in performing a simulation study.
 Simulation languages offer several advantages.
 The most important of these is that the special-purpose languages provide a natural framework
for simulation modeling and most of the features
needed in programming a simulation model.
The Simulation Modeling Steps
 We now discuss the process for a complete
simulation study and present a systematic
approach of carrying out a simulation.
 A simulation study normally consists of several
distinct stages. (See Figure in the book)
 However, not all simulation studies consist of all
these stages or follow the order stated here.
 On the other hand, there may even be considerable
overlap between some of these stages.
Problem/Model Formulation
 State the objective of the study.
 Identify the Problem. Determine any underlying
causes if possible.
 Determine the input variables.
 Controllable Variables.
 Uncontrollable Variables.
 Make assumptions / boundaries that were used to
simplify the model.
 Determine Performance measures used to
measure the objective. (Output)
Data collection/acquisition
 Determine the Data Collection System or
Estimates to be used.
Observe the system
Historical or Similar Systems
Theoretical Estimates
Engineering Estimates
Operator Estimates
Vendor Estimates
 Identify the data collected.
 How it was collected.
 How it was represented in the model.
Model Construction or
Development
 Identify The Real System
 Determine Conceptual Model -Activities
and Events
 Develop the Logical Model.
 Identify the Programming Language used.
 Computer Implementation (Promodel,
Arena, Slam Systems).
Model Construction or
Development
 Modeling Tips
Art vs. Science
Over Simplification vs. Unnecessary Detail
Start Simple
Add stronger assumptions
Model Verification and
Validation
 Verification: Determining whether
simulation model works as intended.
 Verifying the Model.
Structure: Walk Through of the Model
Debugger.
Trace = print or writing in process calculations.
Animation.
Model testing
Analytical Model.
Model Verification and
Validation
 Verification.
 Logical Model.
 Are events represented correctly?
 Are mathematical formulas and relationships
correct?
 Are statistical measures formulated correctly?
 Computer Model/Simulation Model.
 Does the code contain all aspects of the logical
model?
 Are the statistics and formulas calculated correctly?
 Does the model contain coding errors?
Model Verification and
Validation
 Validation: Determining whether the simulation
model is a credible representation
of the real system.
Compare the model with the actual system
by performing statistical tests (t-test and
confidence interval).
 Conceptual Model.
 Does the model contain all relevant elements,
events and relationships?
 Will the model answer the questions of concern?
Model Verification and
Validation
 Logical Model.
 Does the model contain all events included in the
conceptual model?
 Does the model contain all the relationships of the
conceptual model?
 Computer Model/Simulation Model.
 Is the computer model a valid representation of the
real system?
 Can the computer model duplicate the performance
of the real system?
 Does the computer model output have credibility
with system experts and decision makers?
Experimentation and Analysis of
Results
 Experimentation - The execution of the
simulation model to obtain output values
 Analysis of Results - The process of analyzing
the simulation outputs to draw inferences and
make recommendations for problem resolution
Implementation and Documentation
 The process of implementing decisions
resulting from the simulation and
documenting the model and its use.
Manual Simulation Example
 Given the following arrival times for a single-server
system, what will be the average number
in the queue, average number in the system,
average time in system, average time in queue,
the number of completed jobs, number in the
queue, number in the system, and server
utilization at time 15 if the service time is 3 time
units for each entity?
 1, 3, 5, 9,13,15,17
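One way to check a hand trace of this example is the short sketch below. It makes several assumptions that the slide does not state: the clock starts at 0, service is FCFS, an entity arriving exactly at time 15 counts as present, and the average times are taken over jobs completed by time 15. The function and key names are hypothetical.

```python
def manual_single_server(arrivals, service_time, t_end):
    """Trace a FCFS single-server queue with constant service time up to t_end."""
    jobs = []                 # (arrival, service start, departure) per job
    server_free_at = 0
    for arrival in arrivals:
        if arrival > t_end:
            break             # arrivals after the observation time are ignored
        start = max(arrival, server_free_at)
        server_free_at = start + service_time
        jobs.append((arrival, start, server_free_at))

    done = [(a, s, d) for a, s, d in jobs if d <= t_end]
    # Time spent waiting and in service, truncated at t_end.
    queue_area = sum(min(s, t_end) - a for a, s, d in jobs)
    busy_area = sum(min(d, t_end) - min(s, t_end) for a, s, d in jobs)
    n_done = len(done)
    return {
        "completed": n_done,
        "in_queue": sum(1 for a, s, d in jobs if s > t_end),
        "in_system": sum(1 for a, s, d in jobs if d > t_end),
        "avg_in_queue": queue_area / t_end,
        "avg_in_system": (queue_area + busy_area) / t_end,
        "avg_time_in_queue": sum(s - a for a, s, d in done) / n_done if n_done else 0.0,
        "avg_time_in_system": sum(d - a for a, s, d in done) / n_done if n_done else 0.0,
        "utilization": busy_area / t_end,
    }


result = manual_single_server([1, 3, 5, 9, 13, 15, 17], service_time=3, t_end=15)
for name, value in result.items():
    print(name, value)
```

Under different tie-breaking or averaging conventions some of these figures would change, so the output should be compared against the convention used in class.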
Data Collection
 Activities may be represented as
 Constants
 Random variables
 Collection of data
 Design a data collection form
 Record more than single attribute in case you
need to use data in a different way.
 Use several sessions to get representative data
 Use control charts
Data Collection
Machine | Begin Repair | End Repair | Time Elapsed
Data Collection
 Testing data
 Independence
 Randomness
 Homogeneity
Data Collection
 Test of Independence
 Ho: Measure A is independent of measure B
 H1: Measure A is not independent of measure
B.
Inventory and day of week
Data Collection
 Test of Randomness
 H0: $f(x_i \mid x_j) = f(x_i)$ : Independent
 H1: $f(x_i \mid x_j) \ne f(x_i)$ : Dependent
 For example, when simulating a production
process in which the items can be defective or
good, it would be important to know whether
successive items are randomly distributed or
whether runs of good items tend to be followed
by runs of defective items.
Data Collection
 Test of Homogeneity
 Tests for whether multiple sets of data can be
considered as coming from the same statistical
population are generally referred to as tests of
homogeneity; they are distribution free.
 H0: $G(x) = H(x)$
 H1: $G(x) \ne H(x)$
 Example: two different workers working on the same
machine.
Random Variable
 Two types
 Discrete
 Continuous
Random Variable
 Probability mass function
 Discrete
 $P(X = x_i) = p(x_i)$
 $\sum_i p(x_i) = 1$
Random Variable
 Probability density function
 Continuous
 Example: $f(x) = \lambda e^{-\lambda x}, \quad x > 0$
 $P(X = a) = 0$
 $\int_{-\infty}^{\infty} f(x)\,dx = 1$
 $P(a < X < b) = \int_a^b f(x)\,dx$
Random Variable
 Cumulative distribution function (CDF)
 $F(x) = P(X \le x)$
 Discrete: $F(x) = \sum_{x_i \le x} p(x_i)$
 Continuous: $F(x) = \int_{-\infty}^{x} f(t)\,dt$
Random Variable
 Expected value
 $\mu = E(X)$
 Discrete: $\mu = \sum_i x_i\, p(x_i)$
 Continuous: $\mu = \int_{-\infty}^{\infty} x f(x)\,dx$
Random Variable
 Variance
$V(X) = E[(X - \mu)^2]$
$= E[X^2 - 2\mu X + \mu^2]$
$= E[X^2] - (E[X])^2$
$= \sum_i x_i^2\, p(x_i) - \left( \sum_i x_i\, p(x_i) \right)^2$
Random Variable
 Standard deviation
$SD(X) = \sigma = \sqrt{V(X)}$
 Sums of R.V.
$Y = a_1 X_1 + a_2 X_2$
$E(Y) = a_1 E(X_1) + a_2 E(X_2)$
$V(Y) = a_1^2 V(X_1) + a_2^2 V(X_2)$ (for independent $X_1$, $X_2$)
Random Variable
Sample mean: $\bar{X} = \frac{\sum_i X_i}{n}$
Sample variance: $S^2 = \frac{\sum_i (X_i - \bar{X})^2}{n-1} = \frac{\sum_i X_i^2 - n\bar{X}^2}{n-1}$
Poisson Probability Distribution
Consider a discrete r.v. which is often useful
when dealing with the number of occurrences
of an event over a specified interval of time.
Suppose we want to find the probability
distribution of the accidents at the intersection
of Rural and Apache during a one week
period.
The R.V. we are interested in is the number of
accidents.
Poisson Probability Distribution
i. The Poisson distribution provides a good model for the probability
distribution of the number of rare events that occur in space, time,
or volume, where $\lambda$ is the average rate at which events occur.
ii. Definition: A r.v. X is said to have a Poisson distribution if the p.m.f. of
X is
$P(x) = f(x) = \frac{\lambda^x e^{-\lambda}}{x!}, \quad x = 0, 1, \dots$
where $\lambda$ is the rate per unit time or per unit area
iii. $E[X] = \lambda$, $V(X) = \lambda$
Exponential Distribution
Previously, we discussed the Poisson random variable,
which was the number of events occurring in a given
interval. This number was a discrete r.v. and the
probabilities associated with it could be described by the
Poisson Probability Distribution.
Not only is the number of events a r.v., but the waiting
time between event is also a random variable. This r.v. is a
continuous r.v. for it can assume any positive value.
This r.v. is an exponential r.v. which can be described by
the exponential distribution.
Exponential Distribution
i. Pdf: $f(x) = \begin{cases} \lambda e^{-\lambda x}, & x \ge 0,\ \lambda > 0 \\ 0, & \text{otherwise} \end{cases}$
where $\lambda$ = rate at which events occur
ii. Correspondingly,
$F(x) = P(X \le x) = \int_0^x \lambda e^{-\lambda t}\,dt = 1 - e^{-\lambda x}, \quad x \ge 0$
$E[X] = \frac{1}{\lambda}, \quad V(X) = \frac{1}{\lambda^2}$
iii. An important application of the exponential distribution is to
model the distribution of component lifetime. One reason for its
popularity is the memoryless property of the
exponential distribution.
The Uniform Distribution
o The simplest distribution is the one in which a continuous r.v. can assume
any value within an interval [a, b]
 Def:
A continuous r.v. X is said to have a uniform distribution on the
interval [a, b] if the probability density function (pdf) of X is:
$f(x) = \begin{cases} \frac{1}{b-a}, & a \le x \le b \\ 0, & \text{otherwise} \end{cases}$
The Uniform Distribution
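The cumulative distribution is
$F(x) = P(X \le x) = \int_a^x f(t)\,dt = \int_a^x \frac{1}{b-a}\,dt = \frac{x-a}{b-a}, \quad a \le x \le b$
$E[X] = \int_a^b x f(x)\,dx = \int_a^b x \left( \frac{1}{b-a} \right) dx = \frac{a+b}{2}$
$V(X) = \frac{(b-a)^2}{12}$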
The Uniform Distribution
Note:
An important uniform distribution is
that for when a = 0 and b = 1, namely
U(0, 1)
A U(0,1) r.v. can be used to simulate
observation of other random variables
of the discrete and continuous type.
The Triangular Distribution
 Continuous Distribution
$f(x) = \begin{cases} \frac{2(x-a)}{(b-a)(c-a)}, & a \le x \le b \\ \frac{2(c-x)}{(c-b)(c-a)}, & b < x \le c \\ 0, & \text{elsewhere} \end{cases}$
The Triangular Distribution
$F(x) = \begin{cases} 0, & x < a \\ \frac{(x-a)^2}{(b-a)(c-a)}, & a \le x \le b \\ 1 - \frac{(c-x)^2}{(c-b)(c-a)}, & b < x \le c \\ 1, & x > c \end{cases}$
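The Triangular Distribution
$E(X) = \frac{a+b+c}{3}$
$V(X) = \frac{a^2 + b^2 + c^2 - ab - ac - bc}{18}$
Parameter estimates from data $x_1, \dots, x_n$:
$a = \min\{x_1, \dots, x_n\}$
$c = \max\{x_1, \dots, x_n\}$
$b = 3\bar{x} - a - c$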
Normal Distribution
It is a fact that measurements on many random variables will follow a bell-shaped distribution.
Random variables of this type are closely approximated by a Normal
Probability Distribution.
A continuous r.v. X is said to have a normal distribution if the pdf of X is
$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad \sigma > 0,\ -\infty < x < \infty,\ -\infty < \mu < \infty$
The distribution contains 2 parameters ($\mu$ and $\sigma^2$). These are the expected
value and the variance and hence locate the center of the distribution and
measure its spread.
Normal Distribution
The Standard Normal Distribution
To compute $P(a \le X \le b)$ when $X \sim N(\mu, \sigma^2)$, we must evaluate
$\int_a^b f(x)\,dx = \int_a^b \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx$
Note: None of the standard integration techniques can be used
to evaluate this integral. Instead, for $\mu = 0$ and $\sigma^2 = 1$, the integral has
been evaluated numerically and the values tabulated. Using the
table, probabilities for any other values of $\mu$ and $\sigma^2$ can be
determined.
Normal Distribution
The normal distribution for parameter values
$\mu = 0$ and $\sigma^2 = 1$ is called the standard normal
distribution. A r.v. that has a standard normal
distribution is called a standard normal random
variable (denoted by Z). The pdf of Z is:
$f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}$
Normal Distribution
The cumulative distribution of Z is
$P(Z \le z) = \int_{-\infty}^{z} f(y)\,dy$
and is denoted by $\Phi(z)$
Note: The N(0,1) table returns the cumulative
probability up to z, or $\Phi(z)$
Normal Distribution
Non-standard Normal Distribution
The table only provides probabilities for r.v.'s
following the N(0,1) distribution. Thus, when
$X \sim N(\mu, \sigma^2)$ (i.e. not $\mu = 0$, $\sigma^2 = 1$), probabilities
involving X are computed by standardizing
the r.v. to the N(0,1) scale: $Z = (X - \mu)/\sigma$.
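Standardizing can be sketched in a few lines. Here Python's `statistics.NormalDist` stands in for the printed N(0,1) table, and the function name and the numbers in the example are made up for illustration.

```python
from statistics import NormalDist


def normal_interval_probability(a, b, mu, sigma):
    """P(a <= X <= b) for X ~ N(mu, sigma^2), computed on the N(0, 1) scale."""
    standard = NormalDist()          # plays the role of the N(0, 1) table
    z_a = (a - mu) / sigma           # standardize each endpoint
    z_b = (b - mu) / sigma
    return standard.cdf(z_b) - standard.cdf(z_a)


# Example: X ~ N(10, 4), so sigma = 2; the interval is one sigma either side of the mean.
print(round(normal_interval_probability(8, 12, mu=10, sigma=2), 4))
```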
Selecting a Distribution
 Theoretical prior knowledge
 Random arrivals => exponential interarrival times (IAT)
 Sum of a large number of random components => Normal (Central Limit Theorem)
 Compare histogram with probability mass
or probability density function
Data Collection
 Little variability: model as a constant.
 Variability: model as a random variable.
 Empirical vs. theoretical: select a
distribution, estimate the parameters of the
distribution, perform a goodness-of-fit test.
X2 goodness of fit test
 Compare observed versus theoretical
density
 A collection of data can be treated as a sample
from a specified p.d.f.
 H0: the $X_i$ are IID r.v.'s with density f(x)
 H1: the $X_i$ are not IID r.v.'s with density f(x)
X2 goodness of fit test
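 Test statistic
$TS = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$
where $O_i$ and $E_i$ are the observed and expected counts in interval i
 Critical value
 If H0 is true, $TS \sim \chi^2_{k-1-(\text{\# of parameters estimated}),\ \alpha}$
 A large T.S. would cause rejection of H0
 Reject H0 if T.S. > $\chi^2$ critical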
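The chi-square test statistic is usually computed in the Pearson form, the sum over intervals of (observed - expected)^2 / expected. The sketch below assumes that form; the counts are invented, and the critical value still has to be looked up in a chi-square table.

```python
def chi_square_statistic(observed, expected):
    """Pearson chi-square statistic over k intervals (assumed form of the T.S.)."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of intervals")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 100 observations in 5 equal-probability intervals.
observed = [18, 22, 20, 25, 15]
expected = [20, 20, 20, 20, 20]
ts = chi_square_statistic(observed, expected)
print(ts)  # compare against the tabulated critical value with k - 1 - (# parameters) d.f.
```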
X2 goodness of fit test
Issues - the test is an art
Number of intervals > 2
Size of intervals: $E_i$ roughly equal and > 5
Requires relatively large amount of data
K-S test
 Compare observed with theoretical CDF
 Limited to continuous distribution, known
parameters
 H0: Xi are IID r.v. with CDF F(x)
 H1: Xi are not IID r.v. with CDF F(x)
 Test statistic  From table
K-S test
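 Critical value
 A large T.S. would cause rejection
 Critical values: $\alpha = 0.01$: $1.63/\sqrt{n}$; $\alpha = 0.05$: $1.36/\sqrt{n}$; $\alpha = 0.10$: $1.22/\sqrt{n}$
 Test statistic
$TS = \max\left\{ \max_i \left( \hat{F}(X_i) - \frac{i-1}{n} \right),\ \max_i \left( \frac{i}{n} - \hat{F}(X_i) \right) \right\}$
where the $X_i$ are sorted in increasing order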
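A minimal sketch of the K-S statistic follows, assuming the usual definition as the largest gap between the empirical CDF steps and the hypothesized CDF F. The function names are hypothetical, and the default coefficient 1.36 corresponds to the 0.05 level listed above.

```python
import math


def ks_statistic(data, cdf):
    """Largest gap between the empirical CDF and a hypothesized CDF."""
    xs = sorted(data)
    n = len(xs)
    # enumerate is 0-based, so (i + 1) / n is the step height just after xs[i]
    d_plus = max((i + 1) / n - cdf(x) for i, x in enumerate(xs))
    # ... and i / n is the step height just before xs[i]
    d_minus = max(cdf(x) - i / n for i, x in enumerate(xs))
    return max(d_plus, d_minus)


def ks_critical_value(n, coefficient=1.36):
    """Large-sample critical value coefficient / sqrt(n)."""
    return coefficient / math.sqrt(n)


# Test three made-up observations against U(0, 1), whose CDF is F(x) = x.
sample = [0.7, 0.1, 0.4]
ts = ks_statistic(sample, lambda x: x)
print(ts, ks_critical_value(len(sample)))
```

With only three points the large-sample critical value is a crude guide; the example is sized for hand checking rather than for a realistic test.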
Parameter estimation
 Set of data $x_1, x_2, \dots, x_n$
$\bar{x} = \frac{\sum_i x_i}{n}, \qquad s^2 = \frac{\sum_i x_i^2 - n\bar{x}^2}{n-1}$
 Method of moments => equate E(X) and
V(X) to $\bar{x}$ and $s^2$
Parameter estimation
 Maximum likelihood => find the parameter value
that maximizes the likelihood of obtaining the
given sample
 Produces efficient and consistent estimates
 Not always unbiased
 Superior properties to methods of moments
 Common sense.
Statistical Analysis of Simulations
 As previously mentioned, output data from
simulation always exhibit random variability, since
random variables are input to the simulation model.
 We must utilize statistical methods to analyze
output from simulations.
 The overall measure of variability is generally
stated in the form of a confidence interval at a given
level of confidence.
 Thus, the purpose of the statistical analysis is to
estimate this confidence interval.
Output analysis
 Need multiple observations to estimate
variability
 $Y_1, Y_2, Y_3, \dots, Y_n$
 Estimate a confidence interval for the
measure of performance
 Estimate the number of observations
required to obtain the desired precision
Output analysis
 What is an observation?
 Is observation a sample statistic or time
average statistic?
 Is this a steady state simulation or
terminating simulation?
 Are the observations independent or
correlated?
Terminating vs Steady State Simulation
 Often, the type of model determines which
type of output analysis is appropriate for a
particular simulation.
 However, the system or model may not
always be the best indicator of which
simulation would be the most appropriate.
 It is quite possible to use the terminating
simulation approach for systems more
suited to steady-state simulations, and vice
versa.
Observation vs Time Based
 Observation (Sample)
 Average Time In System
 Average Time In Queue
 Time Based
 Average Number in System
 Average Number in Queue
 Machine Utilization
Terminating simulation
 Simulation in which the output measure of
performance is defined over a specific
interval of time with a specific starting
condition and a specific ending condition
Retail sales during a business day
Project network
Time to produce a batch of parts in a work cell
Military Simulations
Terminating simulation
 Has a specified starting and ending
condition.
 Each observation must have the same
starting and ending.
 Observations are obtained by replication.
Use a different seed for random number
generation.
Steady state simulation
 Simulation in which the output measure of
performance is defined over an infinite
interval of time independent of the initial
state of the system and stopping condition
 Average production from an assembly line of
well trained employees
 Inventory simulation
Steady state simulation
 Independent of starting and ending
condition.
 Remove initial condition bias
 Specify warm-up period (transient period) .
 Set initial condition to steady state.
 Have a very long run length
Steady state simulation
 1. Individual observations: average the individual $Y_i$.
 2. Replication: $Y_i$ is the average from each replication.
 3. Batch means: batch by time or by number of observations.
Terminating vs. Steady state simulation
 Terminating
 Observations are obtained by replication
 Each observation must reflect the specified
starting and ending condition
 Use a different seed for each replication
 $Y_1, Y_2, \dots, Y_r$ => one independent
observation per replication
Confidence interval for steady state
simulation
 $Y_1, Y_2, \dots, Y_n$
 Trying to estimate a long run performance
measure independent of starting and
ending conditions
 Two problems
 Initial condition bias
 Dependent observations
Confidence interval for steady state
simulation
 Outline
 Removing initial condition bias
 Creating independent observation
 Replication/ deletion
 Batch means
Confidence interval for replication
 Let $Y_1, Y_2, Y_3, \dots, Y_R$ be measures of
performance from R independent
replications.
 Independent -> different seed for each run
$\bar{Y} \pm t_{R-1,\ 1-\alpha/2} \sqrt{\frac{S^2}{R}}$
$S^2 = \frac{\sum_i Y_i^2 - R\bar{Y}^2}{R-1}$
Confidence interval for replication
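 Approximate, because it requires $Y_i \sim$ Normal
 $(1-\alpha)$ confidence interval => probability
of containing the true mean
$\mathrm{Var}(\bar{Y}) = \mathrm{Var}\left( \frac{1}{R} Y_1 + \dots + \frac{1}{R} Y_R \right) = \frac{1}{R^2} \left( \mathrm{Var}(Y_1) + \dots + \mathrm{Var}(Y_R) \right) = \frac{R S^2}{R^2} = \frac{S^2}{R}$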
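A confidence interval from independent replications can be sketched as below, using the sample mean plus or minus a t critical value times the square root of S^2 / R. The replication outputs are invented, and the t critical value is passed in from a table, since the Python standard library has no Student t quantile function.

```python
import math
import statistics


def replication_ci(y, t_crit):
    """Confidence interval for the mean of R replication outputs."""
    r = len(y)
    y_bar = statistics.mean(y)
    s2 = statistics.variance(y)              # sample variance with R - 1 in the denominator
    half_length = t_crit * math.sqrt(s2 / r)
    return y_bar - half_length, y_bar + half_length


# Five hypothetical replication averages; 2.776 is the tabulated t value for
# 4 degrees of freedom at the 95% confidence level.
low, high = replication_ci([4.0, 5.0, 6.0, 5.0, 5.0], t_crit=2.776)
print(round(low, 3), round(high, 3))
```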
Number of replication needed
 Suppose we desire a confidence interval
$\bar{Y} \pm I$, where I is the half-length
 Based on a preliminary run of $R_0$
replications, we have an estimate of $S^2$ and the
confidence interval
$\bar{Y} \pm t_{R_0-1,\ 1-\alpha/2} \sqrt{\frac{S^2}{R_0}}$
Number of replication needed
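 Find R such that
$I \ge t_{R-1,\ 1-\alpha/2} \sqrt{\frac{S^2}{R}}$
 If R is large, $t_{R-1,\ 1-\alpha/2} \approx z_{1-\alpha/2}$, so
$R \ge \left( \frac{z_{1-\alpha/2}\, S}{I} \right)^2$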
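For large R the t critical value is close to the normal one, which gives the usual approximation R >= (z * S / I)^2. The sketch below assumes that approximation; the pilot-run standard deviation and target half-length are made-up numbers.

```python
import math
from statistics import NormalDist


def replications_needed(s, half_length, alpha=0.05):
    """Approximate number of replications for a desired confidence-interval half-length."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # normal quantile used in place of t
    return math.ceil((z * s / half_length) ** 2)


# Hypothetical pilot run: S = 2.0, desired half-length I = 0.5, 95% confidence.
print(replications_needed(s=2.0, half_length=0.5))
```

Because the normal quantile is slightly smaller than the corresponding t quantile, the result is best treated as a lower bound to be rechecked once the replications are run.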
Test for comparing two means
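$H_0: \mu_1 - \mu_2 = 0$
$H_1: \mu_1 - \mu_2 \ne 0$
Two approaches:
 Form a $(1 - \alpha)$ confidence interval on $\mu_1 - \mu_2$:
$\bar{Y}_1 - \bar{Y}_2 \pm t_{\alpha/2,\ r} \sqrt{V(\bar{Y}_1 - \bar{Y}_2)}$
Reject H0 if the confidence interval does not contain 0.
 Perform a t test:
$t = \frac{(\bar{Y}_1 - \bar{Y}_2) - 0}{\sqrt{V(\bar{Y}_1 - \bar{Y}_2)}}$
Reject if $|t| > t_{r,\ \alpha/2}$, where r is the degrees of freedom
Assumptions
Case 1: $Y_1, Y_2, \dots, Y_{R_1}$ => $\bar{Y}_1, s_1^2$
Case 2: $Y_1, Y_2, \dots, Y_{R_2}$ => $\bar{Y}_2, s_2^2$
Observations are independent
Observations are normally distributed
Variances are unknown/known
Variances are equal/unequal
Observations are paired/unpaired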
Test for comparing two means
Equal Variance
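1. Assumptions: independent, normal, unknown, unpaired, equal
variance.
2. $S_p^2 = \frac{(R_1 - 1) S_1^2 + (R_2 - 1) S_2^2}{R_1 + R_2 - 2} = \frac{\sum_i (Y_{i1} - \bar{Y}_1)^2 + \sum_i (Y_{i2} - \bar{Y}_2)^2}{R_1 + R_2 - 2}$
3. $\mathrm{Var}(\bar{Y}_1 - \bar{Y}_2) = \mathrm{Var}(\bar{Y}_1) + \mathrm{Var}(\bar{Y}_2) = \frac{S_p^2}{R_1} + \frac{S_p^2}{R_2}$
4. $(1 - \alpha)$ confidence interval: $\bar{Y}_1 - \bar{Y}_2 \pm t_{\alpha/2,\ R_1 + R_2 - 2} \sqrt{\frac{S_p^2}{R_1} + \frac{S_p^2}{R_2}}$
5. t-test: $t = \frac{\bar{Y}_1 - \bar{Y}_2}{S_p \sqrt{\frac{1}{R_1} + \frac{1}{R_2}}}$, t-crit = $t_{R_1 + R_2 - 2,\ \alpha/2}$
6. Note: Many simulations do not have equal variance.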
Test for comparing two means
One sided test
Need to make hypothesis in advance
Use t test, adjust critical value
Test for comparing two means
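 Test for normal populations with known variances
 Assumptions: independent, normal, known variance,
unpaired, unequal variance.
 2 populations: $X_1 \sim N(\mu_1, \sigma_1^2)$ & $X_2 \sim N(\mu_2, \sigma_2^2)$
 Sample m from $X_1$ & sample n from $X_2$
 Want to test whether $\mu_1 = \mu_2$
 $H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \ne \mu_2$
 Test statistic: $Z_0 = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}}} = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}}}$ under $H_0$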
Test for comparing two means
Unequal Variance
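1. Assumptions: independent, normal, unknown variance,
unpaired, unequal variance.
2. $\mathrm{Var}(\bar{Y}_1 - \bar{Y}_2) = \mathrm{Var}(\bar{Y}_1) + \mathrm{Var}(\bar{Y}_2) = \frac{S_1^2}{R_1} + \frac{S_2^2}{R_2}$
3. $(1 - \alpha)$ confidence interval: $\bar{Y}_1 - \bar{Y}_2 \pm t_{\alpha/2,\ \nu} \sqrt{\frac{S_1^2}{R_1} + \frac{S_2^2}{R_2}}$
with degrees of freedom
$\nu = \frac{\left( \frac{S_1^2}{R_1} + \frac{S_2^2}{R_2} \right)^2}{\frac{(S_1^2 / R_1)^2}{R_1 - 1} + \frac{(S_2^2 / R_2)^2}{R_2 - 1}}$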
Test for comparing two means
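 Paired Test
 Assumptions: independent, normal, unknown variance,
equal # of replications
 Case 1: $Y_{11}, Y_{21}, \dots, Y_{R1}$
Case 2: $Y_{12}, Y_{22}, \dots, Y_{R2}$
Differences: $d_1, d_2, \dots, d_R$, where $d_i = y_{i1} - y_{i2}$
$\bar{d} = \frac{\sum_i d_i}{R}, \qquad S_d^2 = \frac{\sum_i (d_i - \bar{d})^2}{R - 1}$
 $H_0: \mu_1 - \mu_2 = 0 \Leftrightarrow \mu_d = 0$
$H_1: \mu_1 - \mu_2 \ne 0 \Leftrightarrow \mu_d \ne 0$
 $(1 - \alpha)$ confidence interval: $\bar{d} \pm t_{\alpha/2,\ R-1} \sqrt{\frac{S_d^2}{R}}$
 $t = \frac{\bar{d}}{\sqrt{S_d^2 / R}}$, where $V(\bar{d}) = \frac{S_d^2}{R}$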
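The paired test reduces to a one-sample t test on the differences between matched replications. A minimal sketch follows, with invented outputs for two system configurations; the resulting t value is compared with a tabulated critical value with R - 1 degrees of freedom.

```python
import math
import statistics


def paired_t(y1, y2):
    """Return (mean difference, variance of differences, t statistic) for paired outputs."""
    if len(y1) != len(y2):
        raise ValueError("paired test needs the same number of replications")
    d = [a - b for a, b in zip(y1, y2)]       # d_i = y_i1 - y_i2
    r = len(d)
    d_bar = statistics.mean(d)
    s2_d = statistics.variance(d)              # divides by R - 1
    t = d_bar / math.sqrt(s2_d / r)
    return d_bar, s2_d, t


# Hypothetical outputs from 4 replications of each configuration.
d_bar, s2_d, t = paired_t([10, 12, 9, 11], [8, 11, 7, 10])
print(d_bar, round(s2_d, 4), round(t, 4))
```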
Test for comparing two variances
F-test for equal variance
1. $H_0: \sigma_1^2 = \sigma_2^2$
$H_1: \sigma_1^2 \ne \sigma_2^2$
2. Test statistic: $F = \frac{S_1^2}{S_2^2}$
3. Critical value: $F_{R_1 - 1,\ R_2 - 1,\ \alpha/2}$
4. Example
F = 5.4/2.55 = 2.12
$\alpha = .10$, $F_{critical} = F_{9,9,.05} = 3.18$, cannot reject H0
Common Random Number
 The process of comparing cases with the
same set of random numbers
 creating identical condition
 Observation
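 Confidence interval
$(\bar{Y}_1 - \bar{Y}_2) \pm t_{R-1,\ \alpha/2} \sqrt{V(\bar{Y}_1 - \bar{Y}_2)}$
$V(\bar{Y}_1 - \bar{Y}_2) = V(\bar{Y}_1) + V(\bar{Y}_2) - 2\,\mathrm{Cov}(\bar{Y}_1, \bar{Y}_2)$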
 Use the paired test
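To illustrate why common random numbers help, the sketch below compares the variance of the difference between two cases when they share the same random numbers and when they do not. The response function `service_cost`, the rates 1.0 and 1.25, and the seeds are all made-up assumptions for the example.

```python
import math
import random

def service_cost(u, rate):
    # Hypothetical response: an exponential time obtained from one U(0,1) number
    return -math.log(1.0 - u) / rate

def difference_variance(common, reps=2000, seed=42):
    """Sample variance of Y1 - Y2 with or without common random numbers."""
    rng1 = random.Random(seed)
    rng2 = random.Random(seed if common else seed + 1)
    d = [service_cost(rng1.random(), 1.0) - service_cost(rng2.random(), 1.25)
         for _ in range(reps)]
    mean = sum(d) / reps
    return sum((x - mean) ** 2 for x in d) / (reps - 1)

var_crn = difference_variance(common=True)
var_ind = difference_variance(common=False)
print(var_crn, var_ind)
```

With identical streams the two outputs are positively correlated, so the covariance term reduces the variance of the difference.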
Random Numbers
 Generation of U(0,1) random numbers:
algorithm used by the RND function
 Generation of random variates from
various distributions: algorithms used by
EXPONENTIAL, UNIFORM, and so on
(these algorithms use U(0,1) random
numbers).
Random Number Generation
 Desirable properties
Fast and efficient
Capable of repeating same sequence
Statistically equivalent to U(0,1)
Independent and dense
Large cycle length or period
Low storage requirements
 Old method: tables
Random Number Generation
 Pseudo random number generators
 A non random sequence of numbers each
completely determined by its predecessor, the
algorithm, and initially, the seed.
Linear Congruential Generator
 Zi = ( a * Zi-1 + C ) mod m
 Z0 = seed
 Ui = Zi / m (Random Number)
 If we choose a, C, and m correctly, then
we achieve the maximum (full) period
 0<= Zi <= m-1
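A minimal Python sketch of the recurrence above. The parameter values a = 5, C = 3, m = 16 and the seed 7 are toy values chosen only so the whole cycle is easy to inspect.

```python
from itertools import islice

def lcg(seed, a, c, m):
    """Yield (Z_i, U_i) pairs from Z_i = (a * Z_{i-1} + c) mod m, U_i = Z_i / m."""
    z = seed
    while True:
        z = (a * z + c) % m
        yield z, z / m

# Toy parameters (assumed for illustration): every value 0..15 appears once
stream = list(islice(lcg(seed=7, a=5, c=3, m=16), 16))
for z, u in stream:
    print(z, u)
```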
Linear Congruential Generator
 Rule For Full Period :
 C is relatively prime to m; that is, no integer other than 1
exactly divides both C and m
 Every prime factor of m is also a prime factor
of a-1
 If m is exactly divisible by 4, then a-1 must
be exactly divisible by 4
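The three conditions can be checked mechanically. The sketch below tests them and, for small m, confirms the result by brute force. The helper names and the test parameters are assumptions made for the example.

```python
import math

def prime_factors(n):
    """Return the set of prime factors of n by trial division."""
    factors, p = set(), 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return factors

def has_full_period(a, c, m):
    cond_a = math.gcd(c, m) == 1                                # C relatively prime to m
    cond_b = all((a - 1) % p == 0 for p in prime_factors(m))    # prime factors of m divide a-1
    cond_c = (m % 4 != 0) or ((a - 1) % 4 == 0)                 # 4 | m implies 4 | a-1
    return cond_a and cond_b and cond_c

def observed_period(a, c, m, seed=0):
    """Cycle length actually reached from the seed (brute force, small m only)."""
    seen, z, i = {}, seed, 0
    while z not in seen:
        seen[z] = i
        z = (a * z + c) % m
        i += 1
    return i - seen[z]

print(has_full_period(5, 3, 16), observed_period(5, 3, 16))
print(has_full_period(3, 3, 16), observed_period(3, 3, 16))
```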
Linear Congruential Generator
 A full period does NOT mean always a
good random number generator
Multiplicative Generators
 Zi = a * Zi-1 mod m
 Z0 = seed
 Saves an addition, more popular
Multiplicative Generators
 C = 0
m divides both m and C, so C is not relatively prime to m
Condition (a) is violated
Not full period
P = m - 1 is the largest available period
Multiplicative Generators
 m = 2^b is not a good choice for m
only 1/4 of the possible numbers can be obtained (period at most 2^(b-2))
 Let m = 2^b - 1 (for example, the prime 2^31 - 1)
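A small numeric check of this point, using toy values that are assumptions for illustration (b = 5, multipliers a = 5 and a = 3): with m = 2^5 = 32 the cycle covers only a quarter of the integers, while with the prime m = 2^5 - 1 = 31 a suitable multiplier reaches the period m - 1.

```python
def multiplicative_period(a, m, seed=1):
    """Cycle length of Z_i = a * Z_{i-1} mod m (requires gcd(a, m) = 1, seed != 0)."""
    z, steps = (a * seed) % m, 1
    while z != seed:
        z = (a * z) % m
        steps += 1
    return steps

print(multiplicative_period(5, 32))   # m = 2^5
print(multiplicative_period(3, 31))   # m = 2^5 - 1, a prime
```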
Testing a random number generator
 Testing the distribution
 Generate 1000 or more observations
 Chi-square ($\chi^2$) test or K-S test for U(0, 1)
 Use 100 intervals
 Test for independence
 Runs up
 Tests designed to compare observed and
expected distribution
 E(x) = .5 V(X) = 1/12, where a = 0, b=1
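A sketch of the chi-square test of the distribution with 100 intervals. The generator under test here is Python's own `random.Random`, used only as a stand-in; the seed and the sample size are arbitrary assumptions.

```python
import random

def chi_square_uniform(samples, k=100):
    """Chi-square statistic comparing observed interval counts with U(0,1)."""
    counts = [0] * k
    for u in samples:
        counts[min(int(u * k), k - 1)] += 1   # interval index of u
    expected = len(samples) / k
    return sum((o - expected) ** 2 / expected for o in counts)

rng = random.Random(12345)
stat = chi_square_uniform([rng.random() for _ in range(10000)])
print(stat)
```

The statistic is compared with a chi-square critical value with k - 1 = 99 degrees of freedom (about 123.2 at the 0.05 level); a larger value suggests the numbers are not uniform.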
Random variate generation
 Assume a random number generator is
available to generate Ui ~ U(0, 1)
 Goal: Generate Xi from a specified
distribution with pdf f(x) or pmf p(x), and CDF F(x)
 Three methods
 Inverse transformation method
 Convolution method
 Acceptance\Rejection method
Random variate generation
 Apply these methods to the five
distributions we are using in this class
Uniform
Triangular
Exponential
Normal
Poisson
Inverse transformation method
 General idea: use the CDF
 Select Ui
 Find corresponding xi
 That is, $x_i = F^{-1}(U_i)$
 Advantage of inverse transformation method
 One Ui per xi
 Disadvantage
 The CDF (or its inverse) may not always exist in closed form
Inverse transformation method
 Exponential distribution
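 $f(x) = \lambda e^{-\lambda x}$, $x \ge 0$
 $F(x) = 1 - e^{-\lambda x}$, $x \ge 0$
 $U_i = F(X_i) = 1 - e^{-\lambda X_i}$
 $(1 - U_i) = e^{-\lambda X_i}$
 $\ln(1 - U_i) = -\lambda X_i$
 $X_i = -(1/\lambda) \ln(1 - U_i)$, or equivalently $X_i = -(1/\lambda) \ln(U_i)$, since $1 - U_i$ is also U(0, 1)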
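The exponential case can be sketched as follows. The rate 2.0, the seed, and the sample size are assumptions chosen for illustration.

```python
import math
import random

def exponential_variate(u, lam):
    """Inverse transform: X = -(1/lambda) * ln(1 - U)."""
    return -math.log(1.0 - u) / lam

rng = random.Random(2024)
xs = [exponential_variate(rng.random(), 2.0) for _ in range(20000)]
sample_mean = sum(xs) / len(xs)
print(sample_mean)   # should be close to 1/lambda = 0.5
```

Using 1 - U rather than U avoids taking the logarithm of zero, because `random()` returns values in [0, 1).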
Inverse transformation method
 Triangular distribution
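$F(x) = \frac{(x - a)^2}{(b - a)(c - a)} = u_i$, $a \le x \le b$
$F(x) = 1 - \frac{(c - x)^2}{(c - b)(c - a)} = u_i$, $b \le x \le c$
Is $u_i \le \frac{b - a}{c - a}$?
Yes: $x_i = a + \sqrt{(b - a)(c - a)\, u_i}$
No: $x_i = c - \sqrt{(c - b)(c - a)(1 - u_i)}$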
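A minimal Python sketch of the triangular inverse transform, with minimum a, mode b, and maximum c. The numeric values in the example are arbitrary.

```python
import math

def triangular_variate(u, a, b, c):
    """Inverse transform for a triangular distribution on [a, c] with mode b."""
    if u <= (b - a) / (c - a):
        # Rising part of the density: solve (x - a)^2 / ((b - a)(c - a)) = u
        return a + math.sqrt((b - a) * (c - a) * u)
    # Falling part: solve 1 - (c - x)^2 / ((c - b)(c - a)) = u
    return c - math.sqrt((c - b) * (c - a) * (1.0 - u))

print(triangular_variate(0.5, 0.0, 1.0, 3.0))
```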
Convolution Method
 Applicable to situations where the random
variable of interest can be expressed as a
sum of other random variables that are IID
(independent and identically distributed)
X = Y1 + Y2 + Y3 + ... + Yn
 Idea: Generate Y1, ..., Yn and add these up
to calculate X
Convolution Method
 Normal distribution
 Focus: Generating Zi ~ N(0, 1)
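$Z_i = \frac{x_i - \mu}{\sigma} \Rightarrow x_i = \mu + \sigma Z_i \sim N(\mu, \sigma^2)$
 Generating Zi
$f(z) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2} z^2}$
 Inverse transformation: F(x) does not exist in closed form
 Acceptance\Rejection: Not bounded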
Convolution Method
 Normal distribution
Generate Ui
Generate Zi
Then $X_i = \mu + \sigma Z_i$
Zi~N(0,1)
Acceptance\Rejection Method
 Applicable to distribution functions that
are hard to integrate
 Idea
 Find a majorizing function t(x) where t(x) >= f(x)
 Sample a value of x from t(x); call it x*
 Sample Ui; if Ui < f(x*) / t(x*), accept x*
 Simplification for this class: we will
always use a rectangular majorizing function
9.3 Random Numbers and Monte
Carlo Simulation
 The procedure of generating these times from the
given probability distributions is known as
sampling from probability distributions, or
random variate generation, or Monte Carlo
sampling.
 We will discuss several different methods of
sampling from discrete distributions.
 The principle of sampling from discrete
distributions is based on the frequency
interpretation of probability.
 In addition to obtaining the right frequencies, the
sampling procedure should be independent; that is,
each generated service time should be independent
of the service times that precede it and follow it.
 This procedure of segmentation and using a roulette
wheel is equivalent to generating integer random
numbers between 00 and 99.
 This follows from the fact that each random number
in a sequence has an equal probability of showing
up, and each random number is independent of the
numbers that precede and follow it.
 A random number, Ri, is defined as an
independent random sample drawn from a
continuous uniform distribution whose
probability density function (pdf) is given by
$f(x) = \begin{cases} 1 & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}$
Random Number Generators
 Since our interest in random numbers is for use
within simulations, we need to be able to generate
them on a computer.
 This is done by using mathematical functions called
random number generators.
 Most random number generators use some form of a
congruential relationship. Examples of such
generators include the linear congruential generator, the
multiplicative generator, and the mixed generator.
 The linear congruential generator is by far the most
widely used.
Each random number generated using these methods
will be a decimal number between 0 and 1.
Random numbers generated using congruential
methods are called pseudorandom numbers.
Random number generators must have these
important characteristics:
1. The routine must be fast;
2. The routine should not require a lot of core storage;
3. The random numbers should be replicable; and
4. The routine should have a sufficiently long cycle.
 Most programming languages have built-in
library functions that provide random (or
pseudorandom) numbers directly.
Computer Generation of Random
Numbers
 We now take the method of Monte Carlo sampling
a stage further and develop a procedure using
random numbers generated on a computer.
 The idea is to transform the U(0,1) random
numbers into integer random numbers between 00
and 99 and then to use these integer random
numbers to achieve the segmentation by numbers.
 We now formalize this procedure and use it to
generate random variates for a discrete random
variable.
 The procedure consists of two steps:
1. We develop the cumulative probability
distribution (cdf) for the given random
variable, and
2. We use the cdf to allocate the integer random
numbers directly to the various values of the
random variables.
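The two steps can be sketched as follows. The values and probabilities are hypothetical, chosen only to show how the integer random numbers 00 to 99 are allocated.

```python
import random
from itertools import accumulate

def build_allocation(values, probs):
    """Step 1: cdf; step 2: largest integer random number (00-99) mapped to each value."""
    cdf = list(accumulate(probs))
    return [(round(100 * c) - 1, v) for c, v in zip(cdf, values)]

def discrete_variate(rn, allocation):
    """Return the value whose range of integer random numbers contains rn."""
    for upper, value in allocation:
        if rn <= upper:
            return value
    raise ValueError("rn must be between 0 and 99")

# Hypothetical service times (minutes) and their probabilities
allocation = build_allocation([1, 2, 3, 4], [0.10, 0.35, 0.40, 0.15])
print(allocation)   # ranges 00-09, 10-44, 45-84, 85-99

rng = random.Random(1)
rn = int(100 * rng.random())          # transform U(0,1) into an integer 00..99
print(rn, discrete_variate(rn, allocation))
```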
9.4 An Example of Monte Carlo
Simulation
 The book uses a Monte Carlo simulation to
simulate a news vendor problem.
 The procedure in this simulation is different from
the queuing simulation, in that the present
simulation does not evolve over time in the same
way.
 Here, every day is an independent simulation.
Such simulations are commonly referred to as
Monte Carlo simulations.
9.5 Simulations with Continuous
Random Variables
 In many simulations, it is more realistic and
practical to use continuous random variables.
 We present and discuss several procedures for
generating random variates from continuous
distributions.
 The basic principle is similar to the discrete case.
 We first generate a U(0,1) random number and then
transform it into a random variate from the
specified distribution.
 The selection of a particular algorithm will
depend on the distribution from which we want
to generate, taking into account such factors as
the exactness of the random variables, the
computations and storage efficiencies, and the
complexity of the algorithm.
 The two most commonly used algorithms are the
inverse transformation method (ITM) and the
acceptance-rejection method (ARM).
Inverse Transformation Method
 The inverse transformation method is generally used
for distributions whose cumulative distribution
function can be obtained in closed form.
 Examples include the exponential, the uniform, the
triangular, and the Weibull distributions.
 For distributions whose cdf does not exist in closed
form, it may be possible to use some numerical
method, such as a power-series expansion, within
the algorithm to evaluate the cdf.
 The ITM is relatively easy to describe and
execute.
 It consists of the following steps:
 Step 1: Given the probability density function f(x) for a
random variable X, obtain the cumulative distribution
function F(x) as
$F(x) = \int_{-\infty}^{x} f(t)\,dt$
 Step 2: Generate a random number r.
 Step 3: Set F(x) = r and solve for x.
 We consider the distribution given by the function
$f(x) = \begin{cases} \frac{x}{2} & 0 \le x \le 2 \\ 0 & \text{otherwise} \end{cases}$
 A function of this type is called a ramp function.
 To obtain random variates from the distribution
using the inverse transformation method, we first
compute the cdf as
$F(x) = \int_0^x \frac{t}{2}\,dt = \frac{x^2}{4}$, $0 \le x \le 2$
 In Step 2, we generate a random number r.
 Finally, in Step 3, we set F(x) =r and solve for x.
$\frac{x^2}{4} = r$
$x = \pm 2\sqrt{r}$
 Since the service times are defined only for positive
values of x, a service time of $x = -2\sqrt{r}$ is infeasible,
leaving $x = 2\sqrt{r}$ as the solution
for x. This equation is called a random variate
generator or a process generator.
 Thus, to obtain a service time, we first generate a
random number and then transform it using the
preceding equation.
 As this example shows, the major
advantage of the inverse transformation
method is its simplicity and ease of
application.
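The ramp-function generator can be sketched in Python as follows. The seed and sample size are arbitrary assumptions; the sample mean is compared with the theoretical mean 4/3 of this density.

```python
import math
import random

def ramp_variate(r):
    """Process generator for f(x) = x/2 on [0, 2]: solve x^2 / 4 = r."""
    return 2.0 * math.sqrt(r)

rng = random.Random(99)
xs = [ramp_variate(rng.random()) for _ in range(20000)]
sample_mean = sum(xs) / len(xs)
print(sample_mean)   # theoretical mean is 4/3
```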
Acceptance-Rejection Method
 There are several important distributions,
including the Erlang (used in queuing models) and
the beta (used in PERT), whose cumulative
distribution functions do not exist in closed form.
 For these distributions, we must resort to other
methods of generating random variates, one of
which is the acceptance-rejection method
(ARM).
 This method is generally used for distributions
whose domains are defined over finite intervals.
 Given a distribution whose pdf, f(x), is defined
over the interval $a \le x \le b$, the algorithm consists
of the following steps:
 Step 1: Select a constant M such that M is the largest
value of f(x) over the interval [a, b].
 Step 2: Generate two random numbers, r1 and r2.
 Step 3: Compute x* = a + (b - a)r1. (This ensures that
each member of [a, b] has an equal chance to be chosen
as x*.)
 Step 4: Evaluate the function f(x) at the point x*. Let
this be f(x*).
 Step 5: If $r_2 \le \frac{f(x^*)}{M}$,
deliver x* as a random variate from the distribution whose
pdf is f(x). Otherwise, reject x* and go back to Step 2.
 Note that the algorithm continues looping back to
Step 2 until a random variate is accepted.
 This may take several iterations. For this reason, the
algorithm can be relatively inefficient.
 The efficiency, however, is highly dependent on the
shape of the distribution.
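Steps 1 to 5 can be sketched as follows. As an assumed test case the sketch reuses the ramp density f(x) = x/2 on [0, 2], for which M = 1; the function names and the seed are illustrative.

```python
import random

def accept_reject(pdf, a, b, M, rng):
    """Return (variate, number of iterations) using a constant bound M >= f(x)."""
    tries = 0
    while True:
        tries += 1
        r1, r2 = rng.random(), rng.random()      # Step 2
        x_star = a + (b - a) * r1                # Step 3
        if r2 <= pdf(x_star) / M:                # Steps 4 and 5
            return x_star, tries

def ramp_pdf(x):
    return x / 2.0 if 0.0 <= x <= 2.0 else 0.0

rng = random.Random(7)
results = [accept_reject(ramp_pdf, 0.0, 2.0, 1.0, rng) for _ in range(20000)]
xs = [x for x, _ in results]
sample_mean = sum(xs) / len(xs)
avg_tries = sum(t for _, t in results) / len(results)
print(sample_mean, avg_tries)
```

For this density about half of the candidates are rejected, so roughly two iterations are needed per accepted variate, which illustrates how efficiency depends on the shape of the distribution.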
 There are several ways by which the method can
be made more efficient.
 One of these is to use a function in Step 1 instead
of a constant.
 We now give an intuitive justification of the
validity of the ARM.
 In particular, we want to show that the ARM does
generate observations from the given random
variable X.
Direct and Convolution Methods for
the Normal Distribution
 Both the inverse transformation method and the
acceptance-rejection method are inappropriate for
the normal distribution, because (1) the cdf does
not exist in closed form and (2) the distribution
is not defined over a finite interval.
 Other methods are therefore needed: an algorithm based on
convolution techniques, and a direct
transformation algorithm that produces two
standard normal variates with mean 0 and
variance 1.
The Convolution Algorithm
 In the convolution algorithm, we make direct use of
the Central Limit Theorem.
 The Central Limit Theorem states that the sum Y of
n independent and identically distributed random
variables (say $Y_1, Y_2, \ldots, Y_n$), each with mean $\mu$ and
finite variance $\sigma^2$, is approximately normally
distributed with mean $n\mu$ and variance $n\sigma^2$.
 If we want to generate a normal variate X with
mean $\mu$ and variance $\sigma^2$, we first generate Z using
this process generator and then transform it using the
relation $X = \mu + \sigma Z$. This relation is unique to the normal distribution.
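One common way to apply the convolution idea, shown here as an assumption since the process generator itself is not reproduced above, is to sum n = 12 U(0,1) numbers: the sum has mean 6 and variance 1, so subtracting 6 gives an approximately standard normal Z.

```python
import math
import random

def normal_by_convolution(mu, sigma, rng, n=12):
    """Approximate N(mu, sigma^2) variate from the sum of n U(0,1) numbers."""
    s = sum(rng.random() for _ in range(n))
    # The sum has mean n/2 and variance n/12; standardize it to get Z
    z = (s - n / 2.0) / math.sqrt(n / 12.0)
    return mu + sigma * z

rng = random.Random(5)
xs = [normal_by_convolution(10.0, 2.0, rng) for _ in range(20000)]
mean = sum(xs) / len(xs)
var = sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)
print(mean, var)   # should be near 10 and 4
```

With n = 12 the generated Z can never fall outside [-6, 6], so the tails are only approximate.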
The Direct Method
 The direct method for the normal distribution
was developed by Box and Muller (1958).
 Although it is not as efficient as some of the newer
techniques, it is easy to apply and execute.
 The algorithm generates two U(0,1) random
numbers, r1 and r2, and then transforms them into
two normal variates, each with mean 0 and
variance 1, using the direct transformation
$Z_1 = (-2 \ln r_1)^{1/2} \sin 2\pi r_2$
$Z_2 = (-2 \ln r_1)^{1/2} \cos 2\pi r_2$
 It is easy to transform these standardized normal
variates into normal variates X1 and X2 from the
distribution with mean $\mu$ and variance $\sigma^2$, using
the equations
$X_1 = \mu + \sigma Z_1$
$X_2 = \mu + \sigma Z_2$
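A minimal Python sketch of the direct transformation. The function name and the seed are assumptions; r1 is drawn as 1 - random() so that the logarithm is never taken at zero.

```python
import math
import random

def box_muller(r1, r2):
    """Transform two U(0,1) numbers into two independent N(0,1) variates."""
    radius = math.sqrt(-2.0 * math.log(r1))
    angle = 2.0 * math.pi * r2
    return radius * math.sin(angle), radius * math.cos(angle)

rng = random.Random(11)
z1, z2 = box_muller(1.0 - rng.random(), rng.random())
mu, sigma = 10.0, 2.0
x1, x2 = mu + sigma * z1, mu + sigma * z2
print(x1, x2)
```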
9.6 An Example of a Stochastic
Simulation
 Cabot Inc. is a large mail order firm in Chicago.
 Orders arrive into the warehouse via telephones. At
present, Cabot maintains 10 operators on-line 24
hours a day.
 The operators take the orders and feed them directly
into a central computer, using terminals.
 Each operator has one terminal. At present, the
company has a total of 11 terminals.
 That is, if all terminals are working, there will be 1
spare terminal.
 Cabot managers believe that the terminal system
needs evaluation, because the downtime of operators
due to broken terminals has been excessive.
 They feel that the problem can be solved by the
purchase of some additional terminals for the spares
pool.
 It has been determined that a new terminal will cost
a total of $75 per week.
 It has also been determined that the cost of terminal
downtime, in terms of delays, lost orders, and so on
is $1000 per week.
 Given this information, the Cabot managers would like
to determine how many additional terminals they
should purchase.
 This model is a version of the machine repair problem.
 It is easy to find an analytical solution to the problem
using the birth-death processes.
 However, in analyzing the historical data for the
terminals, it has been determined that although the
breakdown times can be represented by the exponential
distribution, the repair times can be adequately
represented only by the triangular distribution.
 This implies that analytical methods cannot be used
and that we must use simulation.
 To simulate this system, we first require the
parameters of both the distributions.
 The data show that the breakdown rate is
exponential and equal to 1 per week per terminal.
 In other words, the time between breakdowns for a terminal
is exponential with a mean equal to 1 week.
 Analysis of the repair times shows that this
distribution can be represented by the triangular
distribution, which has a mean of 0.075 week:
$f(x) = \begin{cases} -10 + 400x & 0.025 \le x \le 0.075 \\ 50 - 400x & 0.075 \le x \le 0.125 \end{cases}$
 The repair staff can repair, on average, 13.33
terminals per week.
 To find the optimal number of terminals, we must
balance the cost of the additional terminals against
the increased revenues generated as a result of the
increase in the number of terminals.
 In this simulation we increase the number of
terminals in the system, n, from the present total of
11 in increments of 1.
 For this fixed value of n, we then run our simulation
model to estimate the net revenue.
 Net revenue here is defined as the difference
between the increase in revenues due to the
additional terminals and the cost of these additional
terminals.
 We keep on adding terminals until the net revenue
position reaches a peak.
 To calculate the net revenue, we first compute the
average number of on-line terminals, ELn, for a
fixed number of terminals in the system, n.
 Once we have a value of ELn, we can
compute the expected weekly downtime
costs, given by 1000(10 - ELn).
 Then the increase in revenue as a result of
increasing the number of terminals from 11
to n is 1000(ELn - EL11). Mathematically,
we compute
$EL_n = \frac{\int_0^T N(t)\,dt}{T} = \frac{\sum_{i=1}^{m} A_i}{T}$
where
T = length of simulation
N(t) = number of terminals on-line at time t ($0 \le t \le T$)
Ai = area of rectangle under N(t) between e(i-1) and e(i)
(where e(i) is the time of the ith event)
m = number of events that occur in the interval [0,T]
 Between time 0 and time e1, the time of the first
event, the total on-line time for all the terminals is
given by 10e1, since each terminal is on-line for a
period of e1 time units.
 If we now run this simulation over T time units and
sum up the areas A1, A2, A3, ..., we can get an
estimate for EL10 by dividing this sum by T. This
statistic is called a time-average statistic.
 We would like to set up the process in such a way that
it will be possible to collect the statistics to
compute the areas A1, A2, A3, ....
 That is, as we move from event to event, we would
like to keep track of at least the number of terminals
on-line between the events and the time between
events.
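The time-average statistic can be sketched as a sum of rectangle areas. The event times and on-line counts below are made up for illustration, and the sketch assumes the last rectangle extends from the final event to T.

```python
def time_average(event_times, levels, T):
    """Average of a piecewise-constant N(t) over [0, T].

    event_times: sorted event times e_1, ..., e_m within [0, T]
    levels: m + 1 values of N(t), one for each interval between events
            (levels[0] applies on [0, e_1), levels[m] on [e_m, T])
    """
    boundaries = [0.0] + list(event_times) + [T]
    total_area = 0.0
    for i, level in enumerate(levels):
        total_area += level * (boundaries[i + 1] - boundaries[i])   # rectangle area
    return total_area / T

# Hypothetical run: breakdown at t=2, repair at t=5, breakdown at t=6, T=10
el = time_average([2.0, 5.0, 6.0], [10, 9, 10, 9], 10.0)
print(el)
```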
 We first define the state of the system as the
number of terminals in the repair facility.
 The only time the state of the system will change is
when there is either a breakdown or a completion
of a repair.
 Therefore, there are two events in this simulation:
breakdown and completion of repairs.
 To set up the simulation, our first task is to
determine the process generators for both the
breakdown and the repair times.
 We use the ITM to develop the process generators.
 For the exponential distribution the process
generator is simply x = -log r
 In the case of the repair times, applying the ITM gives
us
$x = 0.025 + \sqrt{0.005\,r}$ $(0 \le r \le 0.5)$
and
$x = 0.125 - \sqrt{0.005\,(1 - r)}$ $(0.5 \le r \le 1.0)$
as the process generators.
 For each n, we start the simulation in the state
where there are no terminals in the repair facility.
 In this state, all 10 operators are on-line and any
remaining terminals are in the spares pool.
 Our first action in the simulation is to schedule the
first series of events, the breakdown times for the
terminals presently on-line.
 Having scheduled these events, we next determine
the first event, the first breakdown, by searching
through the current event list.
We then move the simulation clock to the time
of this event and process this breakdown.
To process a breakdown, we take two separate
series of actions
1. Determine whether a spare is available.
2. Determine whether the repair staff is idle.
These actions are summarized in the system
flow diagram shown in Figure 17 of the book.
Otherwise, we process a completion of a repair.
To process the completion of a repair, we also
undertake two series of actions.
1. At the completion of a repair, we have an additional
working terminal, so we determine whether the terminal
goes directly to an operator or to the spares pool.
2. We check the repair queue to see whether any terminals
are waiting to be repaired.
We proceed with the simulation by moving from
event to event until the termination time T.
At this time, we calculate all the relevant measures
of performance from the statistical counters.
 Our key measure is the net revenue for the current
value of n.
 If this revenue is greater than the revenue for a
system with n-1 terminals, we increase the value of
n by 1 and repeat the simulation with n +1
terminals in the system.
 Otherwise, the net revenue has reached a peak.
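The event-to-event procedure described above can be sketched in Python. This is a rough illustration under stated assumptions, not the book's program: it assumes a single repair channel that handles one terminal at a time, that spare terminals do not break down, and an arbitrary run length T = 500 weeks and seed. The function names are hypothetical.

```python
import heapq
import math
import random

OPERATORS = 10

def repair_time(r):
    """Triangular process generator for repair times (weeks)."""
    if r <= 0.5:
        return 0.025 + math.sqrt(0.005 * r)
    return 0.125 - math.sqrt(0.005 * (1.0 - r))

def average_online(n, T=500.0, seed=1):
    """Estimate EL_n, the time-average number of on-line terminals."""
    rng = random.Random(seed)
    breakdown = lambda: -math.log(1.0 - rng.random())   # exponential, mean 1 week
    online, spares, in_repair = OPERATORS, n - OPERATORS, 0
    events = [(breakdown(), "breakdown") for _ in range(online)]
    heapq.heapify(events)                               # event list ordered by time
    clock, area = 0.0, 0.0
    while events and events[0][0] <= T:
        t, kind = heapq.heappop(events)
        area += online * (t - clock)                    # rectangle A_i under N(t)
        clock = t
        if kind == "breakdown":
            online -= 1
            in_repair += 1
            if spares > 0:                              # replace from the spares pool
                spares -= 1
                online += 1
                heapq.heappush(events, (clock + breakdown(), "breakdown"))
            if in_repair == 1:                          # repair staff was idle
                heapq.heappush(events, (clock + repair_time(rng.random()), "repair"))
        else:                                           # completion of a repair
            in_repair -= 1
            if online < OPERATORS:                      # terminal goes to an operator
                online += 1
                heapq.heappush(events, (clock + breakdown(), "breakdown"))
            else:
                spares += 1
            if in_repair > 0:                           # start the next queued repair
                heapq.heappush(events, (clock + repair_time(rng.random()), "repair"))
    area += online * (T - clock)
    return area / T

def net_revenue(n, T=500.0, seed=1):
    """1000 * (EL_n - EL_11) minus 75 per week for each additional terminal."""
    gain = 1000.0 * (average_online(n, T, seed) - average_online(11, T, seed))
    return gain - 75.0 * (n - 11)

el11 = average_online(11)
el14 = average_online(14)
print(el11, el14, net_revenue(14))
```

Calling `net_revenue(n)` for n = 12, 13, ... and stopping when the value no longer increases mirrors the search for the peak described above; in a real study each n would be replicated several times rather than run once.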
 The simulation outlined in this example can be
used to analyze other policy options that
management may have.
 The simulation model provides a very
flexible mechanism for evaluating
alternative policies.