1. Course Notes for
CS 1538
Introduction to Simulation
By
John C. Ramirez
Department of Computer Science
University of Pittsburgh
2. 2
• These notes are intended for use by students in CS1538 at the
University of Pittsburgh and no one else
• These notes are provided free of charge and may not be sold in any
shape or form
• These notes are NOT a substitute for material covered during course
lectures. If you miss a lecture, you should definitely obtain both these
notes and notes written by a student who attended the lecture.
• Material from these notes is obtained from various sources, including,
but not limited to, the following:
Discrete-Event System Simulation, Fifth Edition by Banks, Carson, Nelson
and Nicol (Prentice Hall)
• Also (same title and authors) Third and Fourth Editions
Object-Oriented Discrete-Event Simulation with Java by Garrido (Kluwer
Academic/Plenum Publishers)
Simulation Modeling and Analysis, Third Edition by Law and Kelton
(McGraw Hill)
A First Course in Monte Carlo by George S. Fishman (Thomson & Brooks/
Cole
3. 3
Goals of Course
• To understand the basics of computer
simulation, including:
Simulation concepts and terminology
When it is useful
Why it is useful
How to approach a simulation
How to develop / run a simulation
How to interpret / analyze the results
4. 4
Goals of Course
• To understand and utilize some of the
mathematics required in simulations
Statistical models and probability distributions
• How various models are defined
• Which models are correct for which situations
Simple queuing theory
• Characteristics
• Performance measures
• Markovian models
5. 5
Goals of Course
Random number theory
• Generating and testing pseudo-random numbers
• Generating pseudo-random values within various
distributions
Analysis / generation of input data
• How is input data generated?
• Is the data correct and appropriate for the
simulation?
Analysis / measurement of output data
• What does the output data mean and what can be
derived from it?
• How confident are we in our results?
6. 6
Goals of Course
• To implement some simulation tools and
some simulation projects
What enhancements do typical programming
languages need to facilitate simulation?
Programming will be done in Java
• Review if you are rusty
• Find / keep a good Java reference
• There are special-purpose simulation languages, but
we will probably not be using them
7. 7
Introduction to Simulation
• What is simulation?
Banks, et al:
• "A simulation is the imitation of the operation of a
real-world process or system over time". It "involves
the generation of an artificial history of a system,
and the observation of that artificial history to draw
inferences … "
Law & Kelton:
• "In a simulation we use a computer to evaluate a
model (of a system) numerically, and data are
gathered in order to estimate the desired true
characteristics of the model"
8. 8
Introduction to Simulation
More specifically (but still superficially)
• We develop a model of some real-world system that
(we hope) represents the essential characteristics of
that system
– Does not need to exactly represent the system – just
the relevant parts
• We use a program (usually) to test / analyze that
model
– Carefully choosing input and output
• We use the results of the program to make some
deductions about the real-world system
http://en.wikipedia.org/wiki/Computer_simulation
• Some interesting info here
9. 9
Introduction to Simulation
• Why (or when) do we use simulation?
• This is fairly intuitive
Consider arbitrary large system X
• Could be a computer system, a highway, a factory, a
space probe, etc.
We'd like to evaluate X under different
conditions
• Option 1: Build system X and generate the
conditions, then examine the results
– This is not always feasible for many reasons:
> X may be difficult to build
> X may be expensive to build
10. 10
Introduction to Simulation
> We may not want to build X unless it is "worthwhile"
> The conditions that we are testing may be difficult or
expensive to generate for the real system
• For example:
– A company needs to increase its production and needs
to decide whether it should build a new plant or it
should try to increase production in the plants it
already has
> Which option is more cost-effective for the company?
– Clearly, building the new plant would be very
expensive and would not be desirable to do unless it is
the more cost-effective solution
– But how can we know this unless we have built the
new plant?
11. 11
Introduction to Simulation
• Another (ongoing) example:
– NASA wants to know if damage on the Space Shuttle
will threaten it upon re-entry
– If they wait until re-entry to make a judgment, it is
already too late
– In this case it is not feasible to do the real-world test
12. 12
Introduction to Simulation
• Option 2: Model system X, simulate the conditions
and use the simulation results to decide
– Continuing with the same first example:
– Model both possibilities for increasing production and
simulate them both
> We then choose the solution that is most economically
feasible
– Continuing with the Space Shuttle example
– Model the damage and the stress that re-entry
imparts on the shuttle
> Determine via a simulation if the damage will threaten
the shuttle or not
> Note the importance of being correct here
13. 13
Introduction to Simulation
• Clearly, this is itself not a trivial task
– Simulations are often large, complex and difficult to
develop
– Just developing the correct system model can be a
daunting task
> There are many variables that must be taken into
account
– However, if a new plant costs hundreds of millions or
even billions of dollars, spending on the order of
thousands (or even hundreds of thousands) of dollars
on a simulation could be a bargain
– Note that with the Shuttle example, most of the work
for this must be done in advance
> Don't have time to design & implement this during the
duration of a flight
14. 14
Introduction to Simulation
When is simulation NOT a good idea?
– See Section 1.2 of Banks text
– We will look at some of the guidelines now
• Don't use a simulation when the problem can be
solved in a "simpler" or more exact way
– Some things that we think may have to be simulated
can be solved analytically
– Ex: Given N rolls of a fair pair of dice, what are the
relative expected frequencies of each of the possible
values {2, 3, 4, … 12} ?
> We could certainly simulate this, "rolling" the dice N
times and counting
> However, based on the probability of each possible
result, we can derive a more exact answer analytically
15. 15
Introduction to Simulation
> How many ways do we have of obtaining each
outcome?
2:1, 3:2, 4:3, 5:4, 6:5, 7:6, 8:5, 9:4, 10:3, 11:2, 12:1
Total of 36 possible outcomes
For N "rolls", the expected frequency of value i is
N * (Pi) = N * (outcomes yielding i / total outcomes)
> For example, for 900 rolls, the expected number of 9s
generated would be 900 * (4 / 36) = 100
> Note that the expected value may not be a whole
number (nor should it necessarily be)
> Given 500 rolls, the expected number of 9s is
500 * (4 / 36) 55.55
– Note: You should be familiar with the general
approach above from CS 0441
> We will be looking at some more complex analytical
models later on
16. 16
Introduction to Simulation
• Don't use a simulation if it is easier or cheaper to
experiment directly on a real system
– Ex: A 24 hour supermarket manager wants to know
how to best handle the cash register during the
"midnight shift":
> Have one cashier at all times
> Have two cashiers at all times
> Have one cashier at all times, and a second cashier
available (but only working as cashier if the line gets
too long)
– Each of these can be done during operating hours
> An extra employee can be used to keep track of queue
data (and would not be too expensive)
> Differences are (likely) not that drastic so that
customers will be alienated
17. 17
Introduction to Simulation
• Don't use a simulation if the system is too complex
to model correctly / accurately
– This is often not obvious
– Can depend on cost and alternatives as well
> However, a bad model may not be helpful and could
actually be harmful
> Ex: With the Space Shuttle, lives were at risk – if the
model predicts incorrectly the results are catastrophic
18. 18
Some Definitions
• System
"A group of objects that are joined together in
some regular interaction or interdependence
toward the accomplishment of some purpose"
(Banks et al)
• Note that this is a very general definition
• We will represent this system in our simulation using
variables (objects) and operations
The state of a system is the variables (and their
values) at one instance in time
19. 19
Some Definitions
• Discrete vs. Continuous Systems
Discrete System
• State variables change at discrete points in time
– Ex: Number of students in CS 1538
> When a registration or add is completed, number of
students increases, and when a drop is completed,
number of students decreases
Continuous System
• State variables change continuously over time
– Ex: Volume of CO2 in the atmosphere
> CO2 is being generated via people (breathing),
industries and natural events and is being consumed by
plants
20. 20
Some Definitions
– Models of continuous systems typically use differential
equations to indicate rate of change of state variables
– Note that if we make the time increment and the unit
of measurement small enough, we may be able to
convert a continuous system into a discrete one
> However, this may not be feasible to do
> Why?
– Also note that systems are not necessarily exclusively
discrete or exclusively continuous
• We will be primarily concerned with Discrete
Systems in this course
21. 21
Some Definitions
• System Components
Entities
• Objects of interest within a system
– Typically "active" in some way
– Ex: Customers, Employees, Devices, Machines, etc
• Contain attributes to store information about them
– Ex: For Customer: items purchased, total bill
• May perform activities while in the system
– Ex: For Customer: shopping, paying bill
– In many cases it is really just the period of time
required to perform the activity
• Note how nicely this meshes with object-oriented
programming
22. 22
Some Definitions
Events
• Instantaneous occurrences that may change the
state of a system
– Note that the event itself does not take any time
– Ex: A customer arrives at a store
– Note that they "may" change the state of the system
> Example of when they would not?
• Endogenous event
– Events occurring within the system
– Ex: Customer moves from shopping to the check-out
• Exogenous event
– Events relating / connecting the system to the outside
– Ex: Customer enters or leaves the store
23. 23
Some Definitions
• System Model
A representation of the system to be used /
studied in place of the actual system
• Allows us to study a system without actually building it
(which, as we discussed previously, could be very
expensive and time-consuming to do)
Physical Model
• A physical representation of the system (often scaled
down) that is actually constructed
– Tests are then run on the model and the results used to
make decisions about the system
– Ex: Development of the "bouncing bomb" in WWII
> http://www.sirbarneswallis.com/Bombs.htm
– Ex: Most things done on Mythbusters
24. 24
Some Definitions
Mathematical Model
• Representing the system using logical and
mathematical relationships
• Simple ex: d = vot + ½ at2
– This equation can be used to predict the distance
traveled by an object at time t
– However, will acceleration always be the same?
• Often this model is fairly complex and defined by the
entities and events
• This is the model we will be using
• However, in order to be useful, the model must be
evaluated in some way
– i.e. The behavior based on the model must be
determined
25. 25
Some Definitions
• Analytical evaluation
– If the model is not too complex we can sometimes
solve it in a closed form using analytical methods
– One type of analytical evaluation is the Markov
process (or Markov chain)
– Nice simple example at:
http://en.wikipedia.org/wiki/Examples_of_Markov_chains
– We will see this more in Section 6.4
– Often problems that are too complex, even if they can
be modeled analytically, are too computation intensive
to be practical
• Simulation evaluation
– More often we need to simulate the behavior of
the model
26. 26
Some Definitions
Deterministic Model
• Inputs to the simulation are known values
– No random variables are used
– Ex: Customer arrivals to a store are monitored over a
period of days and the arrival times are used as input
to the simulation
Stochastic Model
• One or more random variables are used in the
simulation
– Results can only be interpreted as estimates (or
educated guesses) of the true behavior of the system
– Quality of the simulation depends heavily on the
correctness of the random data distribution
> Different situations may require different distributions
27. 27
Some Definitions
– Ex: Customers arrive at a store with exponentially
distributed interarrival times having a mean of 5
minutes
• In most cases we do not know all of the input data
in advance, and at least some random data is
required
– Thus, our simulations will typically use the
stochastic model
28. 28
Some Definitions
Static Model
• Models a system at a single point in time, rather than
over a period of time
• Sometimes called Monte Carlo simulations
– We'll briefly discuss these later (they are interesting
and very useful)
Dynamic Model
• Models a system over time
• Our simulations will typically use this model
• In summary our models will typically be:
discrete, mathematical, stochastic and
dynamic
29. 29
The Clock
• Since we are using the dynamic model, we
need to represent the passage of time
We need to use a clock
Three fundamental approaches to time
progression
• Next-event time advance
– Clock initialized to zero
– As the times of future events are determined, they are
put into the future event list (FEL)
– Clock is advanced to the time of the next most
imminent event, the event is executed and removed
from the list
– See example in Section 3.1.1
30. 30
The Clock
Ex: People (P) using a MAC machine
• Event A == arrival of a customer at MAC machine
• Event C == completion of a transaction by a customer
Clock FEL Event Action
0 (A2,t1), (C1,t2) A1
P1 arrives, is served; Events A2 and
C1 generated, placed in FEL
t1 (C1,t2), (A3,t3) A2
P2 arrives, waits; Event A3 generated,
placed in FEL
t2 (A3,t3), (C2,t4) C1
P1 completes; P2 is served; Event C2
generated, placed in FEL
t3 (A4,t5), (C2,t4) A3
P3 arrives, waits; Event A4 generated,
placed in FEL (note: t5<t4)
t5 (C2,t4), (A5,t6) A4
P4 arrives, waits; Event A5 generated,
placed in FEL
t4 (A5,t6), (C3,t7) C2
P2 completes; P3 is served; Event C3
generated, place in FEL
31. 31
The Clock
• Fixed-increment time advance (activity scanning)
– Clock initialized to zero
– Clock is incremented by a fixed amount (ex. 1)
– With each increment, list of events is checked to see
which should occur (could be none)
– Clock is typically easier to implement in this way
– However, execution is less efficient, esp. if time
between events is large
> Potentially many scans for each event
32. 32
The Clock
• Process-interaction approach
– Entities are associated with processes
– Processes interact as entities progress through system
– Could delay while waiting for a resource, or during an
interaction with another process
> Can be implemented with multithreading or
multiprocessing
33. 33
Simple Example
• Let's consider a very simple example:
Single-Channel Queue (Example 2.5 in text)
• Small grocery store with a single checkout counter
• Customers arrive at the checkout at random between
1 and 8 minutes apart (uniform)
• Service times at the counter vary from 1 to 6
minutes
– P(1) = 0.1, P(2) = 0.2, P(3) = 0.3, P(4) = 0.25
P(5) = 0.1, P(6) = 0.05
• Start with first customer arriving at time 0
• Run for a given number of customers (text uses 100)
• Calculate some results that may be useful
34. 34
Simple Example
The entities are the customers
The system is discrete since states are changed
at specific points in time
• ex: a customer arrives or leaves
The model is mathematical (since we don't
have real customers)
The model is stochastic since we are generating
random arrivals and random service times
The model is dynamic since we are progressing
in time
35. 35
Simple Example
What results are we interested in?
• In this simple case we may want to know
– What fraction of customers have to wait in line
– What is the average amount of time that they wait
– What is the fraction of time the cashier is idle (or
busy)
• We probably want to do several runs and get
cumulative results over the runs (ex: averages)
• There are more complex statistics that may be
relevant
– We will discuss some of these later
36. 36
Simple Example
We can program this example, but in this
simple case we could also use a table or
spreadsheet to obtain our results
• Let's first look at an "Excel novice" approach to this
– See sim1.xls
• Although some of the spreadsheet formulas require
some thought, this is fairly simple to do
• Note that each row in the spreadsheet depends only
on some local data (generated in that row) and the
data in the previous row
– We do not need a "memory" of all rows
• Authors have a much nicer spreadsheet with macros
– See http://www.bcnn.net
37. 37
Programming a Simple Example
• If we do program it, how would we do it?
Using Java, it is logical to do it in an object-
oriented way
Let's think about what is involved
• We need to represent our entities
– As text indicates, for this simple example we do not
have to explicitly represent them
– However, we can do it if we want to – and have our
Customers and CheckOut as simple Java objects
• We need to represent our events
– We need to store events in our Future Event List (FEL)
and we have two different kinds of events (arrival of a
customer, finish of a checkout)
38. 38
Programming a Simple Example
> We need to distinguish between the different event
types (since different actions are taken for different
events)
> We need to order our events based on the simulation
clock time that they will occur
– Thus we probably need to explicitly represent the
events in some way
> Use classes and inheritance to represent the different
events
> This enables events to share characteristics but also to
be distinguished from each other
> So we need a event time instance variable and a
method to compare event times
> Look at SimEvent.java, ArrivalEvent.java,
CompletionEvent.java
39. 39
Priority Queue to Represent the FEL
• We need to represent the FEL itself
– Since we are inserting items and then removing them
based on priority (earliest next time of an event is
removed first), we should use a priority queue (PQ)
with the following operations:
> add (Object e) – add a new Object to the PQ
> remove() – remove and return the Object with the min
(best) priority value
> peek() – return the Object with the min (best) priority
value without removing it
– It's also a good idea to have some helper methods
> size() – how many items are in the PQ
> isEmpty() – is the PQ empty
– There are variations of these ops depending on the
implementation, but the idea is the same
40. 40
Priority Queue to Represent the FEL
– How to efficiently implement a Priority Queue?
> How about an unsorted array or linked list?
> add is easy but remove is hard – why? – discuss
> How about a sorted array or linked list?
> removeMin is easy but add is hard – why? – discuss
– Neither implementation is adequate in terms of
efficiency
> Note that the premise of a PQ is that everything that is
inserted is eventually removed
> Thus, with N adds you have N removes
> Discuss / show on board overall time required for both
implementations
> You may have seen this already in CS 1501
– Thus we need a better approach
> Implementation of choice is the Heap
41. 41
Heap Implementation of a Priority Queue
– Idea of a Heap:
> Store data in a partially ordered complete binary tree
such that the following rule holds for EACH node, V:
Priority(V) betterthan Priority(LChild(V))
Priority(V) betterthan Priority(RChild(V))
> This is called the HEAP PROPERTY
> Note that betterthan here often means smaller
> Note also that there is no ordering of siblings – this is
why the overall ordering is only a partial ordering
– ex: 10
30
40 70
90 45
20
80
35
85
42. 42
Heap Implementation of a Priority Queue
– How to do our operations?
> peek() is easy – return the root
> add() and remove() are not so obvious
> Let's look at them separately
– add(Object e)
> We want to maintain the heap property
> However, we don't know where in advance the new
object will end up
> We also don't want a lot of rearranging or searching if
we can avoid it – remember time is key
> Solution: Add new object at the next open leaf in the
last level of the tree, then push the node UP the tree
until it is in the proper location
> This operation is called upHeap
> See example on board
43. 43
Heap Implementation of a Priority Queue
– remove()
> Clearly, the min node is the root
> However, removing it will disrupt the tree greatly
> How can we solve this problem?
• Remember BST delete?
– Did not actually delete the root, but
rather the _______________ (fill in blank)
• We will do a similar thing with our Heap
– Copy the last leaf to the root and delete
(easily) the leaf node
– Then re-establish the heap property by a
downHeap
– See example on board
44. 44
Heap Implementation of a Priority Queue
– Run-Time?
> Since our tree is complete, it is balanced and thus for N
nodes has a height of ~ lgN
> Thus upHeapand downHeap require no more than ~lgN
time to complete
> Thus, if we have N adds and N removeMins, our total
run-time will be NlgN
> This is a SIGNIFICANT improvement of the simpler
implementations, especially for a long simulation
> Ex: Compare N2 with NlgN for N = 1M (= 220)
– Note:
> For our simple example, a heap is probably not
necessary, since we have few items in our FEL at any
given time
> However, for more complex simulations, with many
different event types, a heap is definitely preferable
45. 45
Implementing a Heap
– How to Implement a Heap?
> We could use a linked binary tree, similar to that used
for BST
Will work, but we have overhead associated with
dynamic memory allocation and access
> But note that we are maintaining a complete binary tree
for our heap
> It turns out that we can easily represent a complete
binary tree using an array
We simply must map the tree locations onto the
array indexes in a reasonable / consistent way
– Idea:
> Number nodes row-wise starting at 0 (some
implementations start at 1)
> Use these numbers as index values in the array
46. 46
Implementing a Heap
> Now, for node at index i
> See example on board
– Now we have the benefit of a tree structure with the
speed of an array implementation
• So now should we write the code?
– No! Luckily, in JDK 1.5 a heap-based PriorityQueue
class has been provided!
– It's still a good idea to understand the
implementation, however
– Look at API
Parent(i) = floor((i-1)/2)
LChild(i) = 2i+1
RChild(i) = 2i+2
47. 47
Queue for Waiting Customers
• We need to represent the queue (or line) of
customers waiting at the checkout
– This is a FIFO queue and can simply be implemented
in various ways
> We can use a circular array
> We can use a linked-list
– You should be already familiar with queue
implementations from CS 0445
– In JDK 1.5 Queue is an interface which is
implemented by the LinkedList class
> See API
> Q: Would a similar approach using an ArrayList also be
good?
48. 48
Programming a Simple Example
• We need to represent the clock
– This is fairly easy – we can do it with an integer
> In some cases it might be better to use a double
• We need to implement some activities
– These are actually better defined as the time required
for activities to execute
– Typically interarrival times or service times, either
specified exactly (with deterministic model) or by
probability distributions (with stochastic model)
> In our case, we have the interarrival times of customers
and the time required for checkout, specified by the
distributions shown on pp. 45-46 of the text
> We will discuss various distributions in more detail later
49. 49
Programming a Simple Example
Let's put this all together: GrocerySim.java
• This is a fairly object-oriented implementation, using
newer JDK 1.5 features
Note that there is also a Java version from
authors in Chapter 4
• Look over this one as well
• Does not utilize JDK 1.5 and not quite as object-
oriented
• The author also switches distributions in this
implementation
– Uses an exponential distribution for arrivals
– Uses a normal distribution for service times
> We will look at these later
50. 50
One More Example
• News Dealer's Problem
Example 2.7 in text
Simple inventory problem
• Each day new inventory is produced and used, but is
not carried over to successive days
• Thus, time is more or less removed from this problem
Used where goods are only useful for a short
time
• Ex: newspaper, fresh food
In this case, our goal is to try to optimize our
profit
51. 51
News Dealer's Problem
Specifics of the News Dealer's Problem
• Seller buys N newspapers per day for 0.33 each
• Seller sells newspapers for 0.50 each
• Unused papers are "scrapped' for 0.05 each
• If seller runs out, lost revenue is 0.17 for each not
sold paper
– Text says this is controversial, which is true
– How to predict how many would have been sold?
> Perhaps seller goes home when he/she runs out
> May be a goal to run out every day – easier than
returning the papers for scrap
• See sim2.xls
52. 52
News Dealer's Problem
In fact we do we really need to simulate this
problem at all?
• The data is simple and highly mathematical
• Time is not involved
Let's try to come up with an analytical
solution to this problem
• We have two distributions, the second of which
utilizes the result of the first
• Let's calculate the expected values for random
variables using these distributions
– For a given discrete random variable X, the expected
value,
E(X) = Sum [xi p(xi)] (more soon in Chapter 5)
all i
53. 53
News Dealer's Problem
• Let our random variable, X, be the number of
newspapers sold
– Let's first consider the expected value for each of the
demands of good, fair and poor
Demand Probability Distribution
Demand Good Fair Poor
40 0.03 0.10 0.44
50 0.05 0.18 0.22
60 0.15 0.40 0.16
70 0.20 0.20 0.12
80 0.35 0.08 0.06
90 0.15 0.04 0.00
100 0.07 0.00 0.00
54. 54
News Dealer's Problem
• Egood(X) = (40)(0.03) + (50)(0.05) + (60)(0.15) +
(70)(0.20) + (80)(0.35) + (90)(0.15) +
(100)(0.07) = 75.2
• Efair(X) = (40)(0.10) + (50)(0.18) + (60)(0.40) +
(70)(0.20) + (80)(0.08) + (90)(0.04) +
(100)(0.00) = 61
• Epoor(X) = (40)(0.44) + (50)(0.22) + (60)(0.16) +
(70)(0.12) + (80)(0.06) + (90)(0.00) +
(100)(0.00) = 51.4
Now we need to use the second distribution (of
good, fair and poor days) to determine the
overall expected value
55. 55
News Dealer's Problem
• E(X) = (Egood(X))(0.35) + (Efair(X))(0.45) +
(Epoor(X))(0.20) = 64.05
Now we utilize the expected number of
newspapers sold to find results for each of the
potential number that we stock
• Let sales = expected value calculated above
• Let stock = number vendor purchases
• Let left = stock – sales (only if stock > sales, else 0)
• Let lost = sales – stock (only if sales > stock, else 0)
• Profit = (Min(sales,stock))(0.5) – (stock)(0.33) +
(left)(0.05) – (lost)(0.17)
56. 56
News Dealer's Problem
Stock Profit
40 2.71
50 6.11
60 9.51
70 9.2
80 6.4
90 3.6
100 0.82
Expected profit
values for given
stock amounts
Note that this table
shows that 60 is
the best choice
(more or less
agreeing with the
simulation results)
57. 57
News Dealer's Problem
Is this analytical solution correct?
• Not entirely
• We are using an expected value to derive another
expected value – oversimplifying the actual analysis
• The variance from the expected value will cause our
actual results to differ
• Note that the simulation results are almost identical
to the analytical for small and large inventories
• In the middle there is more variation and this is
where using the expected value is inadequate
• However, as a basis for choosing the best number of
papers to stock, it still works
58. 58
Other Simulation Examples
There are other examples in Chapters 2 and 3
• Read over them carefully
• We may look at some of these types of simulations
later on in the term
59. 59
Simulation Software
• Simulations can be written in any good
programming language
• However, many things that need to be
done in simulations can be built into
languages to make them easier
Random values from various probability
distributions
Tools for modeling
Tools for generating and analyzing output
Graphical tools for displaying results
60. 60
Simulation Software
Look at the various described languages
Our simple queueing example (Example 2.5) is
shown using many of the languages
• Even if you don't completely understand all of the
code, look it over to note some differences
We may look at one of these packages later in
the term if we have time
61. 61
Probability and Statistics in Simulation
• Why do we need probability and statistics
in simulation?
Needed to validate the simulation model
Needed to determine / choose the input
probability distributions
• Needed to generate random samples / values from
these distributions
Needed to analyze the output data / results
Needed to design correct / efficient simulation
experiments
62. 62
Experiments and Sample Space
• Experiment
A process which could result in several different
outcomes
• Sample Space
The set of possible outcomes of a given
experiment
• Example:
Experiment: Rolling a single die
Sample Space: {1, 2, 3, 4, 5, 6}
• Another example?
63. 63
Random Variables
• Random Variable
A function that assigns a real number to each
point in a sample space
Example 5.2:
• Let X be the value that results when a single die is
rolled
• Possible values of X are 1, 2, 3, 4, 5, 6
• Discrete Random Variable
A random variable for which the number of
possible values is finite or countably infinite
• Example 5.2 above is discrete – 6 possible values
64. 64
Random Variables and Probability Distribution
• Countably infinite means the values can be mapped
to the set of integers
– Ex: Flip a coin an arbitrary number of times. Let X
be the number of times the coin comes up heads
• Probability Distribution
For each possible value, xi, for discrete
random variable X, there is a probability of
occurrence, P(X = xi) = p(xi)
p(xi) is the probability mass function (pmf) of
X, and obeys the following rules:
1) p(xi) >= 0 for all i
2) = 1
i
all
i
x
p )
(
65. 65
Random Variables and Probability Distribution
The set of pairs (xi, p(xi)) is the probability
distribution of X
Examples:
• For Example 5.2 (assuming a fair die):
– Probability Distribution:
> {(1, 1/6), (2, 1/6), (3, 1/6), (4, 1/6), (5, 1/6), (6, 1/6)}
• From Example 2.5 for Service Times
– Probability Distribution:
> {(1, 0.1), (2, 0.2), (3, 0.3), (4, 0.25), (5, 0.1), (6, 0.05)}
• From Example 2.7 for Type of Newsday
– Probability Distribution:
> {(0, 0.35), (1, 0.45), (2, 0.20)}
> Note in this case we are assigning the values 0, 1, 2 to
the outcomes somewhat arbitrarily
66. 66
Cumulative Distribution
• Cumulative Distribution Function
The pmf gives probabilities for individual values
xi of random variable X
The cumulative distribution function (cdf), F(x),
gives the probability that the value of random
variable X is <= x, or
F(x) = P(X <= x)
For a discrete random variable, this can be
calculated simply by addition:
F(x) =
x
x
i
i
x
p )
(
67. 67
Cumulative Distribution
Properties of cdf, F:
1) F is non-decreasing
2)
3)
and
P(a < X b) = F(b) – F(a) for all a < b
Ex: Probability that a roll of two dice will
result in a value > 7?
• Discuss
Ex: Probability that 10 flips of a fair coin will
yield between 6 and 8 (inclusive) heads?
• Discuss
0
)
(
lim
x
F
x
1
)
(
lim
x
F
x
68. 68
Expected Value
• Expected Value (for discrete random variables)
Also called the mean
Ex: Expected value for roll of 2 fair dice?
E(X) = (2)(1/36) + (3)(2/36) + (4)(3/36) + (5)(4/36) +
(6)(5/36) + (7)(6/36) + (8)(5/36) + (9)(4/36) +
(10)(3/36) + (11)(2/36) + (12)(1/36)
= 7
• Note that in this case the expected value is an actual
value, but not necessarily
i
all
i
i x
p
x
X
E )
(
)
(
69. 69
Expected Value and Variance
If each value has the same "probability", we
often add the values together and divide by the
number of values to get the mean (average)
• Ex: Average score on an exam
• Variance
We won't prove the identity, but it is useful
]
])
[
[(
)
( 2
X
E
X
E
X
V
)
10
.
5
(
2
2
)]
(
[
)
( Equation
X
E
X
E
70. 70
Expected Value and Variance
• In the original definition, we need to subtract the
mean from each of the X values before squaring
– So we need each X value to calculate the mean AND
AFTER the mean has been calculated
> Must look at them twice
• In the right side of the equation (Equation 5.10 in
the text), we need to calculate the mean of X and
the mean of the squares of X
– We can do this as we process the individual X values
and need to look at them only one time
• Ex: What is the variance of the following group of
exam scores: { 75, 90, 40, 95, 80 }
– Since each value occurs once, we can consider this to
have a uniform distribution
71. 71
Expected Value and Variance
• V(X) using original definition:
E(X) = (75+90+40+95+80)/5 = 76
V(X) = E[(X – E[X])2] = [(75-76)2 + (90-76)2 + (40-76)2
+ (95-76)2 + (80-76)2]/5
= (1 + 196 + 1296 + 361 + 16)/5 = 374
• V(X) using Equation 5.10
E(X) = (75+90+40+95+80)/5 = 76
E(X2) = (5625+8100+1600+9025+6400)/5 = 6150
V(X) = 6150 – (76)2 = 374
– Note that in this case we can add each number to one
sum and its square to another, so we can calculate
our overall answer with one a single "look" at each
number
72. 72
Discrete Distributions
• Discrete Distributions of interest:
Bernoulli Trials and the Bernoulli Distribution
• Consider an experiment with the following properties
– n independent trials are performed
– each trial has two possible results – success or failure
– the probability of success, p and failure, q (= 1 – p) is
constant from trial to trial
– for random variable X, X = 1 for a success and X = 0
for a failure
• Probability Distribution:
P(X = 1) = p
P(X = 0) = 1 – p = q
or 0 for all other values of X
73. 73
Bernoulli Distribution
• Expected Value
– E(X) = (0)(q) + (1)(p) = p
• Variance
– V(X) = [02q + 12p] – p2 = p(1 – p)
A single Bernoulli trial is not that interesting
• Typically, multiple trials are performed, from which
we can derive other distributions:
– Binomial Distribution
– Geometric Distribution
74. 74
Binomial Distribution
Binomial Distribution
• Given n Bernoulli trials, let random variable X denote
the number of successes in those trials
• Note that the order of the successes is not
important, just the number of successes
– Thus, we can achieve the same number of successes
in various different ways
– Since the trials are independent, we can multiply the
probabilities for each trial to get the overall probability
for the sequence
otherwise
n
x
q
p
x
n
x
p
x
n
x
,
0
,
,
1
,
0
,
)
(
75. 75
Binomial Distribution
• Recall that the number of combinations of n items
taken x at a time is
• E(X) = np
– Discuss
• V(X) = npq
• Consider an example:
– Exercise 5.1
– Read
– Do solution on board
)!
(
!
!
x
n
x
n
x
n
76. 76
Binomial Distribution
• Consider again coin-flip ex. on slide 67
• Generally speaking binomial distributions can be
used to determine the probability of a given number
of defective items in a batch, or the probability of a
given number of people having a certain
characteristic
– Ex: The trait of having a klinkled flooje occurs on
average in 10% of Kreptoplomians (krep-tō-plō'-mē-
əns). Given a group of 20 Kreptoplomians, what is
the probability that 3 of them have klinkled floojes?
P(3) = (20 C 3)(0.1)3(0.9)17 = (1140)(0.001)(0.1668)
= 0.1902
> Note that if we wanted "at least 3" the answer would
be different – how to calculate?
77. 77
Geometric Distribution
Geometric Distribution
• Given a sequence of Bernoulli trials, let X represent
the number of trials required until the first success
– i.e. we have x – 1 failures, followed by a success
– Note that the maximum probability for this is at X = 1,
regardless of p and q
• E(X) = 1/p
• V(X) = q/p2
– We will omit the proofs of the above, since they are
fairly complex (involving series solutions)
otherwise
x
p
q
x
p
x
,
0
,
2
,
1
,
)
(
1
78. 78
Geometric Distribution
• Ex: What is the probability that the first
Kreptoplomian found to have a klinkled flooje will be
the 5th Kreptoplomian overall?
(0.9)4(0.1) = 0.0656
• Ex: The probability that a certain computer will fail
during any 1-hour period is 0.001
– What is the probability that the computer will survive
at least 3 hours?
> Here p = 0.001 and q = (1 – p) = 0.999
– Using a geometric distribution, we want to solve
P(X >= 4) = 1 – P(1) – P(2) – P(3)
= 1 – (0.001) – (0.999)(0.001)
– (0.999)2(0.001)
= 0.997
79. 79
Geometric Distribution
• The Geometric Distribution is memoryless
– Consider the following two scenarios where p =
probability that a component will fail in the next
hour. Assume the current hour is hour 0.
1) What is the probability that the component will fail by
the end of hour 3?
2) What is the probability that the component will fail by
the end of hour 6, given that it has not failed by the
end of hour 3 ?
> For 1) the solution is P(1) + P(2) + P(3)
> For 2), since the component did NOT fail by the end of
hour 3, and since the probability is for the next hour
(whatever that hour may be), the solution is the same
– We can prove this property with fairly simple algebra
> First we need one additional definition
80. 80
Geometric Distribution
• The conditional probability of an event, A, given that
another event, B, has occurred is defined to be:
• Applying this to the geometric distribution we get
– Clearly, if X > s+t, then X > s (since t cannot be
negative), so we get
)
(
)
(
)
|
(
B
P
B
A
P
B
A
P
)
(
])
[
]
([
)
|
(
s
X
P
s
X
t
s
X
P
s
X
t
s
X
P
)
(
)
(
)
|
(
s
X
P
t
s
X
P
s
X
t
s
X
P
81. 81
Geometric Distribution
– Consider that P(X > s) =
– We can use similar logic to determine that
P(X > s + t) = qs+t
– Now our conditional probability becomes
– and thus we have shown that the geometric
distribution is memoryless
> We will see shortly that the exponential distribution is
also memoryless
1
1
1
1
1
1
)
1
(
s
j
j
j
s
j
j
s
j
j
q
q
q
q
p
q
s
s
s
s
s
s
s
q
q
q
q
q
q
q
3
2
2
1
1
)
(
)
|
( t
X
P
q
q
q
s
X
t
s
X
P t
s
t
s
82. 82
Poisson Distribution
Poisson Distribution
• Often used to model arrival processes with constant
arrival rates
• Gives (probability of) the number of events that occur
in a given period
• Formula looks quite complicated (and NOT discrete),
but it is discrete and using it is not that difficult
• Where is the mean arrival rate. Note that must
be positive
• E(X) = V(X) =
otherwise
x
x
e
x
p
x
,
0
,
1
,
0
,
!
)
(
83. 83
Poisson Distribution
• Note: The Poisson Distribution is actually the
convergence of the Binomial Distribution as the
number of trials, n, approaches infinity
– If we think of n as the number of subintervals of a
given unit of time
– As n , the subintervals get smaller and smaller
– We will skip the detailed math here
– One nice feature of this is that we can use a Poisson
Distribution to approximate a Binomial Distribution
when n is large
x
i
i
i
e
x
F
0 !
)
(
84. 84
Poisson Distribution
• Example
– Number of people logging onto a computer per minute
is Poisson Distributed with mean 0.7
– What is the probability that 5 or more people will log
onto the computer in the next 10 minutes?
– Solution?
>Must first convert the mean to the 10 minute period – if
mean is 0.7 in 1 minute, it will be (0.7)(10) = 7 in a ten
minute period
>Now we can plug in the formula
>P(X >= 5) = 1 – P(0) – P(1) – P(2) – P(3) – P(4)
= 1 – F(4) (where F is the cdf)
= 1 – 0.173 (from Table A.4 in the text)
= 0.827
85. 85
Continuous Random Variables
• Continuous Random Variable
Random variable X is continuous if its sample
space is a range or collection of ranges of real
values
More formally, there exists non-negative function
f(x), called the probability density function, such
that for any set of real numbers, S
a) f(x) >= 0 for all x in the range space
b) – the total area under f(x) is 1
c) f(x) = 0 for all x not in the range space
– Note that f(x) does NOT give the probability that X = x
– Unlike the pmf for discrete random variables
space
range
dx
x
f 1
)
(
86. 86
Continuous Random Variables
• The probability that X lies in a given interval [a,b] is
– We see this visually as the "area under the curve"
– Note that for continuous random variables,
P(X = x) = 0 for any x (see from formula above)
– Rather we always look at the probability of x within a
given range (although the range could be very small)
• The cumulative density function (cdf), F(x) is simply
the integral from - to x or
– This gives us the probability up to x
b
a
dx
x
f
b
X
a
P )
(
)
(
x
dt
t
f
x
F )
(
)
(
87. 87
Continuous Random Variables
• Ex: Consider the uniform distribution on the range
[a,b] (see text p. 189)
– Look at plots on board for example range [0,1]
> What about F(x) when x < a or x > b?
Expected Value for a continuous random variable
– Compare to the discrete expected value
otherwise
b
x
a
if
a
b
x
f
0
1
)
(
x
a
x
a
b
x
a
if
a
b
a
x
dy
a
b
dy
y
f
x
F
1
)
(
)
(
dx
x
xf
X
E )
(
)
(
88. 88
Continuous Random Variables
Variance for continuous random variables
• Defined in same way as for discrete variables
– Calculating it will clearly be different, however
Ex: Uniform Distribution
2
)
(
2
)
)(
(
)
(
2
)
(
2
)
(
2
2
2
a
b
a
b
a
b
a
b
a
b
a
b
a
b
x
a
b
dx
a
b
x
X
E
b
a
90. 90
Uniform Distribution
• Generally speaking we will not be calculating these
values from scratch (good news, in all likelihood!)
– However, it is good (and fun!) to see how it can be
done for at least one (simple) distribution
The Uniform Distribution will be useful primarily
in the generation of other distributions
• Ex: Most random number generators on computers
minimally will provide a uniform value from [0,1)
– We will later see how that can be used to generate
other random variates (See Chapter 8)
91. 91
Exponential Distribution
Given a value > 0, the pdf of the Exponential
Distribution is
• Since the exponent is negative, the pdf will decrease
as x increases – see shape on board and p. 191
• is the rate: number of occurrences per time unit
– Ex: arrivals per hour; failures per day
• Note that 1/ can thus be considered to be the time
between events or the duration of events
– Ex: 10 arrivals per hour 1/10 hr (6min) average
between arrivals
– Ex: 20 customers served per hour 1/20 hr (3min)
average service time
otherwise
x
e
x
f
x
,
0
0
,
)
(
92. 92
Exponential Distribution
• Some more definitions
– See shape of cdf on board and p. 191
• Ex: Assume the hard drives manufactured by
Herb's Hard Drives have a mean lifetime of 4 years,
exponentially distributed
a) What is the probability that one of Herb's Hard
Drives will fail within the first two years?
b) What is the probability that one of Herb's Hard
Drives will last at least 8 years (or twice the mean)?
x
x
t
x
e
dt
e
x
x
F
X
V
X
E
0
2
0
,
1
0
,
0
)
(
1
)
(
1
)
(
93. 93
Exponential Distribution
• Like the geometric distribution, the exponential
distribution is memoryless
– Implies that P(X > s+t | X > s) = P(X > t)
– The proof of this is similar in nature to that for the
geometric distribution
> See pp. 193 in text
• Ex: Exercise 5.19 in the text
– Component has exponential time-to-failure with
mean 10,000 hrs
a) Component has already been in operation for its mean
life. What is the prob. that it will fail by 15,000 hrs?
b) Component is still ok at 15,000 hrs. What is the prob.
that it will operate for another 5,000 hrs
The first thing we need to do here is be sure we
understand the problem correctly
94. 94
Exponential Distribution
– First, let's determine our distribution
> X is the time to failure, and is exponentially distributed
with mean 10,000 hours
> This gives us a cumulative distribution
> Here the failure rate = = 1/10000
– a) is pretty clear – we want the probability that it lasts
at most 15,000 given that it has lasted 10,000
> We want P(X <= 15000 | X > 10000)
> Due to the memoryless property of the exponential
distribution, this reduces to P(X <= 5000) which is
1 – e-1/2 = 0.3935
0
,
1
)
( 10000
x
e
x
F
x
95. 95
Exponential Distribution
– b) is trickier (and actually not phrased well)
> Do they mean the prob. that it will last exactly 5000
more hours?
If so, the probability is 0, since continuous
distributions have 0 prob. for a specific value
> Do they mean it will last at most 5000 more hours?
If so, the answer is the same as a) due to the
memoryless property
> Do they mean it will last at least 5000 more hours?
If so, we want 1 – F(5000) = 0.6065
96. 96
Gamma Distribution
Gamma Distribution
• More general than exponential
• Based on the gamma function, which is a continuous
generalization of the the factorial function
base case (1) = 1
– is called the shape parameter
– The gamma function also has , the scale parameter
– This leads the the following pdf:
when
)!
1
(
)
1
(
)
1
(
)
(
otherwise
x
e
x
x
f
x
,
0
0
,
)
(
)
(
)
(
1
97. 97
Gamma Distribution
– These formulas look complicated (and they are)
– However, they allow for more flexibility in the
distributions
> Allow more different curves (See Fig. 5.11)
> See also:
http://en.wikipedia.org/wiki/Gamma_distribution
– Note that if =1, the equations simplify to the
exponential distribution with rate
> Look at the formulas to see this
0
,
0
0
,
)
(
)
(
1
)
(
1
)
(
1
)
(
1
2
x
x
dt
e
t
x
F
X
V
x
E
x
t
98. 98
Erlang Distribution
When is an arbitrary positive integer, k, the
Gamma Distribution is also called the Erlang
Distribution of order k
• In general terms, it represents the sum of k
independent, exponentially distributed variables,
each with rate k (= )
X = X1 + X2 + … + Xk leading to
0
,
0
0
,
!
)
(
1
)
(
1
)
(
1
)
(
1
)
(
1
)
(
1
1
1
1
)
(
1
0
2
2
2
2
x
x
i
x
k
e
x
F
k
k
k
k
X
V
k
k
k
X
E
k
i
i
x
k
99. 99
Erlang Distribution
• This allows us to determine probabilities for
sequences of exponentially distributed events
– Note that the rates for all events in the sequence
must be the same
• Ex: Exercise 5.21 in text
– Time to serve a customer at a bank is exponentially
distributed with mean 50 sec
a)Probability that two customers in a row will each
require less than 1 minute for their transaction
b)Probability that the two customers together will require
less than 2 minutes for their transactions
– It is important to recognize the difference between
these two problems
100. 100
Erlang Distribution
– For a) we are looking at the probability that each of
two independent events will be < 1 minute
> In this case, the probability overall is the product of the
two probabilities, each of which is exponential
= rate = 1/mean = 1/50
We want P(X < 60) = F(60) = 1 – e-(1/50)(60) = 0.6988
> Thus the total probability is (0.6988)2 = 0.4883
– For b) we are looking at the probability that two
events together will be < 2 minutes
> In this case the probability overall is an Erlang
distribution with k = 2 and k = 1/50, and we want to
determine P(X < 120) = F(120)
> Substituting into our equation for F, we get
101. 101
Erlang Distribution
– Note that these results are fairly intuitive
> Requiring both to be < 1 minute is more restrictive a
condition than requiring the sum to be < 2 minutes,
and would seem to have a lower probability
– How about if we add another part: Probability that the
next 3 customers will have a cumulative time of more
than 2.5 minutes?
> Now we want P(X > 150) = 1 – F(150)
> But F has changed since we now have 3 events
> Let's do this one on the board
6916
.
0
)
2177
.
0
(
)
0907
.
0
(
1
!
)
50
/
120
(
1
)
120
(
1
2
0
)
50
/
120
(
i
i
i
e
F
102. 102
Normal Distribution
A very common distribution is the Normal
Distribution
• It has some nice properties
– Discuss these
• The pdf for the normal distribution is also quite
complex
– We won't even show it here
– However, we can use tables and another nice property
to allow solution for arbitrary normal distributions
)
(
))
(
max(
)
(
)
(
)
(
lim
0
)
(
lim
f
x
f
x
f
x
f
x
f
x
f x
x
103. 103
Normal Distribution
• Define (z) to be the normal distribution with mean
() 0 and variance (2) 1
= N(0, 1)
– We call this the standard normal distribution
– This can be calculated using numerical methods, and
its cdf is typically provided in tables in statistics (and
simulation) textbooks (see Table A.3 in text)
– We use the notation Z ~ N(0, 1) to mean that Z is a
random variable with a standard normal distribution
• Naturally, most normal distributions of interest will
not be the standard normal distribution
– However, Eqs 5.42 & 5.43 in the text relate any
normal distribution to the standard normal distribution
in the following way
104. 104
Normal Distribution
– Given an arbitrary normal distribution, X ~ N(, 2),
let Z = (X – )/
– Through Eq. 5.43, we know that
> where (z) is the cumulative density function for the
standard normal distribution
– Thus we can use the tabulated values of the standard
normal distribution to determine probabilities for
arbitrary normal distributions
– Ex:Student GPA's are approximately normally
distributed with =2.4 and =0.8. What fraction of
students will possess a GPA in excess of 3.0?
x
x
F )
(
Example is from From Mathematical Statistics with Applications,
Second Edition by Mendenhall, Scheaffer and Wackerly
105. 105
Normal Distribution
– Let Z = (X – 2.4)/0.8
– We want the area under the normal curve with mean
2.4 and standard deviation 0.8 where x > 3.0
This will be 1 – F(3.0)
F(3.0) = [(3.0 – 2.4)/0.8] = (0.75)
– Looking up (0.75) in Table A.3 we find 0.77337
– Recall that we want 1 – F(3.0), which gives us our
final answer of 1 – 0.77337 = 0.2266
• The idea in general is that we are moving from the
mean in units of standard deviations
– The relationship of the mean to standard deviation is
the same for all normal distributions, which is why we
can use the method indicated
106. Normal Distribution
Try another example:
• A class of 50 students have exam scores that are
normally distributed with μ= 70 and σ= 9.8
• The teacher grades on a "curve" with the following
policy for score X
X < μ-2σ F
μ- 2σ< X < μ-σ D
μ-σ < X < μ+σ C
μ+σ< X < μ+2σ B
μ+2σ< X A
> For simplicity, we will assume that no score will be
exactly μ± kσ for any k
– To the nearest integer, how many students will get B
grades?
– Discuss
106
107. 107
Other Distributions
There are a LOT of probability distributions
• More in the text that we did not discuss
• Many others not in the text
For simulation, the idea for using them is:
• How well does the distribution of choice model the
actual distribution of events / times that are relevant
to our model
• The more possibilities and variations, the more
closely we can model our actual behavior
• However, we need to be able to determine if a
distribution fits observed data
– We will look at this in Chapter 9
108. 108
Poisson Arrival Process
Before we finish Ch. 5, let's revisit the Poisson
Distribution
• In this case, indicates the mean value, or number
of arrivals (total)
– Does not factor in arrivals over time
– However, this can be done, and in this case we say
the arrivals follow a Poisson Process
> In this case we are counting the number of arrivals over
time
– However, some rules must be followed
x
i
i
x
i
e
x
F
otherwise
x
x
e
x
p
0 !
)
(
,
0
,
1
,
0
,
!
)
(
109. 109
Poisson Arrival Process
1) Arrivals occur one at a time
2) The number of arrivals in a given time period depends
only on the length of that period and not on the starting
point
– i.e. the rate does not change over time
3) The number of arrivals in a given time period does not
affect the number of arrivals in a subsequent period
– i.e. the number of arrivals in given periods are
independent of each other
– Discuss if these are realistic for actual "arrivals"
• We can alter the Poisson distribution to include time
– Only difference is that t is substituted for
otherwise
n
n
t
e
n
t
N
P
n
t
,
0
,
1
,
0
,
!
)
(
]
)
(
[
110. 110
Poisson Arrival Process
• Just like the Poisson Distribution, V = E = = t
• In fact, if you look at the example from slide 84, we
are in fact using the Poisson Arrival Process there
The Poisson Arrival Process has some nice
properties
• The three required from the previous slide (obviously)
– These imply that the arrivals follow an exponential
distribution
• Random Splitting
– Consider a Poisson Process N(t) with rate t
– Assume that arrivals can be divided into two groups, A
and B with probability p and (1-p), respectively
> Show on board
111. 111
Poisson Arrival Process
– In this case N(t) = NA(t) + NB(t)
– NA is a Poisson Process with rate p and NB is a
Poisson Process with rate (1-p)
– Splitting can be used in situations where arrivals are
subdivided to different queues in some way
> Ex: At immigration US citizens vs. non-US citizens
• Pooled Process
– Consider two Poisson Processes N1(t) and N2(t), with
rates 1 and 2
– The sum of the two, N1,2(t) is also a Poisson Process
with rate 1 + 2
– Pooling can be used in situations where multiple
arrival processes feed a single queue
> Ex: Cars arrive in New York City from many bridges and
tunnels, each at a different rate
112. 112
Poisson Arrival Process
• Ex: Exercise 5.28 in text:
– An average of 30 customers per hour arrive at the
Sticky Donut Ship in accordance with a Poisson
process. What is the probability that more than 5
minutes will elapse before both of the next two
customers walk through the door?
> As usual, the first thing is to identify what it is that we
are trying to solve.
> Discuss (and see Notes)
> Note: We could also model this as Erlang Discuss
– [I added this part] If (on average) 75% of Sticky
Donut Shop's customers get their orders to go, what
is the probability that 3 or more new customers will sit
in the dining room in the next 10 minutes?
> Discuss
113. 113
Brief Intro. to Monte Carlo Simulation
• We discussed previously that our simulations
will typically follow the dynamic model
Progress over time
• Stochastic simulations using the static model
are often called Monte Carlo Simulations
Idea is to determine some quantity / value using
random numbers that could be very difficult to
do by other means
• Ex: Evaluating an integral that has no closed analytical
form
114. 114
Brief Intro. to Monte Carlo Simulation
• Before any formal definitions, let's consider
a simple example
Let's assume we don't know the formula for the
area of a circle, but we do know the formula for
the area of a square
We'd like to somehow find the area of a circle
of a given radius (let's say 1)
2
115. 115
Brief Intro. to Monte Carlo Simulation
Let's generate a (large) number of random
points known to be within the square
• We then test to see if each point is also within the
circle
– Since we know the circle has a radius of 1, we can put
its center at the origin and any random point a
distance <= 1 from the origin is within the circle
• The ratio of points in the circle to total points
generated should approximate the ratio of the area
of circle to the area of the square
• We can then calculate the area of the circle by
multiplying the area of the square by that ratio
• See Circle.java
116. 116
Brief Intro. to Monte Carlo Simulation
• Some informal theory behind M.C.
Empirical probability
• Consider a random experiment with possible
outcome C
• Run the experiment N times, counting the number of
C outcomes, NC
• The relative frequency of occurrence of C is the ratio
NC/N
• As N , NC/N converges to the probability of C, or
N
N
C
p C
N
lim
)
(
117. 117
Brief Intro. to Monte Carlo Simulation
Axiomatic probability
• Set theoretic approach that determines probabilities
of events based on the number of ways they can
occur out of the total number of possible outcomes
• Gives the "true" probability of a given event,
whereas empirical probability only gives an estimate
(since we cannot actually have N be infinity)
• However, for complex situations this could be quite
difficult to do
When axiomatic probability is not practical,
empirical probability (via Monte Carlo sims) can
often be a good substitute
• Can also be used to verify axiomatic results
118. 118
Let's Make a Deal
• Ex: Famous Let's Make a Deal problem
Player is given choice of 3 curtains
• One has a grand prize
• Other two are duds
After player chooses a curtain, Monty shows one
of the other two, which has a dud
• Now player has option to keep the same curtain or to
switch to the remaining curtain
What should player do?
At first thought, it seems like it should not matter
• However, it DOES matter – player should always switch
119. 119
Let's Make a Deal
We can look at this axiomatically
• Initially there is a 1/3 probability that the player's
choice is correct and 2/3 that it is incorrect
• Revealing an incorrect curtain does not change that
probability, so if the user does not switch his/her
chance of winning is still 1/3
• However, now what do we know?
– There is a 2/3 chance that the winning curtain is NOT
the one originally picked
– Of that 2/3, there is a 0 chance that it is the curtain
already revealed
– Therefore, there is a 2/3 chance that the remaining
curtain is the winner, so we should switch to it
120. 120
Let's Make a Deal
In case we are still skeptical, we can verify this
result with a Monte Carlo Simulation
• See MontyMonte.java
Note that the larger our number of trials, the
better our result agrees with the axiomatic
result
121. 121
Monte Carlo Integration
• Let's apply this idea to another common
problem – evaluating an integral
Many integrals have no closed form and can
also be very difficult to evaluate with
"traditional" numerical methods
How can we utilize Monte Carlo simulation to
evaluate these?
Let's look at this in a somewhat simplified way
(i.e. we will be light on the theory)
122. 122
Monte Carlo Integration
Consider function f(x) that is defined and
continuous on the range [a,b]
• The first mean value theorem for integral calculus
states that there exists some number c, with a < c <
b such that:
– The idea is that there is some point within the range
(a,b) that is the "average" height of the curve
– So the area of the rectangle with length (b-a) and
height f(c) is the same as the area under the curve
b
a
b
a
c
f
a
b
dx
x
f
or
c
f
dx
x
f
a
b
)
(
)
(
)
(
)
(
)
(
1
123. 123
Monte Carlo Integration
So now all we have to do is determine f(c) and
we can evaluate the integral
We can estimate f(c) using Monte Carlo
methods
• Choose N random values x1, … , xN in [a,b]
• Calculate the average (or expected) value, ḟ(x) in
that range:
• Now we can estimate the integral value as
)
(
)
(
1
)
(
1
c
f
x
f
N
x
f
N
i
i
b
a
x
f
a
b
dx
x
f )
(
)
(
)
(
124. 124
Monte Carlo Integration
There is some error in this, but as N the
error approaches 0
• It is inversely proportional to the square root of N
• Thus we may need a fairly large N to get satisfactory
results
Let's look at a few simple examples
• In practice, these would be solved either analytically
or through other numerical methods
• Monte Carlo methods are most useful for multiple
integrals that are not analytically solvable
• See Monte.java
125. 125
Simulated Annealing
• Simulated Annealing
Yet another interesting use of Monte Carlo
Simulation
• See: http://en.wikipedia.org/wiki/Simulated_annealing
Idea:
• Mimic physical annealing processes used in materials
science
– What is annealing?
– See: http://en.wikipedia.org/wiki/Annealing_%28metallurgy%29
• Goal is to obtain a global optimum for some problem
by randomly changing candidate solutions to
"neighbor" solutions
126. 126
Simulated Annealing
• From a given solution pick a random "neighbor"
solution
– If that solution is "better", keep it
– If that solution is "worse", keep it with some
probability
> This probability depends on several factors, including
the "temperature" of the system
– Over time gradually decrease the temperature
• The possibility of choosing a "worse" solution allows
the system to "back out" of a local optimum, keeping
it alive to get to a better solution
This is very cool!
127. 127
Simulated Annealing
As an example consider a famous NP-Complete
problem – Traveling Salesman Problem (TSP)
• Given a completely connected graph with weighted
edges, what is the shortest cycle that visits each vertex
exactly one time
– i.e. what is the shortest route that a traveling salesman
can take to see customers in all cities
• Deterministically this is very difficult to solve
– No algorithm has been developed with less than
exponential run-time
• Can we perhaps get better results using SA?
128. 128
Simulated Annealing
• TSP via Simulated Annealing
Idea:
• A solution for TSP is simply a permutation of the
vertices
• At each iteration in the annealing process, "mutate"
the current solution by either reversing a few cities in
the cycle or cutting a few out and pasting them
elsewhere
– To keep the new solution as a "neighbor" of the
original solution only a small fraction of the nodes in
the solution can be changed
– If the new solution is shorter, keep it
– If the new solution is longer, keep it with a small
probability
129. 129
Simulated Annealing
• Specifically, the probability is
– Where –deltaLength is the negative difference in the path
lengths of the old and new solutions
• Note two important trends from this formula
– As deltaLength increases, probability of acceptance decreases
> We don’t want to take a solution whose length is MUCH worse
than the current one
– As temperature decreases, probability of acceptance decreases
> Initially (high temp) we want these to be more likely
> As the system cools (i.e. approaches a more stable state) we
want these to be less likely
• For more info see:
– http://www.codeproject.com/KB/recipes/simulatedAnnealingTSP.aspx
– http://www.svengato.com/salesman.html
e
temperatur
h
deltaLengt
e
130. 130
Simple Queueing Theory
• Many simulations involve use of one or
more queues
People waiting in line to be served
Jobs in a process or print queue
Cars at a toll booth
Orders to be shipped from a company
• Queueing Theory can get quite complex
We are interested in a few of the more
important results / guidelines
131. 131
Simple Queueing Theory
• First, we should define queue characteristics
in a consistent way
Standard Queueing Notation:
A/B/c/N/K
• where A is the interarrival time distribution
• where B is the service-time distribution
– A and B can follow any distribution (ex: the ones we
discussed in Chapter 5):
> D Deterministic
Distribution is not random (ex: real data that has
been measured / calculated)
132. 132
Simple Queueing Theory
> M Exponential (Markov)
This is probably the most common and most
studied of the random distributions – more on
this below
> Ek Erlang of order k
> G General
So why is an exponential distribution called
Markov?
• Relates to Markov processes (continuous time) and
Markov chains (discrete time)
• Let's consider a Markov chain for now (it is easier to
conceptualize)
133. 133
Simple Queueing Theory
• A set of random variables (or states) X1, X2, … forms
a Markov Chain if the probability that we transition
from state Xi to state Xi+1 does not depend on any of
the previous states X1, … Xi-1
– In other words, the past history of the chain does not
affect its future
– The idea is the same for a continuous time Markov
process
• Let's consider now a random variable Y that
describes how long a system will be in one state
before transitioning to a different state
– For example, in a queueing system, this could model
how long before another arrival into the system
(which changes the system state)
134. 134
Simple Queueing Theory
– This time should not depend on how long the process
has been in the current state
> i.e. it must be memoryless
– As we discussed in Chapter 5, the exponential
distribution is the only continuous distribution that is
memoryless
– Thus, when arrivals or services times are
exponentially distributed, they are often called
Markovian
– We will touch on a bit more of this theory later
– Now back to our description of the terminology…
A/B/c/N/K
135. 135
Simple Queueing Theory
• where c is the number of parallel servers
– A single queue could feed into multiple servers
• where N is the system capacity
– Could be "infinite" if the queue can grow arbitrarily
large (or at least larger than is ever necessary)
> Ex: a queue to go up the Eiffel Tower
– Space could be limited in the system
> Ex: a bank or any building
> This can affect the effective arrival rate, since some
arrivals may have to be discarded
• where K is the size of the calling population
– How large is the pool of customers for the system?
– It could be some relatively small, fixed size
> Ex: The computers in a lab that may require service
136. 136
Simple Queueing Theory
– It could be very large (effectively infinite)
> Ex: The cars coming upon a toll booth
– The size of the calling population has an important
effect on the arrival rate
> If the calling population is infinite, customers that are
removed from the population and enter the queueing
system do not affect the arrival rate of future customers
(-1 = )
> If the calling population is finite, removal of customers
from the population (and putting them into and later
out of the system) must affect future arrival rates
Ex: In a computer lab with 10 computers, each has a
10% chance of going down in a given day. If a
computer goes down, the repair takes a mean of 2
days, exponentially distributed
137. 137
Simple Queueing Theory
In the first day, the expected number of failures is 10*0.1=1
However, once a failure occurs, the faulty computer is out of
the calling population, so the expected number of failures in
the next day is 9 * 0.1 = 0.9
Clearly this changes again if another computer fails
• What about the system you are testing in Programming
Assignment 1?
– We cannot actually classify it with this notation, due to the
single initial queue branching into multiple queues for the
toll booths
– However, if we consider only the toll booth queues, we
could make each queue M/M/1/10/
> However, since a single queue buffers all of the individual
queues, the arrival rate into the queues will change if traffic
backs up onto the single queue
138. 138
Long-Run Measures of Performance
• What are some important queueing
measurements?
L = long-run average number of customers in
the system
LQ = long-run average number in queue
w = long-run average time spent in system
wq = long-run average time spent in queue
= server utilization (fraction of time server is
busy)
139. 139
Time-Average Number in System
• Let's discuss these in more detail
Time-Average Number in the System, L
• Given a queueing system operating for some period
of time, T, we'd like to determine the time-weighted
average number of people in the system,
• We'd also like to know the time-weighted average
number of people in the queue,
– Note that for a single queue with a single server, if
the system is always busy,
> However, it is not the case when the server is idle part
of the time
– The "hats" indicate that the values are "estimators"
rather than analytically derived long-term values
L
Q
L
1
L
LQ
140. 140
Time-Average Number in System
• We can calculate for an interval [0,T] in a fairly
straightforward manner using a sum:
– Note that each Ti here represents the total time that
the system contained exactly i customers
> These may not be contiguous
– i is shown going to infinity, but in reality most queuing
systems (especially stable queuing systems) will have
all Ti = 0 for i > some value
> In other words, there is some maximum number in the
system that is never exceeded
– See GrocerySimB.java
L
0
1
i
i
T
i
T
L
141. 141
Time-Average Number in System
Let's think of this value in another way:
• Consider the number of customers in the system at
any time, t
– L(t) = number of customers in system at time t
• This value changes as customers enter and leave the
system
• We can graph this with t as the x-axis and L(t) as the
y-axis
• Consider now the area under this plot from [0, T]
– It represents the sum of all of the customers in the
system over all times from [0, T], which can be
determined with an integral
T
dt
t
L
Area
0
)
(
142. 142
Time-Average Number in System
• Now to get the time-average we just divide by T, or
– See on board
• For many stable systems, as T (or, practically
speaking, as T gets very large) approaches L, the
long-run time-average number of customers in the
system
– However, initial conditions may determine how large T
must be before the long-run average is reached
The same principles can be applied to , the
time-average number in the queue, and LQ, the
long-run time average number in the queue
T
i
i dt
t
L
T
T
i
T
L
0
0
)
(
1
1
Q
L
L
143. 143
Average Time in System Per Customer
Average Time in System Per Customer, w
• This is also a straightforward calculation during our
simulations
– where N is the number of arrivals in the period [0,T]
– where each Wi is the time customer i spends in the
system during the period [0,T]
• If the system is stable, as N , ŵ w
– w is the long-run average system time
• We can do similar calculations for the queue alone to
get the values ŵQ and wQ
– We can think of these values as the observed delay
and the long-run average delay per customer
N
i
i
W
N
w
1
1
144. 144
Arrival Rates, Service Rates and Stability
We have seen a few times now
• "for stable systems…"
• What does this mean and what are the implications?
For simple queueing systems such as those we
have been examining, stability can be
determined in a fairly easy way
• The arrival rate, , must be less than the service rate
– i.e. customers must arrive with less frequency than
they can be served
• Consider a simple single queue system with a single
server (G/G/1//)
– Define the service rate to be
– This system is stable if <
145. 145
Arrival Rates, Service Rates and Stability
– If > , then, over a period of time there is a net
rate of increase in the system of –
> This will lead to increase in the number in the system
(L(t)) without bound as t increases
– Note if == some systems (ex: deterministic) may
be stable while others may not be
• If we have a system with multiple servers (ex:
G/G/c//) then it will be stable if the net service
rate of all servers together is greater than the arrival
rate
– If all servers have the same rate , then the system is
stable if < c
• See GrocerySimB.java
146. 146
Arrival Rates, Service Rates and Stability
• Note that if our system capacity or calling population
(or both) are fixed, our system can be stable even if
the arrival rate exceeds the service rate
– Ex: G/G/c/k/
> With a fixed system capacity, in a sense the system is
unstable until it "fills" up to k. At this point, excess
arrivals are not allowed into the system, so it is stable
from that point on
> The idea here is that the net arrival rate decreases once
the system has filled
– Ex: G/G/c//k
> With a fixed calling population, we are in effect
restricting the arrival rate
> As arrivals occur, the arrival rate decreases and the
system again becomes stable
147. 147
Conservation Law
An important law in queueing theory states
L = w
• where L is the long-run number in the system, is
the arrival rate and w is the long-run time in the
system
– Discuss intuitively what this means
• Often called "Little's Equation"
– http://en.wikipedia.org/wiki/Little's_theorem
• This holds for most queueing systems
• Text shows the derivation
– Read it but we will not cover it in detail here
148. 148
Server Utilization
Server Utilization
• What fraction of the time is the server busy?
• Clearly it is related to the other measures we have
discussed (we will see the relationship shortly)
• As with our other measures we can calculate this for
a given system (G/G/1//)
– We assume that if at least 1 customer is in the
system, the server will be busy (which is why we start
at T1 rather than T0)
• However, we can also calculate the server utilization
based on the arrival and service rates
1
1
i
i
T
T
149. 149
Server Utilization
G/G/1// systems
• Consider again the arrival rate and the service rate
• Consider only the server (without the queue)
• With a single server, it can be either busy or idle
• If it is busy, there is 1 customer in the "server
system", otherwise there are 0 customers in the
"server system" (excluding the queue)
– Thus we can define Ls = = average number of
customers in the "server system"
• Using the conservation equation this gives us
– Ls = sws
> where s is the rate of customers coming into the server
and ws is the average time spent in the server
150. 150
Server Utilization
– For the system to be stable, s = , since we cannot
serve faster than customers arrive and if we serve
more slowly the line will grow indefinitely
– The average time spent in the server, ws, is simply
1/ (i.e. 1/(rate of the server))
– Putting these together gives us
– = Ls = sws = (1/) = /
– Note that this indicates that a stable queueing system
must have a server utilization of less than 1
> The closer we get to one, the less idle time for the
server, but the longer the lines will be (probably)
> Actual line length depends a lot not just on the rates,
but also on the variance – we will discuss shortly
– See GrocerySimB.java
151. 151
Server Utilization
G/G/c// systems
• Applying the same techniques we did for the single
server, recalling that for a stable system with c
servers:
< c
• we end up with the the final result
= /c
152. 152
Markov Systems in Steady-State
• Consider stable M/G/c queueing systems
• Exponential interarrival times
• Arbitrary service times
• 1 or more servers
As long as they are kept relatively simple, we
can calculate some of the long-run steady state
performance measures for these analytically
May enable us to avoid a simulation for simple
systems
May give us a good starting point even if the
actual system is more complicated
153. 153
Markov Systems in Steady-State
First, what do we mean by steady-state?
• The probability that the system is in a particular
state is not time-dependent
• For example, consider a stable queueing system S
– What is the probability that fewer than 10 people are
in the queue?
– If we start with an empty queue, this probability is
initially 1
– However, as the system runs over time, the length of
the queue will approach its long-run average length LQ
> It no longer depends on the initial state
> The actual length will still vary, but only due to
variations in the arrival and service times
154. 154
Markov Systems in Steady-State
Let's look at it another way
• Consider a stable M/G/1 queueing system with a
given long-run average customer delay wQ
• We start the system with a certain number of
customers, s, in the queue
• For each customer processed, we re-calculate the
mean delay up to that point
– In other words (ŵQ)j = average customer delay up to
and including customer j
• As j increases, (ŵQ)j wQ , despite the different
starting values of s
– When this occurs the steady-state has been reached
• See graph from Law text on board
155. 155
Markov Systems in Steady-State
We will not concentrate on the derivation of the
formulas, but it is useful to know them and
how to use them
M/G/1 Queue (see p. 248 of text for complete
list)
)
1
(
2
)
1
(
)
1
(
2
)
1
(
2
2
2
2
2
2
Q
L
L
Note that is as we previously
defined it (and is equal to the
average number in the server)
Note that 2 is the variance in
the service times
Note that (intuitively) the
long-run queue length is equal
to the long-run number in the
system minus the average
number in the server
156. 156
Markov Systems in Steady-State
Let's look at these a bit more closely
• Consider first if 2 = 0
– i.e. the service times are all the same (= mean)
> For example a deterministic distribution
– In this case the equations for L and LQ greatly
simplified to
– In this case LQ is dependent solely upon the server
utilization,
– Note as 0 (low server utilization) LQ 0
– Note as 1 (high server utilization) LQ
)
1
(
2
)
1
(
2
)
0
1
( 2
2
2
2
Q
L
157. 157
Markov Systems in Steady-State
• However, as 2 increases with a fixed utilization, LQ also
increases
– Other measures such as w and wQ increase with 2 as well
– This indicates that all other factors being equal, a system
with a lower variance will tend to have better performance
> In fact (See Ex. 6.9) in some cases a lower will give shorter
lines than a higher , if it also has a lower 2
Able: 1/ = 24min, 2 = 400min2
Baker: 1/ = 25min, 2 = 4min2
Able: LQ = 2.711, P0 = 0.20
Baker: LQ = 2.097, P0 = 0.167
> Note that Able has a longer long-run queue length despite
his faster rate
> However, he also has a higher P0, indicating that more
people experience no delay
158. 158
Markov Systems in Steady-State
We can generalize this idea, comparing various
distributions using the coefficient of variation,
cv:
(cv)2 = V(X)/[E(X)]2
– i.e. it is the ratio of the variance to the square of the
expected value
• Distributions that have a larger cv have a larger LQ
for a given server utilization, ρ
– In other words, their LQ values increase more quickly
as ρincreases
– Ex: Consider an exponential service distribution: V(X)
= 1/μ2 and E(X) = 1/μ, so (cv)2 = (1/μ)2/[(1/μ)]2 = 1
> See chart from 4th Edition of text
159. 159
Markov Systems in Steady-State
Consider again the special case of an
exponential service distribution (M/M/1), then
• The mean service time is 1/ and the variance on
the service time is 1/2
This simplifies our equations to
)
1
(
)
1
(
2
)
1
1
(
)
1
(
)
1
(
)
1
(
)
1
(
)
1
(
2
)
1
1
(
2
2
2
2
2
2
2
2
Q
L
L
160. 160
Markov Systems in Steady-State
• The M/M/1 queue gives us a closed form expression
for a measure that the M/G/1 queue does not:
– Pi , the long-run probability that there will be exactly i
customers in the system
– Note: We can use this value to show L for this system
> The last equality is based on the solution to an infinite
geometric series when the base is < 1
> We know ρ < 1 since system must be stable
n
n
P
)
1
(
)
1
1
(
)
1
(
3
3
2
2
0
]
1
[
3
]
1
[
2
]
1
[
1
]
1
[
0
3
2
4
3
3
2
2
3
2
1
0
0
i
i
P
i
L
161. 161
Markov Systems in Steady-State
The text has formulas for a number of different
queue possibilities:
• M/G/1 already discussed
• M/M/1 already discussed
• M/M/c Exponential arrivals and service with
multiple servers
• M/G/ "Infinite" servers (or service capacity >>
arrivals, or "how many servers are needed"?)
• M/M/c/N/ Limited capacity system
• M/M/c/K/K Finite population
Let's look at an example where these formulas
would be used
162. 162
Markov Systems in Steady-State
Exercise 6.6
– Patients arrive for a physical exam according to a
Poisson process at the rate of 1/hr
– The physical exam requires 3 stages, each one
independently and exponentially distributed with a
service time of 15min
– A patient must go through all 3 stages before the next
patient is admitted to the facility
– Determine the average number of delayed patients,
LQ, for this system
– Hint: The variance of the sum of independent random
variables is the sum of the variance
> Discuss and develop the solution
163. 163
Markov Systems in Steady-State
Exercise 6.23
– Copy shop has a self-service copier
– Room in the shop for only 4 people (including the
person using the machine)
> Others must line up outside the shop
> This is not desirable to the owners
– Customers arrive at rate of 24/hr
– Average use time is 2 minutes
– What impact will adding another copier have on this
system
• How to solve this?
– Determine the systems in question
– Calculate for each system the probability that 5 or
more people will be in the system at steady-state
164. 164
Markov Systems in Steady State
Exercise 6.21:
– A large consumer shopping mall is to be constructed
– During busy periods the expected rate of car arrivals
is 1000 per hour
– Customers spend 3 hrs on average shopping
– Designers would like there to be enough spots in the
lot 99.9% of the time
– How many spaces should they have?
• We will model this as an M/G/ queue, where the
spaces are the servers
– We want to know how many servers, c, are necessary
so that the probability of having c+1 or more
customers in the system is < 0.001
165. 165
Markov Systems in Steady State
Text also has some examples showing the
M/M/c/N Queue and the M/M/c/K/K Queue
• Idea is similar but the formulas and applications are
different
166. 166
Markov Systems in Steady State
• Before we finish with this topic…
Let's just talk a LITTLE bit about how these
formulas are derived (being light on theory)
Let's look at the simplest case, an M/M/1 queue
with arrival rate and service rate
Consider the state of the system to be the
number of customers in the system
We can then form a state-transition diagram for
this system
• A transition from state k to state k+1 occurs with
probability
• A transition from state k+1 to state k occurs with
probability
167. 167
Markov Systems in Steady State
From this we can obtain
We also know that
• Since the sum of the probabilities in the distribution
must equal 1
• This will allow us to solve for P0
0 1 2 k
k-1 k+1
…
)
1
.
138
.
(
0
1
0
0 Eq
P
P
P
k
k
i
k
0
1
k
k
P
168. 168
Markov Systems in Steady State
1
0
1
0
0
0
0
1
1
1
1
1
1
k
k
k
k
k
k
k
k
P
P
P
P
•All that is needed for this derivation is
fairly simple algebra
•Before completing the derivation, we
must note an important requirement: to
be stable, the system utilization
/ < 1
must be true
•This allows the series to converge
See:
http://mathworld.wolfram.com/GeometricSeries.html
169. 169
Markov Systems in Steady State
• Which is the solution for P0 from the M/G/1 Queue in
Table 6.3
Utilizing Eq. 138.1, we can substitute back to get
• Which is the formula indicated in Table 6.4
• The other values can also be derived in a similar
manner
1
1
)
/
(
1
1
1
)
/
(
1
/
)
/
(
1
)
/
(
1
1
)
/
(
1
/
1
1
0
0
P
P
k
k
k
k P
P
)
1
(
1
0
170. 170
Markov Systems in Steady State
A lot of theory has been left out of this process
If you want to read more about queueing theory,
there are a number of books on the topic:
• Queueing Systems Volume 1: Theory by L. Kleinrock
(Wiley and Sons, 1975)
– Old but still relevant and still available – well known text
on the topic
• Many newer textbooks as well – Go to Amazon.com
select "Books" and search "Queueing Theory"
One final note: spelling!
• Both "Queueing Theory" and "Queuing Theory" seem to
be acceptable (and both spellings are in dictionary)
– Search both to cover all options
171. 171
Random Numbers
• Stochastic simulations require random data
If we want true random data, we CANNOT use an
algorithm to derive it
• We must obtain it from some process that itself has
random behavior
– http://en.wikipedia.org/wiki/Hardware_random_number_generator
• Ex: thermal noise
– http://noosphere.princeton.edu/reg.html
• Ex: atmospheric noise
– http://www.random.org/randomness/
• Ex: radioactive decay
– http://www.fourmilab.ch/hotbits/
– http://www.enginova.com/radioactive_random_number_gen
era.htm
172. 172
Pseudo-Random Numbers
However, these require either purchase of
hardware or a service
For stochastic simulations we also need very
large amounts of random data quickly
• Possibly more than can be obtained from a true
random source in a reasonable amount of time
Often in testing and running our simulations, we
also want to reuse the same "random" values
• If we are debugging it helps to be able to see the
execution repeatedly with the same data
• If we are comparing systems, we may want to use the
same data on both systems
173. 173
Pseudo-Random Numbers
• Thus, more often than not, simulations rely
on pseudo-random numbers
These numbers are generated deterministically
(i.e. can be reproduced)
However, they have many (most, we hope) of
the properties of true random numbers:
• Numbers are distributed uniformly on [0,1]
– Assuming a generator from [0,1), which is the most
common
• Numbers should show no correlation with each other
– Must appear to be independent
– There are no discernable patterns in the numbers
174. 174
Pseudo-Random Numbers
Let's look at a decidedly non-random
distribution in the range [0,1)
• Ex: Assume 100 numbers, X1…X100 are generated
X1 = 0.00;
Xi = Xi-1 + 0.01
– This will generate the sequence of numbers
0.00, 0.01, 0.02, 0.03, … , 0.99
– Clearly these numbers are uniformly distributed
throughout the range of values
– However, also as clearly they are not independent and
thus would not be considered to be "random"
– Often, uniformity and independence are not as
obvious, and must be determined mathematically
> We will discuss how to test for each shortly
175. 175
Linear Congruential Generators
• Linear Congruential Generators
These are perhaps the best known pseudo-
random number generators
• Easy and fairly efficient
– Depending upon parameter choices
• Simple to reproduce sequences
• Give good results (for the most part, and when used
properly)
– Again depending upon parameter choices
Based on mathematical operations of
multiplication and modulus
176. 176
Linear Congruential Generators
Standard Equation:
X0 = seed value
Xi+1 = (aXi + c) mod m for i = 1, 2, …
– where
a = multiplier
c = increment
m = modulus
• For c == 0 it is called a multiplicative congruential
generator
• For c != 0 it is called a mixed congruential generator
– Both can achieve good results
• Initially proposed by Lehmer and studied extensively
by Knuth
– See references at end of the chapter
177. 177
Linear Congruential Generators
Note that the generator as shown will produce
integers
If we want numbers in the range [0,1) we will
have to convert the integer to a float in some
way
• This can be done in a fairly straightforward way by
dividing by m
• However, sophisticated generators can do the
conversion in a better (more efficient) way
• Clearly, however, the more possible integers, the
denser the values will be in the range [0,1)
178. 178
Linear Congruential Generators
For now, consider three properties
• The density of the distribution
– How many different values in the range can be
generated?
• The period of the generator
– How many numbers will be generated before the
generator cycles?
> Since the values are deterministic this will inevitably
happen
> Clearly, a large period is desirable, especially if a lot of
numbers will be needed
> A large period also implies a denser distribution
• The ease of calculation
– We'd like the numbers to be generated reasonably
quickly, with few complex operations
179. 179
Linear Congruential Generators
Consider a multiplicative linear congruential
generator (i.e. c == 0)
Xi+1 = (aXi) mod m
• If m is prime, this will produce a maximum period of
(m–1) if ak – 1 is not divisible by m for all k < (m–1)
– The period and density here are good, but with m as a
prime, the mod calculation is somewhat time-consuming
• If m = 2b for some b, this will produce a maximum
period of 2b-2 if X0 is odd and either a = 3 + 8k or a = 5
+ 8k for some k
– This allows more efficient calculation of the numbers, but
gives a smaller period
– The importance of a good seed also makes this less
practical (programmer must understand algorithm and
seed requirements to use generator effectively)
180. 180
Linear Congruential Generators
Consider a mixed linear congruential generator
(i.e. c != 0)
Xi+1 = (aXi + c) mod m
• If m = 2b for some b, this will produce a maximum
period of 2b if c and m are relatively prime and a =
(1+4k) for some k
• This allows a good period, an easy mod calculation
with the relatively small overhead of an addition
• This is the generator used by Java in the JDK
• Let's take a look at that in more detail
• See JDKRandom.java
• See TestRandom.java
181. 181
Quality of Linear Congruential Generators
The previous criteria for m, a and c can
guarantee a full period of m (or m-1 for
multiplicative congruential generators)
• However, this does not guarantee that the generator
will be good
• We must also check for uniformity and independence
in the values generated
• General criteria for good values of m, a and c are still
unsure
• If you are creating a new LCG, a good rule of thumb
is:
– Choose m, a and c to guarantee a good period
– Test the resulting generator for uniformity and
independence
182. 182
Chi Square Testing for Uniformity
• Idea:
Consider discrete random variable X, which has
possible values x1, x2, …, xk and probabilities
p1, p2, …, pk (which sum to 1)
Assume n random values for X are chosen
Then the expected number of times each value
will occur Ei = npi
Now assume n random values of distribution Y
thought to be the same as X are chosen
• How can we tell if distribution Y is the same as X?
• We can at least get a good idea if the occurrences
more or less match those of the Ei