Group 4 report
Trever Hallock Tamlyn Harley Matthew Stanley
Cody Zhang
December 11, 2014
Abstract
We developed a MATLAB model that took in a city object and a
solution matrix to output the efficiency of the solution based on multi-
ple metrics. Also, we worked with the User Interface Team to translate
between their Excel-based output and our MATLAB model. Finally,
we worked to find and simulate subtler solution characteristics
than those explored by the other teams. We produced a software
package that was able to meet our goals and simulate the models given
by the other teams involved.
Contents
1 Introduction
  1.1 Problem description
  1.2 Overview of our Role
  1.3 Goals
  1.4 The Value of the Simulator
2 MATLAB Model
  2.1 MATLAB Object Model
    2.1.1 City Representation
    2.1.2 Solution Representation
  2.2 Route Simulation
  2.3 City Generation
  2.4 City Translation
3 Model
  3.1 Model Assumptions
  3.2 Model Decisions
    3.2.1 Solutions Use Indices
  3.3 City Streets
    3.3.1 Modeling Language
    3.3.2 Simulator output
  3.4 Model Variables
    3.4.1 Parameters
    3.4.2 The City
  3.5 Other thoughts
  3.6 Decision Variables
  3.7 Objectives
    3.7.1 Number of Requests Serviced
    3.7.2 Time Taken to Service Requests
    3.7.3 Distance Covered
    3.7.4 Landfill Fees
    3.7.5 Remaining Inventories
  3.8 Constraints
    3.8.1 Driver Routes must not overlap
    3.8.2 Time Windows
    3.8.3 Sizes Match
    3.8.4 Operations Follow Each Other
    3.8.5 Constraints on Truck Types
    3.8.6 Staging Area Capacities Met
    3.8.7 Trucks end where they Start
  3.9 Example
    3.9.1 City
    3.9.2 Solution
    3.9.3 Objective Values
4 Simulator
  4.1 Performance
  4.2 Failed Attempts and Lessons Learned
    4.2.1 Expressing Waiting
    4.2.2 Matrix Dimension
    4.2.3 Distribution of City Parameters
  4.3 Simulator Correctness
5 Robustness
  5.1 Changing Request Locations
    5.1.1 Goal
    5.1.2 The Process
    5.1.3 Visual Robustness
    5.1.4 Results and Interpretation
    5.1.5 Other Thoughts
  5.2 Flexible Dumpster Sizes
  5.3 Changing Times
  5.4 The Solver
6 Conclusion
A City Statistics
1 Introduction
Simulation is an integral part of any mathematical modeling, espe-
cially when solving real-world problems. Without a standard process
to simulate solutions, those designing and implementing models would
be unable to verify their efficacy.
Our modeling is for Sam’s Hauling, who requires an efficient method
to schedule their drivers to service customer requests. Currently this
is done by hand, and our job was to assist other teams in creating
algorithms to accomplish this task by simulating their results. Thus,
the Simulation Team’s goals were to produce a reasonably efficient way
to test solutions given by the other teams who were developing math-
ematical models to solve Sam’s Hauling’s delivery scheduling. We met
our goals on time and created a software package that is not only robust
but is capable of simulating many different situations and occurrences.
1.1 Problem description
Sam’s Hauling Inc. provides a small to medium sized dumpster rental
service to customers in the Denver Metro Area. They first deliver these
dumpsters, or containers, to customers who fill them. Then Sam’s
Hauling must return to collect the full dumpsters and take them to
a dump for disposal. Therefore, drivers must constantly be delivering
new containers, picking up full containers, and dropping off trash at
dumps. There are also staging areas where drivers begin their day, and
where empty dumpsters are stored. These have capacities and initial
inventories that must also be considered. Currently, they schedule
their drivers by hand, and are looking for a new method to optimally
schedule their pickup and delivery routes.
1.2 Overview of our Role
We created three pieces of software, using MATLAB, for other groups
to use in testing their solutions:
• A city generator that creates random stops, dumps, and staging
areas with which others can experiment
• A simulation function that simulates how well a solution satis-
fies several of these requests, and outputs its efficiency based on
various metrics
• A translate function that takes data from the User Interface team
and transforms it into MATLAB
These three pieces of software have been used by the other teams in
testing their proposed solutions. Our goal was to provide an objective
means by which others can efficiently test their proposed solutions,
compare feasibility and optimality of said solutions, and give a good
estimate of the sensitivity of each solution to changes. We believe that
we have met our goals.
1.3 Goals
As a team, our objective was to create a software package that could
simulate routes through cities to output their efficiency using vari-
ous metrics. We did not set out with the goal of stating our opinion
on whether these metrics were “good” or “bad,” but instead simply
reported the data to those using the software and allowed them the
freedom to interpret the results independently.
We also had the objective of making our software user friendly;
part of this was to allow easy translation between Excel (what the
User Interface team ended up using) and MATLAB (our chosen plat-
form). This was a necessary goal in bridging the gap between the two
platforms; without it, the teams would have to translate the
information manually, which would hardly be “user friendly.” In the
same vein as user friendliness, we also set out with the goal of creating
a tutorial / manual for the class to use so that they could better un-
derstand the features and capabilities of our simulator. This tutorial
was completed and presented to the class as a whole, and was available
for download along with our simulator.
1.4 The Value of the Simulator
Our software simulates the existing vehicle routing problems, as well
as taking inventory constraints on the containers in each staging area
into account. This is a critical part of the solution because without
the simulator, proposed solutions would not be as rigorously tested.
For example, the simulator ensures that a solution satisfies inventory
constraints throughout the day, not just at the end of the day, which
is all that several other teams modeled.
A working simulator was integral in allowing the other teams to
test their proposed solutions and data to measure their efficiency and
accuracy; without the simulator, this would have had to be done by
hand.
Our plan was to create a working software package as early as
possible for every group to begin using. A working version of the
simulator was provided to the class on 30 October 2014, giving the class
more than a month to work with the software to test their solutions.
Along with this, we also presented our tutorial.
We also used the simulator to gain a deeper understanding of the
problem. We tested different assumptions made in the class, such as
constant drive times, and tested solution robustness to changes in requests.
Finally, we created an iterative solution method that attempts to solve
the entire problem.
2 MATLAB Model
There are many parts to the model that we have created, as detailed
below.
2.1 MATLAB Object Model
2.1.1 City Representation
Cities are objects that store multiple pieces of information. The city
stores all of the data about drive times, stops, customer requests, land-
fills, staging areas, trucks, etc. It is an entire set of data that makes
up the problem statement.
Actions A city contains actions, also called stops. There are mul-
tiple actions for any city. An action is simply something a driver can
do. An action has several main parts:
The Operation
This can be P (pickup), D (dropoff), R (replace), S (stage), U
(unstage), or E (empty).
The In-size
This is the size of the dumpster that is being brought to the
action or stop, as a numerical value: one of 0, 6, 9, 12, or 16.
The 0 represents no dumpster and the others represent the four
sizes of containers we are dealing with.
The Out-size
This is the same thing as in-size, but it is what size of dumpster
the driver is supposed to leave with.
The Start-time
This is the time, in seconds, when this action can start being
performed; for example, if the start-time is 10,000, this means
that 10,000 seconds must elapse during the simulation before
this action can be performed.
The Stop-time
This is the time, in seconds, when an action must be performed
by; together with start-time, these model time window constraints.
The Wait-time
This is the amount of time it takes to actually complete the action
(note that this is different from the travel time it takes to get to
the location, which is modeled in a different place).
The Location
This is where the action is actually located; it is an index into the
array of locations that we have, telling us which location has this
particular action. It is also an index into the matrix of distances
and durations between locations.
Example Different locations have different kinds of actions. For
example, staging areas have 8 actions each, as shown in this table:
STAGING AREA   Operation   In-size   Out-size
ACTION 1       STAGE       6         0
ACTION 2       STAGE       9         0
ACTION 3       STAGE       12        0
ACTION 4       STAGE       16        0
ACTION 5       UNSTAGE     0         6
ACTION 6       UNSTAGE     0         9
ACTION 7       UNSTAGE     0         12
ACTION 8       UNSTAGE     0         16
If you want to unstage (pickup) a size 12 dumpster, you must start
with an empty truck and leave with a size 12 dumpster. Thus, you
would use the seventh action, which means your in-size is 0 and your
out-size is 12. Having multiple actions is needed because only giving
a location is ambiguous when a truck travels between staging areas.
Similarly, landfills have four actions each, one for each kind of
dumpster you will be bringing there to empty. Customer requests
only have one action each, as they will always have a predetermined
in-size and out-size. Now that we have covered cities, let us look at
solutions.
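The action fields described above can be sketched as a MATLAB struct array. This is only an illustration: the field names, the location index, the time window, and the service time are assumptions made for this example, not the simulator's actual definitions. It builds the eight actions of a single staging area in the same order as the table.

```matlab
% Hypothetical sketch: the 8 actions of one staging area.
% Field names and the constant time values are illustrative assumptions.
sizes = [6 9 12 16];    % the four container sizes; 0 means "no dumpster"
yard_location = 3;      % assumed index into the location array
day_end = 86400;        % assumed time window: the whole day, in seconds
service_time = 120;     % assumed wait-time at the yard, in seconds

% Empty struct array with the fields of an action.
actions = struct('op', {}, 'in_size', {}, 'out_size', {}, ...
    'start_time', {}, 'stop_time', {}, 'wait_time', {}, 'location', {});

% STAGE: arrive carrying a dumpster of size s, leave with no dumpster.
for s = sizes
    actions(end+1) = struct('op', 'S', 'in_size', s, 'out_size', 0, ...
        'start_time', 0, 'stop_time', day_end, ...
        'wait_time', service_time, 'location', yard_location);
end

% UNSTAGE: arrive with no dumpster, leave carrying a dumpster of size s.
for s = sizes
    actions(end+1) = struct('op', 'U', 'in_size', 0, 'out_size', s, ...
        'start_time', 0, 'stop_time', day_end, ...
        'wait_time', service_time, 'location', yard_location);
end

% The seventh action is the one that unstages a size 12 dumpster.
seventh = actions(7);
fprintf('Action 7: %s, in-size %d, out-size %d\n', ...
    seventh.op, seventh.in_size, seventh.out_size);
```

Because every (operation, size) pair gets its own index, a single number in a solution is enough to say both where a driver goes and what the driver does there.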
2.1.2 Solution Representation
The user provides a solution represented by a matrix. Each row repre-
sents a driver’s route, and the entries of that row, read from left to
right, are the actions that driver will perform, in order. A negative one
means that the driver does nothing: he is at the end of his route. The
following is a solution matrix.

\[
\begin{pmatrix}
2 & 4 & 19 & 22 & -1 & -1 & -1 \\
10 & 17 & 19 & 44 & 11 & 13 & 5 \\
6 & 19 & -1 & -1 & -1 & -1 & -1
\end{pmatrix}
\]
Each row is a driver; thus, this city has three drivers. Driver one
performs action 2, then 4, then 19, then 22, and then he is done for
the day. Similarly, driver three performs actions 6 and 19, in that
order, then finishes his day. Driver two is the busiest, obviously, but
his actions are just as easily read.
As you can see, it was necessary to encode what kinds of dumpsters
the driver was dropping off, picking up, etc. with the in-size and
out-size, so that this matrix would be feasible. This encoding of the
solution into a matrix is why there are so many actions associated with
staging areas and landfills. As an example, imagine that stop 19 is a
staging area. Because of how the actions are structured, given any
city, we will know exactly what kind of action the driver is performing
at that staging area. Actions 18-25 may all be at that same staging
area, but they all mean different things.
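As a minimal sketch of how such a matrix can be read, the following MATLAB fragment walks over the example solution and strips the −1 padding from each row. The variable names are chosen for this illustration only and are not taken from the simulator.

```matlab
% Example solution matrix from the text: one row per driver,
% padded with -1 once a driver's route has ended.
sol = [ 2  4 19 22 -1 -1 -1;
       10 17 19 44 11 13  5;
        6 19 -1 -1 -1 -1 -1];

num_drivers = size(sol, 1);
route_lengths = zeros(num_drivers, 1);
for d = 1:num_drivers
    % Keep only the real action indices of driver d, in order.
    route = sol(d, sol(d, :) ~= -1);
    route_lengths(d) = numel(route);
    fprintf('Driver %d performs actions %s\n', d, mat2str(route));
end
```

Here `route_lengths` comes out as 4, 7, and 2, matching the three routes described above.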
2.2 Route Simulation
The function to simulate a given city has the following signature:
function [feasible, times, distances, number_serviced, fees, inventories]
= simulate(c, sol, v, checkall)
Parameters
c
The city object
sol
The solution matrix
v (Optional)
True if errors should be printed
checkall (Optional)
True if all constraints should be checked, even when one is vio-
lated
Return Values The simulate function takes in a city and a solution
matrix as arguments, and outputs these metrics:
feasible
False if the solution encountered an error that makes it a non-
viable solution; it returns true otherwise
times
A vector containing all of the times it took each driver to com-
plete his assigned route, based on the solution matrix given
distances
A vector containing all of the distances traveled by each driver
based on his assigned route
number_serviced
How many of your customer requests were actually completed
fees
How much in fees you accrued on this route, based on the costs
of landfills
inventories
How many dumpsters remain at each staging area at the end of
the day
Some solutions will contain errors that the simulator will recognize
as non-feasible solutions. It will return the variable feasible = false if
this is the case. Many things will cause it to return infeasible. For
example, if a driver visits a landfill followed by another landfill, this
makes no sense and it will return infeasible. Or, if a driver visits a
staging area to get a size 6 dumpster, but that staging area has none
in inventory, this will also return infeasible. If you want a detailed list
of all the constraints refer to the MATLAB/src directory, or to the
model in the next section.
Unless instructed otherwise, the simulator will continue to run to
the best of its ability when dealing with a non-viable solution. The
errors it finds will be displayed in the main window of MATLAB.
Our suite of functions also comes with a generate_rand_solution
function that takes a city as an argument. However, beware, this
random solution is rarely viable.
2.3 City Generation
function [c] = generate_city(R, L, Y, D)
The function generate_city(R, L, Y, D) generates a city at random,
based on these arguments:
Parameters
R The number of customer requests to be generated
L The number of landfills to be generated
Y The number of staging areas, or yards, to be generated
D The number of drivers that work in that city
Return Value
c A randomly generated city
This function creates a random set of customer requests, landfills,
staging areas, trucks, etc. for us to run a simulation on. Everything is
random, so that you can create a diverse set of cities to run simulations
through.
2.4 City Translation
function [c] = translate(dirname)
This function takes the output from the user interface team (who
are working with Excel) and translates it into our city structure in
MATLAB. It also finds the coordinates of the addresses, to plot the
city.
For example, sample data given by the UI team on Canvas is also in
our repository under /MATLAB/test/example_ui_data. If you want
a city based on this data, you can simply enter the MATLAB com-
mand c = translate('test/example_ui_data'); The resulting city will
represent the data found in that directory. Translate expects to see
five files in the directory as given: output1.txt, output2.txt,
output3.txt, output4.txt, and output5.txt.
Because this uses a Google API to find a location from the address
the UI team gives, you must be connected to the internet to run this.
3 Model
The MATLAB code can also be expressed mathematically. We will
describe the mathematical model that this simulator attempts to sim-
ulate. This includes detailing some of the assumptions that went into
this model along with listing the parameters and variables for the
model. Finally, we will list the constraints and objectives.
3.1 Model Assumptions
Several assumptions were made to simplify the model. For exam-
ple, our model only takes trucks into account without worrying about
drivers. We see that drivers could be incorporated into the model
by fixing a number of drivers and only letting this many trucks be
anywhere but the start location at any given time.
Another assumption that went into our model is that when a cus-
tomer requests a certain size dumpster, the only feasible solutions are
those that give that customer exactly the dumpster that was requested.
This is stricter than the problem statement, in which a larger dumpster
can be provided.
We provide arbitrary time windows that are more general than
needed, because we only needed to have AM, PM, and OPEN time
windows. However we chose to include them in the model because
Sam’s Hauling mentioned that some customers do request pickups or
deliveries in specific time windows.
Also, drive times and wait times are assumed to be constant through-
out the day. This may or may not be a reasonable assumption.
We were not sure if we should include actions to be performed at
staging areas that allowed drivers to drop off a container and pick up
a container in the same action. We decided against this because it
greatly increases the number of actions. Instead, solutions are required
to first drop off a dumpster and then pick up the next one.
3.2 Model Decisions
3.2.1 Solutions Use Indices
At first, we wanted to test constraints by having the solution be a ma-
trix that includes all of the request information like the location, oper-
ation, and time windows. After discussions with the group, however,
we figured out another way to represent all of the solution information
by only using indices into the City object. This removes duplicate data
between the city and solution, so that it is harder to give a solution
matrix that is inconsistent with the city.
3.3 City Streets
Originally, there was also an idea of coding every city street into the
city objects, so that there were multiple ways to reach every desti-
nation. This would simulate a real city’s data. After analyzing this
method we determined that each driver would simply choose the short-
est path if given the option, so we would just have the shortest path
connecting every location. This ended up simplifying the problem, and
allowed the simulator better performance.
3.3.1 Modeling Language
We chose to create the simulator in MATLAB because this would make
it easiest for other teams to use. Although it may not have been the
most efficient language, we could count on other teams being able to
use it. We had several concerns about using MATLAB, like whether it
supports objects. In the end, it turned out to be fairly simple to use.
3.3.2 Simulator output
We tried to return output that could be used in several different ways.
For example, we returned the time of each driver, rather than sum-
ming them or taking the maximum over all drivers. This means
that other teams could combine this information as they wished.
For example, consider the following two solutions. Labels
like “0 -> 6” indicate that the request at that location is a pickup of
a size 6 dumpster. Different colors represent different drivers.
Our solver generated these two solutions s1 and s2 with the follow-
ing drive times:
times for s1:
26259 26927 26691 25757 26812 26035 26599 26859 26848
times for s2:
0 0 0 63056 0 0 24699 0 125832 41048
The sum of times for the second is 254635, which is less than the
sum for the first of 265210. However, the maximum time is 26927 for
the first, while it is 125832 for the second. It is not clear how to
combine these different objectives, and we did not want to decide what
other teams should optimize.
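As a small illustration of why the per-driver vector is returned, the following MATLAB lines combine the times listed above for s2 in the two ways discussed. The variable names are assumptions made for this sketch.

```matlab
% Per-driver times for solution s2, copied from the text (in seconds).
times_s2 = [0 0 0 63056 0 0 24699 0 125832 41048];

% Two possible ways a team might combine the vector into one number.
total_time = sum(times_s2);      % total driver-seconds across all routes
longest_route = max(times_s2);   % time until the last driver finishes

fprintf('sum = %d, max = %d\n', total_time, longest_route);
```

For s2 this gives a sum of 254635 and a maximum of 125832, the two figures quoted for that solution.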
3.4 Model Variables
First, we will define the variables used in the model. While giving
the symbols and their dimensions, we will try to use the following
convention for indices. For convenience, unless otherwise noted, i will
range from 1 to n, j will range from 1 to m, k will range from 1 to
Y , d will range from 1 to D, t will range from 1 to |T|, and l will be
another index. All times are measured in seconds.
3.4.1 Parameters
Variable name   Domain                        Description
L               L ≥ 0                         Number of landfills
Y               Y ≥ 0                         Number of staging areas (or yards)
R               R ≥ 0                         Number of customer requests
n               n ≥ 0                         Number of actions, or stops
m               0 ≤ m ≤ n                     Number of unique locations
D               D ≥ 0                         Number of trucks (or drivers)
S               For Sam’s Hauling, |S| = 5    Set of dumpster sizes. For Sam’s Hauling, we will let
                                              S = {‘6’, ‘9’, ‘12’, ‘16’, ‘No Dumpster’}
T               For Sam’s Hauling, |T| = 3    Set of possible truck types. For Sam’s Hauling, we will let
                                              T = {‘small’, ‘medium’, ‘large’}
O               |O| = 6                       Set of operations:
                                              ‘D’: deliver a dumpster
                                              ‘P’: pick up a dumpster
                                              ‘R’: replace a dumpster with a different one
                                              ‘E’: throw away a dumpster at a landfill
                                              ‘S’: stage a dumpster
                                              ‘U’: unstage a dumpster
3.4.2 The City
Variable name   Domain   Description

$I$   $1 \le I \le m$, $I \in \{s_k\}_{k=1}^{Y}$   The starting index of all trucks

For each stop
$(T^{begin}_i, T^{end}_i)$   $1 \le i \le n$, $0 \le T^{begin}_i < T^{end}_i$   Time window in which stop $i$ is possible
$W_i$   $1 \le i \le n$   The wait time required to visit stop $i$
$o_i$   $1 \le i \le n$   Operation to be performed at stop $i$
$(S^{in}_i, S^{out}_i)$   $1 \le i \le n$   The in/out dumpster sizes of each action.
                                            If $o_i$ = ‘S’, then $S^{out}_i$ = ‘No Dumpster’
$l_i$   $1 \le i \le n$, $1 \le l_i \le m$   The location associated with each stop
$c_{i,t}$   $1 \le i \le n$, $1 \le t \le |T|$, $c_{i,t} \in \{0, 1\}$   Constraints on truck size:
                                            1 if action $i$ is accessible by truck type $t$.
                                            For example, we would like to set $c_{i,t} = 0$
                                            when $o_i$ = ‘R’ and $S^{out}_i = 16$

For each location
$t_{j,l}$   $1 \le j, l \le m$, $t_{j,l} \ge 0$   Time to get from location $j$ to $l$
$d_{j,l}$   $1 \le j, l \le m$, $d_{j,l} \ge 0$   Distance from location $j$ to $l$

For each truck
$t_d$   $1 \le d \le D$, $t_d \in T$   Truck type of truck $d$

For each staging area
$I_{k,s}$   $1 \le k \le Y$, $s \in S \setminus \{\text{‘No Dumpster’}\}$   Initial number of dumpsters of size $s$
                                            at the beginning of the day
$C_k$   $1 \le k \le Y$, $C_k \ge 0$   Maximum capacity of staging area $k$.
                                            Obviously, $\sum_{s \in S \setminus \{\text{‘No Dumpster’}\}} I_{k,s} \le C_k$
$s_k$   $1 \le k \le Y$, $1 \le s_k \le m$   Location of staging area $k$

For each landfill
$F_l$   $1 \le l \le L$   The fee associated with landfill $l$
$e_l$   $1 \le l \le L$, $1 \le e_l \le m$   The location of landfill $l$
3.5 Other thoughts
In MATLAB, the parameters L, Y, R, and D, together with several
distribution parameters, are required to generate a random city. The
remaining parameters are encapsulated in the ‘city’ object.
There are R different actions to represent the requests, because
there is only one action associated to each. Each landfill has a different
action for each dumpster size: one action where the in and out size is
that dumpster size. Each staging area has actions allowing a truck to
drop off each size of dumpster: the actions have ‘No Dumpster’ for the
in size. Finally, there is one action for picking up each dumpster size.
That is, the total number of actions is n = R + (L + 2Y )(|S| − 1).
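The action count can be checked numerically. The sketch below evaluates the formula for the example city used later in this section (L = 1, Y = 1, R = 2, and |S| = 4 including ‘No Dumpster’); the variable names are assumptions made for this illustration.

```matlab
% Sizes of the example city from Section 3.9.
R = 2;           % customer requests
L = 1;           % landfills
Y = 1;           % staging areas
num_sizes = 4;   % |S|, counting 'No Dumpster' as one element

% One action per request, (|S|-1) per landfill, 2(|S|-1) per staging area.
n = R + (L + 2*Y) * (num_sizes - 1);

% Actions contributed by a single staging area at Sam's Hauling (|S| = 5).
per_yard_sams = 2 * (5 - 1);

fprintf('n = %d, actions per yard at Sam''s Hauling = %d\n', n, per_yard_sams);
```

This reproduces n = 11 for the example city and the 8 actions per staging area listed in the staging area table.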
3.6 Decision Variables
The solution to be simulated is given by the user. It is represented
by a matrix with D rows and an arbitrary number of columns. Each
row $x_d$, $1 \le d \le D$, will be a permutation vector of the stops to be
performed by driver $d$ (followed by −1’s). Let us name their lengths
$r_d := \mathrm{length}(x_d) \le n$. We interpret the $l$-th element of $x_d$ (which is
denoted $x_{d,l}$) as the $l$-th stop to be performed by truck $d$. For example,
if $o_{x_{d,l}}$ = ‘S’, then the $l$-th stop by driver $d$ is a staging operation at a
storage yard.
3.7 Objectives
Let us make some of the following equations simpler with these defini-
tions. Let $t(x, y) = t_{x,y}$, so we have fewer subscripts. For convenience
of later sections, let us define a function $a(d, k)$ that represents the
accrued time that truck $d$ takes to complete its $k$-th stop. (We have
assumed that $1 \le d \le D$, $1 \le k \le r_d$.) This is given by:
\[
a(d, k) := t(I, l_{x_{d,1}}) + \sum_{j=1}^{k} W_{x_{d,j}} + \sum_{j=1}^{k-1} t(l_{x_{d,j}}, l_{x_{d,j+1}}).
\]
That is, the time from the start location to the first stop, plus the
times to travel between the stops, plus the time at each stop.
We will simulate this as a multi-objective problem. We include the
number of requests serviced, the time taken to do so, the total distance
covered, and the amount of fees accrued. We could use a lexicographic
ordering to order these (If we have a maximum time).
3.7.1 Number of Requests Serviced
The total number of requests serviced is
\[
N = \sum_{d=1}^{D} \sum_{\substack{j=1 \\ o_{x_{d,j}} \in \{\text{‘D’}, \text{‘P’}, \text{‘R’}\}}}^{r_d} 1.
\]
3.7.2 Time Taken to Service Requests
One way to model the total time could be
\[
T_{total} = \max_{1 \le d \le D} a(d, r_d),
\]
which is the time of the longest route and would represent the amount
of time before all routes were completed. Another is to represent the
total number of man-hours spent, which would instead be a sum of all
times:
\[
T_{total} = \sum_{d=1}^{D} a(d, r_d).
\]
It might be better to calculate the difference between the service
time at each stop and its time window, and try to minimize these errors
(customer wait times or early-arrival inconveniences). What the
simulator does is return an entire vector of time costs, one for each driver.
3.7.3 Distance Covered
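The total distance driven by all drivers is given by
\[
\sum_{d=1}^{D} \left[ d(I, l_{x_{d,1}}) + \sum_{j=1}^{r_d - 1} d(l_{x_{d,j}}, l_{x_{d,j+1}}) \right],
\]
where, inside the brackets, $d(x, y) = d_{x,y}$ denotes the distance from
location $x$ to location $y$.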
3.7.4 Landfill Fees
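The fees accrued by all drivers are given by
\[
\sum_{d=1}^{D} \sum_{j=1}^{r_d}
\begin{cases}
0 & \text{if } o_{x_{d,j}} \ne \text{‘E’} \\
F_l & \text{otherwise, where } e_l = l_{x_{d,j}}
\end{cases}
\]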
3.7.5 Remaining Inventories
The inventories remaining at the end of the day can be used to ensure
that dumpsters are equally (or otherwise) spread out among staging
areas. This could be used to ensure that dumpsters at staging areas
are accessible for the next day. Also, it is mentioned in the prob-
lem description that some staging areas might not be allowed to have
dumpsters overnight. Although we would need another parameter in
the model to dictate which staging areas these are, that would likely
make this objective a constraint.
3.8 Constraints
3.8.1 Driver Routes must not overlap
We don’t want to visit the same request twice, so for all $1 \le d, d' \le D$,
all $1 \le j \le r_d$, and all $1 \le j' \le r_{d'}$, at least one of the following
4 statements must be true:
1: $d = d'$ and $j = j'$
2: $o_{x_{d,j}} \in \{\text{E}, \text{S}, \text{U}\}$
3: $o_{x_{d',j'}} \in \{\text{E}, \text{S}, \text{U}\}$
4: $x_{d,j} \ne x_{d',j'}$
This means that requests cannot be serviced by multiple drivers or
twice by the same driver.
3.8.2 Time Windows
For each $1 \le d \le D$ and each $1 \le k \le r_d$ we need
\[
T^{begin}_{x_{d,k}} \le a(d, k) \le T^{end}_{x_{d,k}}.
\]
If we include a maximum time $T_{max}$, we will also need to ensure that
$a(d, r_d) \le T_{max}$ for all $1 \le d \le D$.
3.8.3 Sizes Match
For all $1 \le d \le D$, and for all $1 \le j \le r_d - 1$, we have
\[
S^{out}_{x_{d,j}} = S^{in}_{x_{d,j+1}}.
\]
For all $1 \le d \le D$, we have
\[
S^{in}_{x_{d,1}} = \text{‘No Dumpster’}
\]
as an initial constraint.
These mean that if a truck leaves a stop with a size 9 dumpster, then
it arrives at the next location with a size 9 dumpster. We also assume
that all trucks start out with no dumpsters.
In the problem description, there is a statement that dumpsters
larger than the one being requested can also be used. We have not
incorporated this into our model.
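The sizes-match constraint is easy to check mechanically. The following MATLAB sketch uses the in-size and out-size columns of the example city in Section 3.9 and encodes ‘No Dumpster’ as 0; the helper name `sizes_match` and the vector names are assumptions for this illustration, not part of the simulator.

```matlab
% In/out sizes of the 11 stops of the example city (0 = 'No Dumpster').
in_size  = [9 12 16 0 0 0 9 12 16 9 16];
out_size = [9 12 16 9 12 16 0 0 0 16 0];

% A route satisfies the constraint if it is empty, or if the truck starts
% with no dumpster and every stop's out-size equals the next stop's in-size.
sizes_match = @(route) isempty(route) || ...
    (in_size(route(1)) == 0 && ...
     all(out_size(route(1:end-1)) == in_size(route(2:end))));

ok_first  = sizes_match([6 11 4 7]);   % unstage 16, deliver 16, unstage 9, stage 9
ok_second = sizes_match([6 11 4 8]);   % last stop expects a size 12 dumpster

fprintf('(6,11,4,7): %d, (6,11,4,8): %d\n', ok_first, ok_second);
```

The first route passes and the second fails at its last stop, which agrees with the discussion of these two routes in the example section.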
3.8.4 Operations Follow Each Other
A driver cannot service two pickup requests in a row without visiting
a landfill. This means there is a constraint on which actions can
follow each other. We can tell if an action is allowed to follow another
based on the actions’ operations.
For all $1 \le d \le D$, and for all $1 \le j \le r_d - 1$, we need that
follows($o_{x_{d,j+1}}$, $o_{x_{d,j}}$) is true, where the follows predicate has the fol-
lowing truth table. Read this table as: the operation in row i can follow
the operation in column j if follows(i, j) = T.
follows   ‘D’   ‘P’   ‘R’   ‘E’   ‘S’   ‘U’
‘D’       F     F     F     T     F     T
‘P’       T     F     F     F     T     F
‘R’       F     F     F     T     F     T
‘E’       F     T     T     F     F     F
‘S’       F     F     F     T     F     T
‘U’       T     F     F     F     T     F
Another way of modeling this constraint would be to have a dump-
ster state of full, or empty (which is not the same as no dumpster)
for each stop/action. Then we would just need a ‘truck state matches’
constraint, which would be as simple as the ‘sizes match’ constraint.
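As a sketch of how the truth table can be applied, the MATLAB fragment below stores it as a logical matrix and checks a sequence of operations. It reads the table as "row operation may follow column operation" and writes the landfill operation as ‘E’, as in the set O; the names `ops`, `F`, and `follows` are assumptions made for this illustration.

```matlab
% Row/column order of the truth table: D, P, R, E, S, U.
ops = 'DPRESU';
F = logical([0 0 0 1 0 1;    % D may follow E or U
             1 0 0 0 1 0;    % P may follow D or S
             0 0 0 1 0 1;    % R may follow E or U
             0 1 1 0 0 0;    % E may follow P or R
             0 0 0 1 0 1;    % S may follow E or U
             1 0 0 0 1 0]);  % U may follow D or S

% follows(cur, prev) is true if operation cur may come right after prev.
follows = @(cur, prev) F(ops == cur, ops == prev);

% Operations of the example route (6, 11, 4, 7): unstage, deliver, unstage, stage.
route_ops = 'UDUS';
route_ok = true;
for j = 1:numel(route_ops) - 1
    route_ok = route_ok && follows(route_ops(j + 1), route_ops(j));
end

two_pickups_ok = follows('P', 'P');   % two pickups in a row
fprintf('UDUS allowed: %d, P after P allowed: %d\n', route_ok, two_pickups_ok);
```

The route is accepted, while two consecutive pickups are rejected, which is the situation the constraint was introduced to rule out.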
3.8.5 Constraints on Truck Types
We need a constraint that says a truck cannot service a request that
requires a different type of truck. For each $1 \le d \le D$ and each
$1 \le j \le r_d$ we need
\[
c_{x_{d,j},\, t_d} = 1.
\]
3.8.6 Staging Area Capacities Met
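Let $b(d, t)$ (b for bound) be the largest index $k$, $1 \le k \le r_d$, such that
$a(d, k) < t$. In other words, this is the index into the $d$-th truck’s
route that gives its last stop before time $t$ ($t \in \mathbb{R}$, $0 \le t < \infty$).
Then, we want that for each $1 \le y \le Y$, each $s \in S \setminus
\{\text{‘No Dumpster’}\}$, and each $t \in \mathbb{R}$, $0 \le t < \infty$,
\[
0 \le I_{y,s}
+ \sum_{d=1}^{D} \sum_{\substack{j=1 \\ o_{x_{d,j}} = \text{‘U’} \\ S^{out}_{x_{d,j}} = s \\ s_y = l_{x_{d,j}}}}^{b(d,t)} (-1)
+ \sum_{d=1}^{D} \sum_{\substack{j=1 \\ o_{x_{d,j}} = \text{‘S’} \\ S^{in}_{x_{d,j}} = s \\ s_y = l_{x_{d,j}}}}^{b(d,t)} (1)
\le C_y.
\]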
This assumes that each dumpster takes up the same amount of
space in the staging area. If not, we could use a weighted sum, where
we replace the ±1 with a weight based on s (the coefficients would
need to be another parameter to the model).
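To illustrate why the bound must hold at every time t and not only at the end of the day, here is a minimal MATLAB sketch for one staging area and one dumpster size. The event times, the initial inventory, and the capacity are made-up values for this illustration, and the approach (sorting events and taking a running sum) is only one possible way to evaluate the constraint, not necessarily how the simulator does it.

```matlab
% Hypothetical data for one staging area and one dumpster size.
initial_inventory = 2;   % I_{y,s}
capacity = 10;           % C_y

% Completion times of the stage/unstage stops of all trucks (seconds),
% in no particular order, and their effect on the inventory.
event_times = [500, 100, 900, 300];
event_delta = [-1, -1, 1, -1];     % -1 = unstage, +1 = stage

% Process events in chronological order and track the running inventory.
[~, order] = sort(event_times);
inventory = initial_inventory + cumsum(event_delta(order));

final_ok = inventory(end) >= 0 && inventory(end) <= capacity;
always_ok = all(inventory >= 0 & inventory <= capacity);

fprintf('end of day ok: %d, ok throughout the day: %d\n', final_ok, always_ok);
```

In this made-up case the inventory runs 1, 0, −1, 0: the end-of-day value is acceptable, but a truck tries to unstage a dumpster that is not there at t = 500, so the solution is infeasible.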
3.8.7 Trucks end where they Start
There is a constraint that says each truck must stop at the staging
area it starts at. That is, for all $1 \le d \le D$, we have $l_{x_{d,r_d}} = I$.
3.9 Example
This is an example city with L = 1, Y = 1, R = 2, D = 2, S =
{9 , 12 , 16 , No Dumpster}, T = {small, large}, n = 11, m = 4.
3.9.1 City
The set of possible stops in a city could be given by the following table:
index   window begin   window end   wait time   operation   in size       out size      location
1       0              100          5           ‘E’         9             9             1
2       0              100          5           ‘E’         12            12            1
3       0              100          5           ‘E’         16            16            1
4       0              100          2           ‘U’         No Dumpster   9             2
5       0              100          2           ‘U’         No Dumpster   12            2
6       0              100          2           ‘U’         No Dumpster   16            2
7       0              100          2           ‘S’         9             No Dumpster   2
8       0              100          2           ‘S’         12            No Dumpster   2
9       0              100          2           ‘S’         16            No Dumpster   2
10      0              50           1           ‘R’         9             16            3
11      0              100          1           ‘D’         16            No Dumpster   4
The following table might describe truck types:
index starting location type
1 2 small
2 2 large
Further suppose that the distances are given by:
0   1     3   2
1   0     2   4
3   0.5   0   1
2   1     3   0
The truck constraints might be given by:
index small large
1 1 1
2 1 1
3 0 1
4 1 1
Finally, there is only one staging area:
index capacity location
0 10 1
This single staging area will be the initial location: I = 6.
3.9.2 Solution
The following two are possible solutions, although the second does not
end where it starts.
1. x1 = (), x2 = () (In this case, r1 = r2 = 0).
2. x1 = (), x2 = (6, 11, 4, 7) (In this case, r1 = 0, r2 = 4).
However, this solution is not feasible, because the sizes don’t match
in the last stop of the second truck: x1 = (), x2 = (6, 11, 4, 8) (In this
case, r1 = 0,r2 = 4).
3.9.3 Objective Values
The time for the first is 0, which is also the number of requests serviced.
The number of requests serviced in the second is 1, and the time for
the second truck a(2, 4) is:
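\[
\begin{aligned}
a(2, 4) &= d_{l_6, l_6} + W_6 + d_{l_6, l_{11}} + W_{11} + d_{l_{11}, l_4} + W_4 + d_{l_4, l_7} + W_7 \\
&= d_{2,2} + W_6 + d_{2,4} + W_{11} + d_{4,2} + W_4 + d_{2,2} + W_7 \\
&= 0 + 2 + 4 + 1 + 1 + 2 + 0 + 2 \\
&= 12.
\end{aligned}
\]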
Thus, Ttotal = 12 as well.
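The same calculation can be reproduced in a few lines of MATLAB. This sketch makes two assumptions that should be read as such: like the worked example, it uses the distance matrix as the travel-time matrix, and it takes location 2 (the starting location in the truck table) as the start. The variable names are chosen for this illustration.

```matlab
% Data of the example city.
travel = [0 1   3 2;
          1 0   2 4;
          3 0.5 0 1;
          2 1   3 0];                  % distances, used here as travel times
wait_time = [5 5 5 2 2 2 2 2 2 1 1];   % W_i for stops 1..11
location  = [1 1 1 2 2 2 2 2 2 3 4];   % l_i for stops 1..11
start_location = 2;                    % assumed start, from the truck table

route = [6 11 4 7];                    % second truck of the second solution

% Locations visited, beginning at the start location.
visited = [start_location, location(route)];

% Accrued time: all travel legs plus the wait time at each stop.
accrued = sum(wait_time(route));
for j = 1:numel(visited) - 1
    accrued = accrued + travel(visited(j), visited(j + 1));
end

fprintf('a(2,4) = %g\n', accrued);
```

The result is 12, matching the hand calculation.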
4 Simulator
4.1 Performance
Currently, the two most expensive functions of the simulator are the
check that inventory bounds are satisfied and a helper function that
computes the times each request is performed. Which of these methods
is most time consuming depends on how close to valid the solution is.
MATLAB provided a summary of how much time was spent in each
method. We found that when the solution is not close to valid and the
simulator does not need to check all constraints, the most expensive
method was the one used to find the time when each request was
serviced. For example, out of several runs, this took 2.5 seconds out
of 7 seconds total. When all constraints had to be checked, we found
that checking the inventories was the most expensive. This method
took 6 seconds out of 16 seconds.
If we were to optimize our simulator further, these would likely be
where we could find the most speedup.
4.2 Failed Attempts and Lessons Learned
4.2.1 Expressing Waiting
We initially created the simulator without the ability to have a driver
wait without performing any action. This caused a problem because
drivers were not able to wait for a time window to begin, which meant
that some cities had no solutions that we could represent with our
model. We fixed this by automatically forcing drivers to wait for the
beginning of the time window of each request. For example, suppose a
driver arrives at 9:00 am at a request that has a time window from noon
to 6:00 pm. The simulator automatically forces the driver to wait until
noon before servicing the request. This was mainly a problem for small
cities.
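The waiting rule can be written in a few lines. This is a sketch in
Python, not the simulator's MATLAB code; service_start is a hypothetical
name, and treating an arrival after the window closes as infeasible is
an assumption about how the stop-time is enforced.

```python
def service_start(arrival, window_begin, window_end):
    """Return the time service starts, or None if the window is missed."""
    if arrival > window_end:
        return None  # assumed: arriving after the window closes is infeasible
    # Arriving early means idling until the window opens.
    return max(arrival, window_begin)


# Times in seconds since midnight: arrive at 9:00 am, window noon to 6:00 pm.
print(service_start(9 * 3600, 12 * 3600, 18 * 3600))  # 43200, which is noon
```

A driver who arrives inside the window starts immediately, so the rule
only adds time for early arrivals.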
4.2.2 Matrix Dimension
Some drivers did not have the same number of actions assigned to
them, even if they took the same amount of time or longer than a
driver with more actions. We learned that fewer stops did not
necessarily mean less productivity or efficiency.
When first creating a solution matrix, our assumption was that the
matrix would always be of size D x N, where D was the number of drivers
in the city and N was the number of actions possible in the city. This
turned out to be false because drivers could repeat actions, meaning
that the matrix dimensions could not be determined ahead of time.
4.2.3 Distribution of City Parameters
Our simulator tries to replicate real-life cities and routes in which
one or more drivers drop off and pick up dumpsters; we have therefore
tried to make it as close to real life as possible, while keeping it
simple enough to use. We believe that the model corresponds directly
to any real-world problem that Sam’s Hauling might encounter, and that
any deviation from the real-world outcome is an error in the software
rather than an error in the design or theory of our project.
Originally, the simulator did not generate realistic cities. For
example, the distribution of wait times did not match example data from
Sam’s Hauling. Using sample data, we modified the generate_city
function so that the generated cities better represented real data.
One way of doing this is to randomly select locations from the
sample data and construct cities from them. One disadvantage of this
approach is that the distances between all stops must be computed. An
alternative is to first estimate the distributions of several important
characteristics of the sample data, such as the distribution of drive
times. Then more drive times can be generated that follow the same
mean, variance, or possibly skew of these distributions.
We chose the second approach, and found the distributions of dumpster
sizes, operations, and time windows. These are contained in the
appendix. Other important statistics would have been the number of
truck type constraints and drive times.
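Drawing requests from measured percentages could look like the sketch
below. It is written in Python and is not the generate_city function
itself; the percentages are the 10/06 column of the City Statistics
appendix, while sample_requests and the choice to draw operation and
size independently are assumptions made for illustration.

```python
import random

# Percentages for 10/06 from the City Statistics appendix.
OPERATIONS = {"D": 28.3, "P": 66.0, "R": 5.7}
SIZES = {6: 28.3, 9: 35.8, 12: 20.8, 16: 15.1}


def sample_requests(n, rng):
    """Draw n (operation, size) pairs that follow the sample percentages."""
    ops = rng.choices(list(OPERATIONS), weights=list(OPERATIONS.values()), k=n)
    sizes = rng.choices(list(SIZES), weights=list(SIZES.values()), k=n)
    return list(zip(ops, sizes))


rng = random.Random(0)  # fixed seed so that runs are repeatable
requests = sample_requests(1000, rng)
share_p = sum(op == "P" for op, _ in requests) / len(requests)
print(f"share of pickups: {share_p:.2f}")
```

With enough draws the share of each operation should approach the
percentage it was sampled from.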
4.3 Simulator Correctness
While writing the simulator we tested several aspects to make sure
that the code was correct. We also included around 20 tests that ran
predetermined cities through the simulator to ensure the simulator was
accurate as the code changed.
5 Robustness
There are more subtle ways that the simulator may be able to capture
a solution’s quality. Several of these measures revolve around the idea
of robustness.
5.1 Changing Request Locations
5.1.1 Goal
Because customers call in and change their requests throughout the
day, we were interested in figuring out if some solutions handle changes
better than others. In order to answer this, we first created a way to
compare how robust two solutions are with respect to request changes.
Then we created several random cities and used this comparison to
see if there were drastic differences in robustness between two good
solutions. If there had been big differences in the robustness of
several good solutions, then we would have concluded that a good
algorithm must not only produce an efficient solution, but that the
solution must also be robust.
5.1.2 The Process
For our considerations, we defined the difference in robustness of two
solutions s1 and s2 according to the following process.
1. Pick a random time t between 0 and the minimum time it takes
either solution to finish.
2. Create a new city from the original city by removing all the
requests that have already been serviced by s1 before time t. Also,
update the starting location of each driver, so that each driver
starts where it would have been at time t in s1 for the original
city. Create the corresponding sub-city for s2 at time t.
3. Randomly change the location of 1/4 (an arbitrary fraction) of
the remaining requests for each of these two sub-cities (with a
uniform distribution).
4. Find optimal solutions for the sub-cities, and compare the times
of these optimal solutions.
In the previous process, steps two and three are called creating a
sub-city problem from the original problem. Of course, this process
requires that a city can have multiple start locations that do not have
to be at a staging area. Also, trucks can now start with a full
dumpster.
It is worth noting that the remaining requests will likely be different
for these two cities (unless s1 = s2), so different requests are changed
in each sub-city. This observation also shows that the faster solution
will likely have fewer remaining requests. This gives it an advantage,
although we noticed several cases where the slower solution
did provide faster sub-city solutions. An alternative could be to simply
select a subset of requests from the original city to be changed, and
then not change the requests in the sub-cities that are already serviced.
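Steps two and three of the process can be sketched as follows. This is
illustrative Python, not the code used for the experiments: the
dictionaries, the make_subcity name, and the list of candidate locations
are all assumptions, and the update of the drivers' starting locations
is left out.

```python
import random


def make_subcity(requests, service_times, t, rng, locations, fraction=0.25):
    """Cut a city at time t and move a fraction of the unserviced requests.

    requests:      dict request id -> location
    service_times: dict request id -> time at which the solution services it
    locations:     candidate locations for a moved request
    """
    # Step 2: keep only the requests not yet serviced before time t.
    remaining = {r: loc for r, loc in requests.items() if service_times[r] >= t}
    # Step 3: pick a fraction of them uniformly and give each a new location.
    n_moved = round(fraction * len(remaining))
    moved = rng.sample(sorted(remaining), n_moved)
    for r in moved:
        remaining[r] = rng.choice(locations)
    return remaining, moved


rng = random.Random(1)
requests = {r: r for r in range(1, 9)}            # request r sits at location r
service_times = {r: 10 * r for r in range(1, 9)}  # serviced at 10, 20, ..., 80
sub, moved = make_subcity(requests, service_times, t=45, rng=rng,
                          locations=list(range(100, 110)))
print(sorted(sub), moved)
```

Calling the function once per solution, with that solution's own
service times, gives the two sub-cities that are then solved and
compared.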
5.1.3 Visual Robustness
We can see how this process works with some pictures. It is easier
to see differences in pictures with fewer requests, so we chose a
city with 28 requests and 3 drivers. This is the original city, where
start_X is the start location for driver X and stop_X is that driver's
end location.
Two alternate solutions are given here. They are both local minima
with respect to a very primitive local neighbor search described
later.
Next we found the maximum time over each solution’s drivers, and
chose a time between 0 and the smaller of these two times.
After cutting the first solution at this time, we applied mutations to
the first sub-city to get the city on the right from the original sub-city
on the left.
Finally, we have the solutions to the mutated sub-cities.
The goal is to create these last two images several times, and see
if the subcity solutions for one solution are usually better than the
subcity solutions for the other.
5.1.4 Results and Interpretation
Initially we ran this test for several different initial cities, generating
the two alternative solutions to be compared each time. However,
we think it is more natural to take a particular city and construct
several different sub-cities for the same city, only generating the two
alternatives once.
The test we performed compared the means of the two time
distributions of sub-city solutions. We performed Welch’s t-test to
compare these two means. The two sub-city times were normalized by
the time that it took the original solutions to solve the remaining
sub-cities. After 20 sub-city solutions for an initial city with 100
requests and 5 depots, we were not able to find a difference in the
means. Welch’s t-test gave a statistic of t = 0.0193 with 37.656 degrees
of freedom. This gave a p-value of p = 0.9847, so we cannot conclude
that either of the two initial solutions provided sub-cities whose
solutions were faster than the other’s on average.
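For reference, the Welch statistic and its degrees of freedom can be
computed directly from the two samples. The sketch below is in Python
and uses only the standard library; welch_t is a hypothetical helper,
the sample values are made up, and the p-value is not computed because
that needs the t distribution.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of the first mean
    vb = variance(b) / len(b)  # squared standard error of the second mean
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Made-up normalized sub-city times, for illustration only.
s1_times = [1.02, 0.97, 1.10, 0.97, 1.05, 1.01]
s2_times = [1.00, 0.99, 1.08, 0.96, 1.07, 1.03]
t, df = welch_t(s1_times, s2_times)
print(f"t = {t:.4f}, df = {df:.2f}")
```

When both samples have n values and equal variances, the degrees of
freedom reach their maximum of 2(n - 1), which is 38 for n = 20; the
reported 37.656 sits just below that bound.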
5.1.5 Other Thoughts
We also applied a Kolmogorov-Smirnov test. This test does not just
compare the means, but also checks whether the two distributions are
different in any way. This did find a significant difference, which means
that there may be other differences aside from the mean that are worth
considering. There may also be more differences between solutions when
solutions are highly constrained by inventories (cities with small
initial inventories). Another test may be to see how much of a
difference the ending inventories make on the next day.
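The two-sample Kolmogorov-Smirnov statistic is the largest gap between
the two empirical distribution functions. The sketch below is
illustrative Python with made-up data; ks_statistic is a hypothetical
helper and it returns only the statistic, not a p-value.

```python
def ks_statistic(a, b):
    """Largest gap between the empirical distribution functions of a and b."""
    a, b = sorted(a), sorted(b)
    gap = 0.0
    for x in a + b:  # the largest gap always occurs at a sample point
        fa = sum(v <= x for v in a) / len(a)  # fraction of a that is <= x
        fb = sum(v <= x for v in b) / len(b)  # fraction of b that is <= x
        gap = max(gap, abs(fa - fb))
    return gap


# Two made-up samples with the same mean but different spread.
wide = [0, 5, 10]
narrow = [4, 5, 6]
print(ks_statistic(wide, narrow))
```

The two samples share a mean of 5, so a test on the means sees nothing,
yet the statistic is nonzero because the spreads differ.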
There is one problem with the sub-city construction. If the driver is
waiting at a landfill, with say 20 minutes left after the cutting time,
he no longer has to wait those 20 minutes in the sub-city. That is, the
remaining wait times at landfills are cut off when we construct the
sub-city. We hope that this does not have a strong impact on the test’s
results.
Of course, we are comparing random good solutions, without trying
to find robust solutions. The differences might be bigger if we could
actually search for robust solutions instead of any two good solutions
with respect to another metric. It could be better to do this test several
times and take the maximum of the differences in robustness.
5.2 Flexible Dumpster Sizes
One fairly simple modification to the simulator allows flexible
dumpster sizes. We expanded our model to include multiple actions per
request. The different actions have different sizes, so that solutions
are allowed to deliver a size 9 dumpster when the request was for a
size 6 dumpster. Then the constraints on requests not being serviced
multiple times were updated to check the request number instead of
the action number. We performed another Welch t-test to compare
the mean time of solutions for a city where the dumpster size
constraints have to be met with the mean time of solutions under this
relaxed constraint. After solving 100 randomly generated cities, we
found a difference in the means with a p-value of p = 0.01475.
Because these solutions are faster, if this is allowed by Sam’s
Hauling it would be a good idea to include this in the algorithm. We may
add penalties based on the extra cost of dumping larger dumpsters.
How important this is to solution times obviously depends on
what the distribution of requested dumpster sizes is in the first place.
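The updated uniqueness check can be illustrated with a small example.
This is a Python sketch under an assumed data layout, not the
simulator's MATLAB representation; ACTIONS, the request ids, and
requests_serviced_once are hypothetical.

```python
# Hypothetical layout: each action index maps to (request id, delivered size).
ACTIONS = {
    1: ("req_A", 6),   # deliver the requested size 6 dumpster
    2: ("req_A", 9),   # or a larger size 9 dumpster for the same request
    3: ("req_B", 12),
}


def requests_serviced_once(route_actions):
    """Return True if no request is serviced by more than one action."""
    seen = set()
    for action in route_actions:
        request, _size = ACTIONS[action]
        if request in seen:
            return False  # two different actions serviced the same request
        seen.add(request)
    return True


print(requests_serviced_once([2, 3]))  # True
print(requests_serviced_once([1, 2]))  # False
```

Checking by action number alone would accept the second route, because
actions 1 and 2 are distinct even though they service the same request.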
5.3 Changing Times
One of the assumptions made in the class was that drive times do not
depend on the time of day. We collected drive times at different
times of the day, and hoped to test whether this assumption is valid or
whether simulation output changes drastically when we include dynamic
drive times.
At first, we hoped to include a test done in the same way as for
changing the request locations, except that rather than simulating a
solution part way through, we would just see whether some solutions
work better with realistic, changing drive times. However, it seems that
the solutions are simply scaled to take longer based on the variation
in times throughout the day, without some being scaled more than others.
5.4 The Solver
The solver we used to conduct the previous tests used a simple local
search. An initial solution was generated randomly in the following
way. Each truck started at the depot, and then the route that currently
takes the least amount of time was repeatedly given another request to
service until all requests were serviced. The next request is chosen
with a probability that decreases with how long it would take the truck
to get to that request. (First the time to get to each of the other
requests is found, and then the closest is chosen 1/2 of the time, the
second closest is chosen 1/4 of the time, and so on.)
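The halving rule for choosing the next request can be sketched as below.
This is illustrative Python, not the solver's own code;
pick_next_request is a hypothetical name, and giving the leftover
probability to the farthest request is an assumption, since the rule as
stated does not say what happens when every candidate is passed over.

```python
import random


def pick_next_request(times_to_requests, rng):
    """Pick the closest request with probability 1/2, the next with 1/4, and so on.

    times_to_requests: dict request id -> drive time from the truck's position
    """
    ranked = sorted(times_to_requests, key=times_to_requests.get)
    for request in ranked[:-1]:
        if rng.random() < 0.5:  # accept this candidate half of the time
            return request
    return ranked[-1]  # assumed: leftover probability goes to the farthest


rng = random.Random(0)
times = {"a": 12.0, "b": 3.0, "c": 7.5}
picks = [pick_next_request(times, rng) for _ in range(10000)]
print({r: picks.count(r) / len(picks) for r in times})
```

With three candidates this gives probabilities of about 1/2, 1/4, and
1/4 for the closest, second closest, and farthest request.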
Then, once an initial solution is found, we apply operations to that
solution as long as these operations improve it. The operations we
applied exchange any two sub-paths of two drivers. For example, we can
represent a solution as a list of routes, each row being the indices of
the actions that driver performs. One solution might be the following:
45 25 14 13 53 34 14
1 15 76 24 11 43 78
This would mean that the first driver performed action 45, then
25, then 14 and so on. Now, we can select a sub-path of each driver.
Suppose we selected the following bracketed sub-paths:
45 [25 14 13 53] 34 14
1 15 76 [24 11 43] 78
We can then exchange these to find the following solution:
45 [24 11 43] 34 14
1 15 76 [25 14 13 53] 78
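The exchange itself is a pair of list splices. The sketch below is in
Python rather than the solver's own language; exchange is a hypothetical
helper, and spans are given as zero-based (begin, end) pairs with the
end excluded, which is an indexing convention chosen for this example.

```python
def exchange(route_a, span_a, route_b, span_b):
    """Swap route_a[a0:a1] with route_b[b0:b1] and return the two new routes."""
    (a0, a1), (b0, b1) = span_a, span_b
    new_a = route_a[:a0] + route_b[b0:b1] + route_a[a1:]
    new_b = route_b[:b0] + route_a[a0:a1] + route_b[b1:]
    return new_a, new_b


driver_1 = [45, 25, 14, 13, 53, 34, 14]
driver_2 = [1, 15, 76, 24, 11, 43, 78]
new_1, new_2 = exchange(driver_1, (1, 5), driver_2, (3, 6))
print(new_1)  # [45, 24, 11, 43, 34, 14]
print(new_2)  # [1, 15, 76, 25, 14, 13, 53, 78]
```

The two sub-paths do not need the same length, and an empty span such
as (1, 1) turns the exchange into moving a sub-path from one route into
the other.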
Now, the exchanged solution is most likely not feasible. Initially, we
solved this problem by only exchanging sub-paths that began and ended
in the same dumpster state of the truck. The dumpster state of a truck
is defined as the pair consisting of the dumpster size and whether the
truck is full. So any sub-path that begins with an empty truck and ends
with a full truck of size 6 can be exchanged with any other such
sub-path. However, we found that this was too limiting, because some of
the solutions generated had sub-paths that were visibly poor.
To expand the ways we can change a solution, we started by only
considering sub-paths that began and ended with a request. Then,
when we exchanged sub-paths, we overwrote all the actions that took
place at staging areas and landfills directly before and after the
sub-path. So, in our previous example, suppose that actions 45, 34, 15,
76, and 78 were at landfills or staging areas. The curly-bracketed
actions are the actions before and after our sub-paths that occur at
landfills or staging areas. The square-bracketed actions are sub-paths
that begin and end with requests. An X means we temporarily ignore the
action in that location. Then we could perform the following exchange:
select sub-paths:
{45} [25 14 13 53] {34} 14
1 {15 76} [24 11 43] {78}
only consider requests:
{X} [25 14 13 53] {X} 14
1 {X} [24 11 43] {X}
swap:
{X} [24 11 43] {X} 14
1 {X} [25 14 13 53] {X}
fill in gaps:
38 [24 11 43] 54 14
1 17 [25 14 13 53] 32
The 38, 54, 17, and 32 are new actions at staging areas or landfills
that are used to fill in the gaps between requests. They are chosen so
that the time between requests is minimized and the solution
is once again feasible. Sub-paths can have length 0, so that we can
"cut" a sub-path from one route and "insert" it into another.
There is no need for the exchanged sub-paths to have the same length,
because this should be handled by the objective function.
Also, the length of a sub-path does not determine the time it takes
to complete the sub-path.
The output consists of the operations applied to the initial solution
to obtain the local minimum:
...
exchange [driver:5,begin=2,end=10] with [driver:6,begin= 0,end=11]
exchange [driver:6,begin=9,end=14] with [driver:7,begin= 6,end=11]
exchange [driver:0,begin=1,end= 2] with [driver:1,begin=15,end=17]
...
Pictorially, one operation could take the first solution to the second.
In this case a sub-path of the light green route was exchanged with a
sub-path of the red route.
With around 70 requests, 4 depots and 8 drivers (like the one
shown) this process could converge in around 300 operations. Then
this process of picking a random city and applying a local search to it
was repeated several times.
A visual representation of how well this worked is that it would
take random cities like the one on the left, and produce ones like the
one on the right.
We minimized the number of drivers squared times the maximum time of
any driver, plus the sum of the times of each driver.
This meant that the most important feature was the maximum time
of any driver, with ties broken by the sum of the drivers' times.
Considering an overtime metric did not work as well, because it limited
how many operations could be done to improve the solution. It would
typically converge in a very small number of operations to a visibly
non-optimal solution. (It looked similar to the initial random seed.)
The sum of times did not work well either. This is an image of the
solution to the previous city, where only the sum of times is considered.
All requests are given to the same driver.
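The weighted objective can be written out as a small function. This is
an illustrative Python sketch with made-up route times; objective is a
hypothetical name and the function is assumed to be minimized by the
local search.

```python
def objective(driver_times):
    """Weighted objective: D^2 * (longest route time) + (sum of all route times)."""
    n_drivers = len(driver_times)
    return n_drivers ** 2 * max(driver_times) + sum(driver_times)


balanced = [10, 10, 10]  # work spread evenly over three drivers
lopsided = [28, 0, 0]    # one driver does everything, with a smaller total

print(objective(balanced))  # 9 * 10 + 30 = 120
print(objective(lopsided))  # 9 * 28 + 28 = 280
```

Under the sum of times alone the lopsided plan would win (28 against
30), which is consistent with the observation that the sum-only metric
hands every request to a single driver.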
We considered several different ways of improving this algorithm,
but did not finish these other ideas. One was another local search,
only in a representation of the city that made inventory constraints
easier to check. Right now the inventories are shared across several
paths based on their times, so they are harder to represent. The solver
we used held them in a red-black tree, sorted by time. Another idea was
to maintain a list of optimal subpaths from depot to depot, without
assigning them a time. This representation seemed very promising.
Path relinking is also easy to implement for our local search,
because an arbitrary metric can be used: even one that just compares
the similarity between two solutions.
6 Conclusion
Simulation is an important part of solving a modeling problem because
it gives the best guarantee of accuracy. Our team provided a simulator
that we believe simulates the most important aspects of the problem
Sam’s Hauling gave us. We did this early in the semester so that others
could use it, and quickly fixed any shortcomings we found. The code
also has a mathematical model that tries to describe what the code
does as accurately as possible.
When we completed the simulator we moved on to other interesting
questions. Although we were not able to find a difference in solution
robustness with respect to dynamic requests, we were able to find a
difference in solutions based on whether dumpster size constraints were
relaxed.
Through this process we also learned how to collaborate as a team,
and with the other teams, to produce the best results. This project also
taught us that real-world problems are very complicated, and usually do
not reduce to a textbook problem description. However, we were able
to use what we know of well-studied problems to build a simulator,
model, and solver for the problem Sam’s Hauling proposed.
Given more time we have plenty more ideas we could try. These
include writing a better solver, and a more thorough investigation of
how initial inventory distributions affect the route times.
A City Statistics
This table provides the percentages of operations, dumpster sizes, and
time windows for each sample date. All values are percentages.
Category     Value  10/06  10/07  10/08  10/09  10/10  10/14  10/15  10/16  10/17
Operation    D      28.3   43.1   37.3   34.7   41     35.5   50     37     37.7
             P      66     48.3   49     42.9   52.4   56.4   40     52     52.2
             R      5.7    8.6    13.7   22.4   6.6    8.1    10     11     10.1
Size         6      28.3   27.6   25.5   22.4   26.2   33.9   32.8   13     24.6
             9      35.8   32.8   37.2   18.4   29.5   30.6   36.2   28.3   26.1
             12     20.8   22.4   31.4   40.8   21.3   19.4   13.8   52.2   39.1
             16     15.1   17.2   5.9    18.4   23     16.1   17.2   6.5    10.2
Time window  AM     22.2   10     7.7    10.7   24.1   11.1   11.4   18.2   6.1
             PM     0      3.3    0      7.1    10.4   0      0      0      3
             OPEN   77.8   86.7   92.3   82.2   65.5   88.9   88.6   81.8   90.9

Final Report, Group 4

  • 1. Group 4 report Trever Hallock Tamlyn Harley Matthew Stanley Cody Zhang December 11, 2014 Abstract We developed a MATLAB model that took in a city object and a solution matrix to output the efficiency of the solution based on multi- ple metrics. Also, we worked with the User Interface Team to translate between their Excel-based output and our MATLAB model. Finally, we worked to find and simulate more subtle solution characteristics than what were explored by the other teams. We produced a software package that was able to meet our goals and simulate the models given by the other teams involved. 1
  • 2. Contents 1 Introduction 5 1.1 Problem description . . . . . . . . . . . . . . . . . . . 5 1.2 Overview of our Role . . . . . . . . . . . . . . . . . . . 6 1.3 Goals . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 1.4 The Value of the Simulator . . . . . . . . . . . . . . . 7 2 MATLAB Model 8 2.1 MATLAB Object Model . . . . . . . . . . . . . . . . . 8 2.1.1 City Representation . . . . . . . . . . . . . . . 8 2.1.2 Solution Representation . . . . . . . . . . . . . 11 2.2 Route Simulation . . . . . . . . . . . . . . . . . . . . . 12 2.3 City Generation . . . . . . . . . . . . . . . . . . . . . . 14 2.4 City Translation . . . . . . . . . . . . . . . . . . . . . 14 3 Model 15 3.1 Model Assumptions . . . . . . . . . . . . . . . . . . . . 15 3.2 Model Decisions . . . . . . . . . . . . . . . . . . . . . . 16 3.2.1 Solutions Use Indices . . . . . . . . . . . . . . . 16 3.3 City Streets . . . . . . . . . . . . . . . . . . . . . . . . 17 3.3.1 Modeling Language . . . . . . . . . . . . . . . . 17 3.3.2 Simulator output . . . . . . . . . . . . . . . . . 17 3.4 Model Variables . . . . . . . . . . . . . . . . . . . . . . 18 3.4.1 Parameters . . . . . . . . . . . . . . . . . . . . 20 3.4.2 The City . . . . . . . . . . . . . . . . . . . . . 21 3.5 Other thoughts . . . . . . . . . . . . . . . . . . . . . . 22 3.6 Decision Variables . . . . . . . . . . . . . . . . . . . . 23 2
  • 3. 3.7 Objectives . . . . . . . . . . . . . . . . . . . . . . . . . 23 3.7.1 Number of Requests Serviced . . . . . . . . . . 24 3.7.2 Time Taken to Service Requests . . . . . . . . 24 3.7.3 Distance Covered . . . . . . . . . . . . . . . . . 25 3.7.4 Landfill Fees . . . . . . . . . . . . . . . . . . . 25 3.7.5 Remaining Inventories . . . . . . . . . . . . . . 25 3.8 Constraints . . . . . . . . . . . . . . . . . . . . . . . . 26 3.8.1 Driver Routes must not overlap . . . . . . . . . 26 3.8.2 Time Windows . . . . . . . . . . . . . . . . . . 26 3.8.3 Sizes Match . . . . . . . . . . . . . . . . . . . . 26 3.8.4 Operations Follow Eachother . . . . . . . . . . 27 3.8.5 Constraints on Truck Types . . . . . . . . . . . 28 3.8.6 Staging Area Capacities Met . . . . . . . . . . 28 3.8.7 Trucks end where they Start . . . . . . . . . . . 28 3.9 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 29 3.9.1 City . . . . . . . . . . . . . . . . . . . . . . . . 29 3.9.2 Solution . . . . . . . . . . . . . . . . . . . . . . 31 3.9.3 Objective Values . . . . . . . . . . . . . . . . . 31 4 Simulator 32 4.1 Performance . . . . . . . . . . . . . . . . . . . . . . . . 32 4.2 Failed Attempts and Lessons Learned . . . . . . . . . 32 4.2.1 Expressing Waiting . . . . . . . . . . . . . . . . 32 4.2.2 Matrix Dimension . . . . . . . . . . . . . . . . 33 4.2.3 Distribution of City Parameters . . . . . . . . . 33 4.3 Simulator Correctness . . . . . . . . . . . . . . . . . . 34 3
  • 4. 5 Robustness 34 5.1 Changing Request Locations . . . . . . . . . . . . . . . 35 5.1.1 Goal . . . . . . . . . . . . . . . . . . . . . . . . 35 5.1.2 The Process . . . . . . . . . . . . . . . . . . . . 35 5.1.3 Visual Robustness . . . . . . . . . . . . . . . . 36 5.1.4 Results and Interpretation . . . . . . . . . . . . 39 5.1.5 Other Thoughts . . . . . . . . . . . . . . . . . . 40 5.2 Flexible Dumpster Sizes . . . . . . . . . . . . . . . . . 40 5.3 Changing Times . . . . . . . . . . . . . . . . . . . . . 41 5.4 The Solver . . . . . . . . . . . . . . . . . . . . . . . . . 42 6 Conclusion 47 A City Statistics 48 4
  • 5. 1 Introduction Simulation is an integral part of any mathematical modeling, espe- cially when solving real-world problems. Without a standard process to simulate solutions, those designing and implementing models would be unable to verify their efficacy. Our modeling is for Sam’s Hauling, who requires an efficient method to schedule their drivers to service customer requests. Currently this is done by hand, and our job was to assist other teams in creating algorithms to accomplish this task by simulating their results. Thus, the Simulation Team’s goals were to produce a reasonably efficient way to test solutions given by the other teams who were developing math- ematical models to solve Sam’s Hauling’s delivery scheduling. We met our goals on time and created a software package that is not only robust but is capable of simulating many different situations and occurrences. 1.1 Problem description Sam’s Hauling Inc. provides a small to medium sized dumpster rental service to customers in the Denver Metro Area. They first deliver these dumpsters, or containers, to customers who fill them. Then Sam’s Hauling must return to collect the full dumpsters and take them to a dump for disposal. Therefore, drivers must constantly be delivering new containers, picking up full containers, and dropping off trash at dumps. There are also staging areas where drivers begin their day, and where empty dumpsters are stored. These have capacities and initial inventories that must also be considered. Currently, they schedule their drivers by hand, and are looking for a new method to optimally 5
  • 6. schedule their pickup and delivery routes. 1.2 Overview of our Role We created three pieces of software, using MATLAB, for other groups to use in testing their solutions: • A city generator that creates random stops, dumps, and staging areas with which others can experiment • A simulation function that simulates how well a solution satis- fies several of these requests, and output its efficiency based on various metrics • A translate function that takes data from the User Interface team and transforms it into MATLAB These three pieces of software have been used by the other teams in testing their proposed solutions. Our goal was to provide an objective means by which others can efficiently test their proposed solutions, compare feasibility and optimality of said solutions, and give a good estimate of the sensitivity of each solution to changes. We believe that we have met our goals. 1.3 Goals As a team, our objective was to create a software package that could simulate routes through cities to output their efficiency using vari- ous metrics. We did not set out with the goal of stating our opinion on whether these metrics were “good" or “bad," but instead simply 6
  • 7. reported the data to those using the software and allowed them the freedom to interpret the results independently. We also had the objective of making our software user friendly; part of this was to allow easy translation between Excel (what the User Interface team ended up using) and MATLAB (our chosen plat- form). This was a necessary goal in bridging the gap between the two platforms, and without this the teams would have to manually trans- late the information - which would be hardly “user friendly." In the same vein as user friendliness, we also set out with the goal of creating a tutorial / manual for the class to use so that they could better un- derstand the features and capabilities of our simulator. This tutorial was completed and presented to the class as a whole, and was available for download along with our simulator. 1.4 The Value of the Simulator Our software simulates the existing vehicle routing problems, as well as taking inventory constraints on the containers in each staging area into account. This is a critical part of the solution because without the simulator, proposed solutions would not be as rigorously tested. For example, the simulator ensures that a solution satisfies inventory constraints through out the day, not just the ending inventories which is what several other teams modelled. A working simulator was integral in allowing the others teams to test their proposed solutions and data to measure their efficiency and accuracy; without the simulator, this would have had to be done by hand. 7
  • 8. Our plan was to create a working software package as early as possible for every group to begin using. A working version of the simulator was provided to the class on 30 October 2014, giving the class more than a month to work with the software to test their solutions. Along with this, we also presented our tutorial.

We also used the simulator to gain a deeper understanding of the problem. We tested different assumptions made in the class, like constant drive times, and tested solution robustness to changes in requests. Finally, we created an iterative solution method that attempts to solve the entire problem.

2 MATLAB Model

There are many parts to the model that we have created, as detailed below.

2.1 MATLAB Object Model

2.1.1 City Representation

Cities are objects that store multiple pieces of information. The city stores all of the data about drive times, stops, customer requests, landfills, staging areas, trucks, etc. It is an entire set of data that makes up the problem statement.

Actions

A city contains actions, also called stops. There are multiple actions for any city. An action is simply something a driver can do. An action has several main parts:
  • 9. The Operation
This can be P (pickup), D (dropoff), R (replace), S (stage), U (unstage), or E (empty).

The In-size
This is the size of the dumpster that is being brought to the action or stop, given as one of the numerical values 0, 6, 9, 12, or 16. The 0 represents no dumpster and the others represent the four sizes of containers we are dealing with.

The Out-size
This is the same thing as in-size, but it is the size of dumpster the driver is supposed to leave with.

The Start-time
This is the time, in seconds, when this action can start being performed; for example, if the start-time is 10,000, this means that 10,000 seconds must elapse during the simulation before this action can be performed.

The Stop-time
This is the time, in seconds, by which an action must be performed; together with start-time, these model time window constraints.

The Wait-time
This is the amount of time it takes to actually complete the action (note that this is different from the travel time it takes to get to the location, which is modeled in a different place).

The Location
This is where the action is actually located; it is an index into the array of locations that we have, telling us which location has this
  • 10. particular action. It is also an index into the matrix of distances and durations between locations.

Example

Different locations have different kinds of actions. For example, staging areas have 8 actions each, as shown in this table:

STAGING AREA   Operation   In-size   Out-size
ACTION 1       STAGE       6         0
ACTION 2       STAGE       9         0
ACTION 3       STAGE       12        0
ACTION 4       STAGE       16        0
ACTION 5       UNSTAGE     0         6
ACTION 6       UNSTAGE     0         9
ACTION 7       UNSTAGE     0         12
ACTION 8       UNSTAGE     0         16

If you want to unstage (pick up) a size 12 dumpster, you must start with an empty truck and leave with a size 12 dumpster. Thus, you would use the seventh action, which means your in-size is 0 and your out-size is 12. Multiple actions are needed because giving only a location is ambiguous when a truck travels between staging areas.

Similarly, landfills have four actions each, one for each kind of dumpster you will be bringing there to empty. Customer requests only have one action each, as they will always have a predetermined in-size and out-size.

Now that we have covered cities, let us look at solutions.
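The staging-area table above can be generated mechanically. The following is a minimal sketch in Python (not the team's MATLAB code); the `Action` class, its field names, and `staging_area_actions` are hypothetical names chosen for illustration, and only the operation, in-size, out-size, and location fields described above are modeled.

```python
from dataclasses import dataclass

SIZES = (6, 9, 12, 16)  # the four container sizes; 0 stands for "no dumpster"


@dataclass(frozen=True)
class Action:
    operation: str  # 'S' (stage) or 'U' (unstage) for a staging area
    in_size: int    # size on the truck when it arrives
    out_size: int   # size on the truck when it leaves
    location: int   # index into the city's list of locations


def staging_area_actions(location):
    """Return the eight actions of one staging area, in the table's order."""
    stage = [Action('S', size, 0, location) for size in SIZES]
    unstage = [Action('U', 0, size, location) for size in SIZES]
    return stage + unstage


actions = staging_area_actions(location=3)  # location 3 is an arbitrary example
seventh = actions[6]                        # "ACTION 7" in the table
print(seventh.operation, seventh.in_size, seventh.out_size)
```

The seventh entry is the unstage action for a size 12 dumpster, matching the example in the text.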
  • 11. 2.1.2 Solution Representation

The user provides a solution represented by a matrix. Each row represents a driver’s route, and the entries of that row, read left to right, are the actions that driver will perform in order. A negative one means that the driver does nothing: he is at the end of his route. The following is a solution matrix.

     2    4   19   22   −1   −1   −1
    10   17   19   44   11   13    5
     6   19   −1   −1   −1   −1   −1

Each row is a driver; thus, this city has three drivers. Driver one performs action 2, then 4, then 19, then 22, and then he is done for the day. Similarly, driver three performs actions 6 and 19, in that order, then finishes his day. Driver two is the busiest, but his actions are just as easily read.

As you can see, it was necessary to encode what kinds of dumpsters the driver was dropping off, picking up, etc. with the in-size and out-size, so that this matrix representation would be workable. This encoding of the solution into a matrix is why there are so many actions associated with staging areas and landfills. As an example, imagine that stop 19 is a staging area. Because of how the actions are structured, given any city, we will know exactly what kind of action the driver is performing at that staging area. Actions 18-25 may all be at that same staging area, but they all mean different things.
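Reading routes out of such a matrix amounts to dropping the −1 padding from each row. A minimal Python sketch follows (not the team's MATLAB code); `routes_from_matrix` and `PAD` are hypothetical names, and the action numbers are kept exactly as they appear in the matrix above.

```python
PAD = -1  # marks "no further action" in a driver's row

solution = [
    [2, 4, 19, 22, -1, -1, -1],
    [10, 17, 19, 44, 11, 13, 5],
    [6, 19, -1, -1, -1, -1, -1],
]


def routes_from_matrix(matrix):
    """Return one list of action indices per driver, without the padding."""
    routes = []
    for row in matrix:
        route = []
        for action in row:
            if action == PAD:
                break  # everything after the first -1 is padding
            route.append(action)
        routes.append(route)
    return routes


routes = routes_from_matrix(solution)
for driver, route in enumerate(routes, start=1):
    print(f"driver {driver}: {route}")
```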
  • 12. 2.2 Route Simulation

The function to simulate a given city has the following signature:

function [feasible, times, distances, number_serviced, fees, inventories] = simulate(c, sol, v, checkall)

Parameters

c
The city object

sol
The solution matrix

v
(Optional) True if errors should be printed

checkall
(Optional) True if all constraints should be checked, even when one is violated

Return Values

The simulate function takes in a city and a solution matrix as arguments, and outputs these metrics:

feasible
False if the solution encountered an error that makes it a non-viable solution; true otherwise

times
A vector containing the time it took each driver to complete his assigned route, based on the solution matrix given
  • 13. distances
A vector containing the distance traveled by each driver based on his assigned route

number_serviced
How many of your customer requests were actually completed

fees
How much in fees you accrued on this route, based on the costs of landfills

inventories
How many dumpsters remain at each staging area at the end of the day

Some solutions will contain errors that the simulator will recognize as infeasible. In that case it returns feasible = false. Many things will cause it to report infeasibility. For example, a driver visiting a landfill followed by another landfill makes no sense, so the solution is infeasible. Likewise, if a driver visits a staging area to get a size 6 dumpster, but that staging area has none in inventory, the solution is infeasible. For a detailed list of all the constraints, refer to the MATLAB/src directory, or to the model in the next section.

Unless instructed otherwise, the simulator will continue to run to the best of its ability when dealing with a non-viable solution. The errors it finds will be displayed in the main window of MATLAB.

Our suite of functions also comes with a generate_rand_solution function that takes a city as an argument. Beware, however, that this random solution is rarely viable.
  • 14. 2.3 City Generation

function [c] = generate_city(R, L, Y, D)

The function generate_city(R, L, Y, D) generates a city at random, based on these arguments:

Parameters

R
The number of customer requests to be generated

L
The number of landfills to be generated

Y
The number of staging areas, or yards, to be generated

D
The number of drivers that work in that city

Return Value

c
A randomly generated city

This function creates a random set of customer requests, landfills, staging areas, trucks, etc. for us to run a simulation on. Everything is random, so that you can create a diverse set of cities to run simulations through.

2.4 City Translation

function [c] = translate(dirname)

This function takes the output from the user interface team (who are working with Excel) and translates it into our city structure in MATLAB. It also finds the coordinates of the addresses, to plot the city.
  • 15. For example, sample data given by the UI team on Canvas is also in our repository under /MATLAB/test/example_ui_data. If you want a city based on this data, you can simply enter the MATLAB command

c = translate('test/example_ui_data');

This city will represent the data found in this directory. Translate expects to see five files in the directory as given: output1.txt, output2.txt, output3.txt, output4.txt, and output5.txt. Because this uses a Google API to find a location from the address the UI team gives, you must be connected to the internet to run this.

3 Model

The MATLAB code can also be expressed mathematically. We will describe the mathematical model that this simulator implements. This includes detailing some of the assumptions that went into this model along with listing the parameters and variables for the model. Finally, we will list the constraints and objectives.

3.1 Model Assumptions

Several assumptions were made to simplify the model. For example, our model only takes trucks into account without worrying about drivers. Drivers could be incorporated into the model by fixing a number of drivers and only letting this many trucks be anywhere but the start location at any given time.

Another assumption that went into our model is that when a customer requests a certain size dumpster, the only feasible solutions are to give that customer exactly the dumpster that was requested. This
  • 16. is more strict than the problem statement, in which a larger dumpster can be provided.

We provide arbitrary time windows that are more general than needed, because we only needed to have AM, PM, and OPEN time windows. However, we chose to include them in the model because Sam’s Hauling mentioned that some customers do request pickups or deliveries in specific time windows.

Also, drive times and wait times are assumed to be constant throughout the day. This may or may not be a reasonable assumption.

We were not sure if we should include actions to be performed at staging areas that allowed drivers to drop off a container and pick up a container in the same action. We decided against this because it greatly increases the number of actions. Instead, solutions are required to first drop off a dumpster and then pick up the next one.

3.2 Model Decisions

3.2.1 Solutions Use Indices

At first, we wanted to test constraints by having the solution be a matrix that includes all of the request information like the location, operation, and time windows. After discussions with the group, however, we figured out another way to represent all of the solution information by only using indices into the City object. This removes duplicate data between the city and solution, so that it is harder to give a solution matrix that is inconsistent with the city.
  • 17. 3.3 City Streets

Originally, there was also an idea of coding every city street into the city objects, so that there were multiple ways to reach every destination. This would simulate a real city’s data. After analyzing this method we determined that each driver would simply choose the shortest path if given the option, so we only store the shortest path between every pair of locations. This ended up simplifying the problem, and gave the simulator better performance.

3.3.1 Modeling Language

We chose to create the simulator in MATLAB because this would make it easiest for other teams to use. Although it may not have been the most efficient language, we could count on other teams being able to use it. We had several concerns about using MATLAB, like whether it supports objects. In the end, it turned out to be fairly simple to use.

3.3.2 Simulator output

We tried to return output that could be used in several different ways. For example, we returned the time of each driver, rather than summing them or taking the maximum over all drivers. This means that other teams could combine this information as they wished.

For example, consider the following two solutions. Labels like “0 -> 6” indicate that the request at that location is a pickup of a size 6 dumpster. Different colors represent different drivers.
  • 18. Our solver generated these two solutions s1 and s2 with the following drive times:

times for s1: 26259 26927 26691 25757 26812 26035 26599 26859 26848
times for s2: 0 0 0 63056 0 0 24699 0 125832 41048

The sum of times for the second is 254635, which is less than the sum for the first, 265210. However, the maximum time is 26927 for the first while it is 125832 for the second. It is not clear how to combine these different objectives, and we did not want to decide what other teams should optimize.

3.4 Model Variables

First, we will define the variables used in the model. While giving the symbols and their dimensions, we will try to use the following convention for indices. For convenience, unless otherwise noted, i will range from 1 to n, j will range from 1 to m, k will range from 1 to Y, d will range from 1 to D, t will range from 1 to |T|, and l will be
  • 19. another index. All times are measured in seconds.
  • 20. 3.4.1 Parameters

Variable   Range                            Description
L          L ≥ 0                            # of landfills
Y          Y ≥ 0                            # of staging areas (or yards)
R          R ≥ 0                            # of customer requests
n          n ≥ 0                            # of actions, or stops
m          0 ≤ m ≤ n                        # of unique locations
D          D ≥ 0                            # of trucks (or drivers)
S          For Sam’s Hauling, |S| = 5       Set of dumpster sizes. For Sam’s Hauling, we will let
                                            S = { ‘6’, ‘9’, ‘12’, ‘16’, ‘No Dumpster’ }
T          For Sam’s Hauling, |T| = 3       Set of possible truck types. For Sam’s Hauling, we will let
                                            T = { ‘small’, ‘medium’, ‘large’ }
  • 21. Variable   Range      Description
O          |O| = 6    Set of operations:
                      O = { ‘D’: deliver a dumpster
                            ‘P’: pick up a dumpster
                            ‘R’: replace a dumpster with a different one
                            ‘E’: throw away a dumpster at a landfill
                            ‘S’: stage a dumpster
                            ‘U’: unstage a dumpster }

3.4.2 The City

Variable                                    Range                                              Description
$I$                                         $1 \le I \le m$, $I \in \{s_k\}_{k=1}^{Y}$         The starting index of all trucks

For each stop
$(T_i^{\text{begin}}, T_i^{\text{end}})$    $1 \le i \le n$, $0 \le T_i^{\text{begin}} < T_i^{\text{end}}$   Time window when stop $i$ is possible
$W_i$                                       $1 \le i \le n$                                    The wait time required to visit stop $i$
$o_i$                                       $1 \le i \le n$                                    Operation to be performed at stop $i$
$(S_i^{\text{in}}, S_i^{\text{out}})$       $1 \le i \le n$                                    The in/out dumpster sizes of each action. If $o_i = $ ‘S’, then $S_i^{\text{out}} = $ ‘No Dumpster’
$l_i$                                       $1 \le i \le n$, $1 \le l_i \le m$                 The location associated with each stop
$c_{i,t}$                                   $1 \le i \le n$, $1 \le t \le |T|$, $c_{i,t} \in \{0, 1\}$   Constraints on truck size: 1 if action $i$ is accessible by truck type $t$. For example, we would like to set $c_{i,t} = 0$ when $o_i = $ ‘R’ and $S_i^{\text{out}} = 16$
  • 22. Variable    Range                                                          Description

For each location
$t_{j,l}$   $1 \le j, l \le m$, $t_{j,l} \ge 0$                            Time to get from location $j$ to $l$
$d_{j,l}$   $1 \le j, l \le m$, $d_{j,l} \ge 0$                            Distance from location $j$ to $l$

For each truck
$t_d$       $1 \le d \le D$, $t_d \in T$                                   Truck type of truck $d$

For each staging area
$I_{k,s}$   $1 \le k \le Y$, $s \in S \setminus \{\text{‘No Dumpster’}\}$   Initial # of dumpsters at the beginning of the day
$C_k$       $1 \le k \le Y$, $C_k \ge 0$                                   Max capacity of staging area $k$. Obviously, $\sum_{s \in S \setminus \{\text{‘No Dumpster’}\}} I_{k,s} \le C_k$
$s_k$       $1 \le k \le Y$, $1 \le s_k \le m$                             Location of staging area $k$

For each landfill
$F_l$       $1 \le l \le L$                                                The fee associated with landfill $l$
$e_l$       $1 \le e_l \le m$, $1 \le l \le L$                             The location of landfill $l$

3.5 Other thoughts

In MATLAB, the parameters L, Y, R, D are required to generate a random city in addition to several distribution parameters. A ‘city’
  • 23. will be encapsulated by the rest.

There are R different actions to represent the requests, because there is only one action associated to each. Each landfill has a different action for each dumpster size: one action where the in and out size is that dumpster size. Each staging area has actions allowing a truck to drop off each size of dumpster: these actions have ‘No Dumpster’ for the out size. Finally, each staging area has one action for picking up each dumpster size, with ‘No Dumpster’ for the in size. That is, the total number of actions is $n = R + (L + 2Y)(|S| - 1)$.

3.6 Decision Variables

The solution to be simulated is given by the user. It is represented by a matrix with D rows and an arbitrary number of columns. Each row $x_d$, $1 \le d \le D$, will be a permutation vector of the stops to be performed by driver $d$ (followed by −1’s). Let us name their lengths $r_d := \text{length}(x_d) \le n$. We interpret the $l$-th element of $x_d$ (which is denoted $x_{d,l}$) as the $l$-th stop to be performed by truck $d$. For example, if $o_{x_{d,l}} = $ ‘S’, then the $l$-th stop by driver $d$ is a staging operation at a storage yard.

3.7 Objectives

Let us make some of the following equations simpler with these definitions. Let $t(x, y) = t_{x,y}$, so we have fewer subscripts. For convenience of later sections, let us define a function $a(d, k)$ that represents the accrued time that truck $d$ takes to complete its $k$-th stop. (We have
  • 24. assumed that $1 \le d \le D$, $1 \le k \le r_d$.) This is given by:

$$a(d, k) := t(I, l_{x_{d,1}}) + \sum_{j=1}^{k} W_{x_{d,j}} + \sum_{j=1}^{k-1} t(l_{x_{d,j}}, l_{x_{d,j+1}}).$$

That is, the time from the start location to the first stop, plus the time spent at each stop, plus the times to travel between the stops.

We will simulate this as a multi-objective problem. We include the number of requests serviced, the time taken to do so, the total distance covered, and the amount of fees accrued. We could use a lexicographic ordering to order these (if we have a maximum time).

3.7.1 Number of Requests Serviced

The total number of requests serviced is

$$N = \sum_{d=1}^{D} \; \sum_{\substack{1 \le j \le r_d \\ o_{x_{d,j}} \in \{\text{‘D’}, \text{‘P’}, \text{‘R’}\}}} 1.$$

3.7.2 Time Taken to Service Requests

One way to model the total time could be

$$T_{\text{total}} = \max_{1 \le d \le D} a(d, r_d),$$

which is the time of the longest route and would represent the amount of time before all routes were completed. Another is to represent the total number of man-hours spent, which would instead be a sum of all times:

$$T_{\text{total}} = \sum_{d=1}^{D} a(d, r_d).$$

It would be better to simply calculate the difference between the arrival time at each stop and its time window, and try to minimize the errors (or customer wait
  • 25. times/early inconveniences). What the simulator does is return an entire vector of time-costs, one per driver.

3.7.3 Distance Covered

Writing $d(x, y) = d_{x,y}$ in the same way as $t(x, y)$, the total distance driven by all drivers is given by

$$\sum_{d=1}^{D} \left[ d(I, l_{x_{d,1}}) + \sum_{j=1}^{r_d - 1} d(l_{x_{d,j}}, l_{x_{d,j+1}}) \right].$$

3.7.4 Landfill Fees

The fees accrued by all drivers are given by

$$\sum_{d=1}^{D} \sum_{j=1}^{r_d} \begin{cases} F_l, \text{ where } e_l = l_{x_{d,j}} & \text{if } o_{x_{d,j}} = \text{‘E’} \\ 0 & \text{otherwise.} \end{cases}$$

3.7.5 Remaining Inventories

The inventories remaining at the end of the day can be used to ensure that dumpsters are equally (or otherwise) spread out among staging areas. This could be used to ensure that dumpsters at staging areas are accessible for the next day. Also, it is mentioned in the problem description that some staging areas might not be allowed to have dumpsters overnight. We would need another parameter in the model to dictate which staging areas these are, and that would likely turn this objective into a constraint.
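The objectives of Section 3.7 can be sketched in a few lines. The following Python sketch (not the team's MATLAB code) evaluates the accrued time, the two candidate total times, the number of requests serviced, and the landfill fees for a tiny made-up city. All data, the dictionary layout, and the function names are hypothetical, and waiting for a time window to open is not modeled.

```python
def accrued_time(route, start, loc, wait, travel):
    """a(d, r_d): start -> first stop, plus every wait, plus travel between stops."""
    if not route:
        return 0
    total = travel[start][loc[route[0]]]
    for j, stop in enumerate(route):
        total += wait[stop]
        if j + 1 < len(route):
            total += travel[loc[stop]][loc[route[j + 1]]]
    return total


def requests_serviced(routes, op):
    """N: count stops whose operation is a customer request (D, P, or R)."""
    return sum(1 for route in routes for stop in route if op[stop] in ('D', 'P', 'R'))


def landfill_fees(routes, op, loc, fee_at):
    """Sum the fee of the landfill visited at every 'E' stop."""
    return sum(fee_at[loc[stop]] for route in routes for stop in route if op[stop] == 'E')


# Hypothetical city: location 0 is the yard, 1 a landfill, 2 a customer.
loc = {1: 0, 2: 2, 3: 2, 4: 1}
op = {1: 'U', 2: 'D', 3: 'P', 4: 'E'}
wait = {1: 2, 2: 1, 3: 1, 4: 5}
travel = [[0, 4, 3], [4, 0, 2], [3, 2, 0]]
fee_at = {1: 30}
routes = [[1, 2], [3, 4]]

times = [accrued_time(r, 0, loc, wait, travel) for r in routes]
print("per-driver times:", times)
print("longest route:", max(times), "man-hours (sum):", sum(times))
print("requests serviced:", requests_serviced(routes, op))
print("fees:", landfill_fees(routes, op, loc, fee_at))
```

Returning the per-driver vector and letting the caller choose between `max` and `sum` mirrors the design choice described in Section 3.3.2.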
  • 26. 3.8 Constraints

3.8.1 Driver Routes must not overlap

We don’t want to visit the same request twice, so for all $1 \le d, d' \le D$ and all $1 \le j \le r_d$, $1 \le j' \le r_{d'}$, at least one of the following 4 statements must be true:

1: $d = d'$ and $j = j'$
2: $o_{x_{d,j}} \in \{\text{E}, \text{S}, \text{U}\}$
3: $o_{x_{d',j'}} \in \{\text{E}, \text{S}, \text{U}\}$
4: $x_{d,j} \ne x_{d',j'}$

This means that requests cannot be serviced by multiple drivers or twice by the same driver.

3.8.2 Time Windows

For each $1 \le i \le D$ and each $1 \le k \le r_i$ we need

$$T^{\text{begin}}_{x_{i,k}} \le a(i, k) \le T^{\text{end}}_{x_{i,k}}.$$

If we include the $T_{\max}$ variable, we will need to ensure that $a(d, r_d) \le T_{\max}$ for all $1 \le d \le D$.

3.8.3 Sizes Match

For all $1 \le d \le D$, and for all $1 \le j \le r_d - 1$, we have

$$S^{\text{out}}_{x_{d,j}} = S^{\text{in}}_{x_{d,j+1}}.$$

For all $1 \le d \le D$, we have $S^{\text{in}}_{x_{d,1}} = $ ‘No Dumpster’ as an initial constraint.
  • 27. These mean that if a truck leaves a stop with a size 9 dumpster, then it arrives at the next location with a size 9 dumpster. We also assume that all trucks start out with no dumpsters. In the problem description, there is a statement that dumpsters larger than the one being requested can also be used. We have not incorporated this into our model.

3.8.4 Operations Follow Each Other

A driver cannot service two pickup requests in a row without visiting a landfill. This means there is a constraint on which actions can follow each other. We can tell if an action is allowed to follow another based on the actions’ operations.

For all $1 \le d \le D$, and for all $1 \le j \le r_d - 1$, we need that $\text{follows}(o_{x_{d,j+1}}, o_{x_{d,j}})$ is true, where the follows predicate has the following truth table. Read this table as: the operation in row $i$ can follow the operation in column $j$ if follows($i$, $j$) = T. (In this table, ‘T’ as a row or column label denotes the landfill operation, written ‘E’ in Section 3.4.1.)

follows   ‘D’   ‘P’   ‘R’   ‘T’   ‘S’   ‘U’
‘D’       F     F     F     T     F     T
‘P’       T     F     F     F     T     F
‘R’       F     F     F     T     F     T
‘T’       F     T     T     F     F     F
‘S’       F     F     F     T     F     T
‘U’       T     F     F     F     T     F

Another way of modeling this constraint would be to have a dumpster state of full or empty (which is not the same as no dumpster) for each stop/action. Then we would just need a ‘truck state matches’ constraint, which would be as simple as the ‘sizes match’ constraint.
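The truth table above translates directly into a lookup. The Python sketch below (not the team's MATLAB code) assumes the table is read as "row may come immediately after column", which is the reading under which a delivery may follow a landfill visit but not the reverse; `CAN_FOLLOW` and `operations_follow` are hypothetical names, and ‘T’ is kept as the table's label for the landfill operation.

```python
# CAN_FOLLOW[later] is the set of operations allowed immediately before `later`.
# Each entry lists the columns marked T in that operation's row of the table.
CAN_FOLLOW = {
    'D': {'T', 'U'},
    'P': {'D', 'S'},
    'R': {'T', 'U'},
    'T': {'P', 'R'},  # 'T' is the landfill operation in this table
    'S': {'T', 'U'},
    'U': {'D', 'S'},
}


def operations_follow(ops):
    """True if every consecutive pair of operations in one route is allowed."""
    return all(prev in CAN_FOLLOW[nxt] for prev, nxt in zip(ops, ops[1:]))


print(operations_follow(['U', 'D', 'P', 'T', 'S']))  # unstage, deliver, pick up, landfill, stage
print(operations_follow(['P', 'P']))                 # two pickups in a row
print(operations_follow(['T', 'T']))                 # landfill followed by landfill
```

The second and third calls correspond to the two infeasible patterns mentioned in the text: consecutive pickups without a landfill visit, and one landfill visit directly after another.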
  • 28. 3.8.5 Constraints on Truck Types

We need a constraint that says a truck cannot service a request that requires a different type of truck. For each $1 \le d \le D$ and each $1 \le j \le r_d$ we need $c_{x_{d,j}, t_d} = 1$.

3.8.6 Staging Area Capacities Met

Let $b(d, t)$ ($b$ for bound) be the largest index $k$, $1 \le k \le r_d$, such that $a(d, k) < t$. In other words, this is the index into the $d$-th truck’s route that gives its last stop before time $t$ ($t \in \mathbb{R}$, $0 \le t < \infty$).

Then, we want that for each $1 \le y \le Y$, each $s \in S \setminus \{\text{‘No Dumpster’}\}$, and each $t \in \mathbb{R}$, $0 \le t < \infty$,

$$0 \le I_{y,s} + \sum_{d=1}^{D} \sum_{\substack{1 \le j \le b(d,t) \\ o_{x_{d,j}} = \text{‘U’} \\ S^{\text{out}}_{x_{d,j}} = s \\ s_y = l_{x_{d,j}}}} (-1) + \sum_{d=1}^{D} \sum_{\substack{1 \le j \le b(d,t) \\ o_{x_{d,j}} = \text{‘S’} \\ S^{\text{in}}_{x_{d,j}} = s \\ s_y = l_{x_{d,j}}}} (1) \le C_y.$$

This assumes that each dumpster takes up the same amount of space in the staging area. If not, we could use a weighted sum, where we replace the ±1 with a weight based on $s$ (the coefficients would need to be another parameter to the model).

3.8.7 Trucks end where they Start

There is a constraint that says each truck must stop at the staging area it starts at. That is, for all $1 \le d \le D$, we have $l_{x_{d,r_d}} = I$.
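The staging-area inequality of Section 3.8.6 has to hold at every time, not only at the end of the day. One way to sketch that check is to replay stage and unstage events in time order, as in the Python sketch below (not the team's MATLAB code). The event tuples, the `inventories_ok` name, and the sample numbers are hypothetical; the completion time of each event is assumed to be known already, and events with identical times are ordered arbitrarily by Python's tuple sort.

```python
def inventories_ok(initial, capacity, events):
    """Replay events in time order and check 0 <= inventory <= capacity after each.

    initial:  dict mapping (yard, size) -> starting count
    capacity: dict mapping yard -> C_y
    events:   iterable of (time, yard, size, op), op being 'S' (+1) or 'U' (-1)
    """
    inventory = dict(initial)
    for _time, yard, size, op in sorted(events):
        key = (yard, size)
        inventory[key] = inventory.get(key, 0) + (1 if op == 'S' else -1)
        if not 0 <= inventory[key] <= capacity[yard]:
            return False
    return True


initial = {(1, 6): 1}  # yard 1 starts with one size 6 dumpster
capacity = {1: 2}

# Two unstages before two stages: the end-of-day count is fine (1),
# but the yard is empty when the second unstage happens.
events = [(10, 1, 6, 'U'), (20, 1, 6, 'U'), (30, 1, 6, 'S'), (40, 1, 6, 'S')]
print(inventories_ok(initial, capacity, events))

# The same yard with a stage first and an unstage later stays within bounds.
print(inventories_ok(initial, capacity, [(30, 1, 6, 'U'), (5, 1, 6, 'S')]))
```

The first example is the situation that a check of ending inventories alone would miss.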
  • 29. 3.9 Example

This is an example city with L = 1, Y = 1, R = 2, D = 2, S = {9, 12, 16, No Dumpster}, T = {small, large}, n = 11, m = 4.

3.9.1 City

The set of possible stops in the city could be given by the following table (‘T’ in the operation column denotes the landfill operation, written ‘E’ in Section 3.4.1):

index   window begin   window end   wait time   operation
1       0              100          5           ‘T’
2       0              100          5           ‘T’
3       0              100          5           ‘T’
4       0              100          2           ‘U’
5       0              100          2           ‘U’
6       0              100          2           ‘U’
7       0              100          2           ‘S’
8       0              100          2           ‘S’
9       0              100          2           ‘S’
10      0              50           1           ‘R’
11      0              100          1           ‘D’
  • 30. index   in size       out size      location
1       9             9             1
2       12            12            1
3       16            16            1
4       No Dumpster   9             2
5       No Dumpster   12            2
6       No Dumpster   16            2
7       9             No Dumpster   2
8       12            No Dumpster   2
9       16            No Dumpster   2
10      9             16            3
11      16            No Dumpster   4

The following table might describe truck types:

index   starting location   type
1       2                   small
2       2                   large

Further suppose that the distances are given by:

    0    1    3    2
    1    0    2    4
    3    .5   0    1
    2    1    3    0

The truck constraints might be given by:

index   small   large
1       1       1
2       1       1
3       0       1
4       1       1

Finally, there is only one staging area:
  • 31. index   capacity   location
0       10         1

This single staging area will be the initial location: I = 6.

3.9.2 Solution

The following two are possible solutions, although the second does not end where it starts.

1. $x_1 = ()$, $x_2 = ()$ (In this case, $r_1 = r_2 = 0$).
2. $x_1 = ()$, $x_2 = (6, 11, 4, 7)$ (In this case, $r_1 = 0$, $r_2 = 4$).

However, this solution is not feasible, because the sizes don’t match in the last stop of the second truck: $x_1 = ()$, $x_2 = (6, 11, 4, 8)$ (In this case, $r_1 = 0$, $r_2 = 4$).

3.9.3 Objective Values

The time for the first is 0, which is also the number of requests serviced. The number of requests serviced in the second is 1, and the time for the second truck $a(2, 4)$ is:

$$\begin{aligned}
& d_{l_6,l_6} + W_6 + d_{l_6,l_{11}} + W_{11} + d_{l_{11},l_4} + W_4 + d_{l_4,l_7} + W_7 \\
&= d_{2,2} + W_6 + d_{2,4} + W_{11} + d_{4,2} + W_4 + d_{2,2} + W_7 \\
&= 0 + 2 + 4 + 1 + 1 + 2 + 0 + 2 = 12.
\end{aligned}$$

Thus, $T_{\text{total}} = 12$ as well.
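The arithmetic above can be checked mechanically. The Python sketch below (not the team's MATLAB code) copies the wait times, sizes, locations, and distance matrix from the tables of Section 3.9.1 for the actions that appear in the example routes. It assumes, following the truck table, that the truck starts at location 2, and it uses the distance matrix as travel time, as the worked computation does. The function names are hypothetical.

```python
# Distance matrix from Section 3.9.1 (locations are 1-based in the text).
DIST = [
    [0, 1, 3, 2],
    [1, 0, 2, 4],
    [3, 0.5, 0, 1],
    [2, 1, 3, 0],
]
WAIT = {4: 2, 6: 2, 7: 2, 8: 2, 11: 1}
LOC = {4: 2, 6: 2, 7: 2, 8: 2, 11: 4}
IN_SIZE = {4: None, 6: None, 7: 9, 8: 12, 11: 16}   # None = 'No Dumpster'
OUT_SIZE = {4: 9, 6: 16, 7: None, 8: None, 11: None}
START = 2  # assumed start location, from the truck table


def dist(a, b):
    return DIST[a - 1][b - 1]  # convert 1-based locations to list indices


def accrued_time(route):
    """Start -> first stop, plus each wait, plus travel between consecutive stops."""
    if not route:
        return 0
    total = dist(START, LOC[route[0]])
    for j, stop in enumerate(route):
        total += WAIT[stop]
        if j + 1 < len(route):
            total += dist(LOC[stop], LOC[route[j + 1]])
    return total


def sizes_match(route):
    """Truck starts empty; each stop's out-size must equal the next stop's in-size."""
    carried = None
    for stop in route:
        if IN_SIZE[stop] != carried:
            return False
        carried = OUT_SIZE[stop]
    return True


print(accrued_time((6, 11, 4, 7)))  # prints 12
print(sizes_match((6, 11, 4, 7)), sizes_match((6, 11, 4, 8)))
```

The route ending in action 8 fails the sizes-match check because action 4 leaves with a size 9 dumpster while action 8 expects a size 12.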
  • 32. 4 Simulator

4.1 Performance

Currently, the two most expensive functions of the simulator are the check that inventory bounds are satisfied and a helper function that computes the time at which each request is performed. Which of these is more time consuming depends on how close to valid the solution is. MATLAB provided a summary of how much time was spent in each method. We found that when the solution is not close to valid and the simulator does not need to check all constraints, the most expensive method was the one used to find the time when each request was serviced. For example, out of several runs, this took 2.5 seconds out of 7 seconds total. When all constraints had to be checked, we found that checking the inventories was the most expensive. This method took 6 seconds out of 16 seconds. If we were to optimize our simulator further, these would likely be where we could find the most speedup.

4.2 Failed Attempts and Lessons Learned

4.2.1 Expressing Waiting

We initially created the simulator without the ability to have a driver wait without performing any action. This caused a problem because drivers were not able to wait for a time window to begin, and meant that some cities had no solutions that we could represent with our model. We fixed this by automatically forcing drivers to wait for the beginning of the time window of each request. For example, suppose a
  • 33. driver arrives at 9:00 am at a request that has a time window from noon to 6:00 pm. The simulator automatically forces the driver to wait until noon before performing the request. This was mainly a problem for small cities.

4.2.2 Matrix Dimension

Some drivers did not have the same number of actions assigned to them, even if they took the same amount of time or longer than a driver with more actions. We learned that fewer stops did not necessarily mean less productivity or efficiency.

When first creating a solution matrix, our assumption was that the matrix would always be of size D × N, where D was the number of drivers in the city and N was the number of actions possible in the city. This turned out to be false because drivers could repeat actions, meaning that the matrix dimensions could not be determined ahead of time.

4.2.3 Distribution of City Parameters

Our simulator tries to replicate real life cities and routes in which one or more drivers drop off and pick up dumpsters; therefore, we have tried to make this as close to life as possible, while keeping it simple enough to use. It is our belief then that this corresponds directly to any real world problem that Sam’s Hauling might encounter, and any variance from the real world outcome is an error in the software but not an error in the design or theory of our project.

Originally, the simulator did not generate realistic cities. For example, the distribution of wait times did not match example data from Sam’s Hauling. With sample data, we modified the generate_city function so that the cities better represented real data.
  • 34. One way of doing this is to randomly select locations from the sample data, and construct cities from this. One disadvantage of this is that the distances between all stops must be computed.

An alternative way is to first simply come up with distributions of several important characteristics of the sample data, like the distribution of drive times. Then, more drive times can be generated that follow the same mean, variance, or possibly skew of these distributions.

This is the way we chose, so we found the distribution of dumpster sizes, operations, and time windows. These are contained in the appendix. Other important statistics would have been the number of truck type constraints and drive times.

4.3 Simulator Correctness

While writing the simulator we tested several aspects to make sure that the code was correct. We also included around 20 tests that ran predetermined cities through the simulator to ensure the simulator was accurate as the code changed.

5 Robustness

There are more subtle ways that the simulator may be able to capture a solution’s quality. Several of these measures revolve around the idea of robustness.
  • 35. 5.1 Changing Request Locations

5.1.1 Goal

Because customers call in and change their requests throughout the day, we were interested in figuring out if some solutions handle changes better than others. In order to answer this, we first created a way to compare how robust two solutions are with respect to request changes. Then we created several random cities and used this comparison to see if there were drastic differences in robustness between two good solutions. If there were big differences in the robustness of several good solutions, then we would conclude that a good algorithm must not only produce an efficient solution, but that solution must also be robust.

5.1.2 The Process

For our considerations, we defined the difference in robustness of two solutions s1 and s2 according to the following process.

1. Pick a random time t between 0 and the minimum time it takes either solution to finish.

2. Create a new city from the original city by removing all the requests that have already been serviced by s1 before time t. Also, update the starting locations for each of these drivers, so that each driver starts where it would have been at time t in s1 for the original city. Create the corresponding sub-city for s2 at time t.

3. Randomly change the location of 1/4 (an arbitrary fraction) of
  • 36. the remaining requests for each of these two sub-cities (with a uniform distribution).

4. Find optimal solutions for the sub-cities, and compare the times for these optimal solutions.

In the previous process, steps two and three are called creating a sub-city problem from the original problem. Of course, this process requires that a city can have multiple start locations that do not have to be at staging areas. Also, trucks can now start with a full dumpster.

It is worth noting that the remaining requests will likely be different for these two cities (unless s1 = s2), so different requests are changed in each sub-city. This observation also brings out that the faster solution will likely have fewer remaining requests. This gives it an advantage, although in general we noticed several cases where the slower solution did provide faster sub-city solutions. An alternative could be to simply select a subset of requests from the original city to be changed, and then not change the requests in the sub-cities that are already serviced.

5.1.3 Visual Robustness

We can see how this process works with some pictures. It is easier to see differences in the pictures with fewer requests, so we choose a city with 28 requests and 3 drivers.

This is the original city, where start_X is the start location for driver X and stop_X is his end location.
  • 37. Two alternate solutions are given here. They are both local minima with respect to a very primitive local neighbor search described later.
  • 38. Next we found both maximum times of each solution’s drivers, and chose a time between 0 and the minimum of either of these times. After cutting the first solution at this time, we applied mutations to the first sub-city to get the city on the right from the original sub-city on the left.

Finally, we have the solutions to the mutated sub-cities.
  • 39. The goal is to create these last two images several times, and see if the sub-city solutions for one solution are usually better than the sub-city solutions for the other.

5.1.4 Results and Interpretation

Initially we ran this test for several different initial cities, generating the two alternative solutions to be compared each time. However, we think it is more natural to take a particular city and construct several different sub-cities for the same city, only generating the two alternatives once.

The test we performed compared the means of the two time distributions of sub-city solutions. We performed Welch’s t-test to compare these two means. The two sub-city times were normalized by the time that it took the original solutions to solve the remaining sub-cities. After 20 sub-city solutions for an initial city with 100 requests and 5 depots, we were not able to find a difference in the means. Welch’s t-test gave a statistic of t = 0.0193 with 37.656 degrees of freedom. This gave a p-value of p = 0.9847, so we cannot conclude that
  • 40. either of the two initial solutions provided sub-cities whose solutions were faster than the other on average.

5.1.5 Other Thoughts

We also applied a Kolmogorov-Smirnov test. This test doesn’t just compare the means, but also checks if the two distributions are different in any way. This did find a significant difference, which means that there may be other differences aside from the mean that are worth considering. There may also be more differences in solutions when solutions are highly constrained by inventories (cities with small initial inventories). Another test may be to see how much of a difference the ending inventories make on the next day.

There is one problem with this test. If the driver is waiting at a landfill, with say 20 minutes left after the cutting time, he no longer has to wait those 20 minutes in the sub-city. That is, the remaining wait times at landfills are cut off when we construct the sub-city. We hope that this does not have a strong impact on the test’s results.

Of course, we are comparing random good solutions, without trying to find robust solutions. The differences might be bigger if we could actually search for robust solutions instead of any two good solutions with respect to another metric. It could be better to do this test several times and take the maximum of differences in robustness.

5.2 Flexible Dumpster Sizes

One fairly simple modification to the simulator allows flexible dumpster sizes. We expanded our model to include multiple actions per
  • 41. request. The different actions have different sizes, so that solutions are allowed to deliver a size 9 dumpster when the request was for a size 6 dumpster. The constraints that prevent a request from being serviced multiple times were then updated to check the request number instead of the action number.

    We performed another Welch's t-test to compare the mean time of solutions for cities where the dumpster size constraints have to be met with the mean time of solutions under this relaxed constraint. After solving 100 randomly generated cities, we found a difference in the means, with a p-value of p = 0.01475.

    Because these solutions are faster, if this is allowed by Sam's Hauling it would be a good idea to include it in the algorithm. We may add penalties based on the extra cost of dumping larger dumpsters. How important this is to solution times obviously depends on the distribution of requested dumpster sizes in the first place.

    5.3 Changing Times

    One of the assumptions made in the class was that drive times do not depend on the time of day. We collected drive times at different times of the day, and hoped to test whether this assumption is valid or whether simulation output changes drastically when we include dynamic drive times. At first, we hoped to include a test done in the same way as the one for changing request locations, except that rather than simulating a solution part way through, we would just see whether some solutions work better with realistic, changing drive times. However, it seems that the solutions are simply scaled to take longer according to the variance in times throughout the day, without some being scaled more than others.
  • 42. 5.4 The Solver

    The solver we used to conduct the previous tests used a simple local search. An initial solution was generated randomly in the following way. Each truck starts at the depot; then, repeatedly, the route that currently takes the least amount of time is given another request to service, until all requests are serviced. The next request is chosen with a probability that decreases with how long it would take the truck to reach it. (First the time to reach every other request is found; then the closest is chosen 1/2 of the time, the second closest 1/4 of the time, and so on.)

    Once an initial solution is found, we apply operations to it for as long as these operations improve the solution. The operation we applied was to exchange two sub-paths belonging to two different drivers. For example, we can represent a solution as a list of routes, each row being the indices of the actions each driver does. One solution might be the following:

        45 25 14 13 53 34 14 1 15 76 24 11 43 78

    This would mean that the first driver performed action 45, then 25, then 14 and so on. Now, we can select a sub-path of each driver. Suppose we selected the following bracketed sub-paths:

        45 [25 14 13 53] 34 14 1 15 76 [24 11 43] 78
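As a minimal sketch of this representation, the example can be held as a list of per-driver lists, with a sub-path named by a driver index and a half-open index range. The function name `exchange_subpaths`, the `(driver, begin, end)` tuple, and the split of the flattened example into two rows of seven actions are assumptions made here for illustration; the report does not show the solver's own code. The exchange is plain list slicing and performs no feasibility check.

```python
def exchange_subpaths(routes, a, b):
    """Swap two sub-paths between two different drivers.

    routes: list of routes, each a list of action indices.
    a, b:   (driver, begin, end) with a half-open range [begin, end).
    Returns new routes and leaves the input unchanged.
    Feasibility of the result is NOT checked here.
    """
    (da, ba, ea), (db, bb, eb) = a, b
    if da == db:
        raise ValueError("sub-paths must belong to two different drivers")
    new_routes = [list(r) for r in routes]
    new_routes[da] = routes[da][:ba] + routes[db][bb:eb] + routes[da][ea:]
    new_routes[db] = routes[db][:bb] + routes[da][ba:ea] + routes[db][eb:]
    return new_routes


# Assumed row split of the example solution: two drivers, seven actions each.
routes = [
    [45, 25, 14, 13, 53, 34, 14],
    [1, 15, 76, 24, 11, 43, 78],
]

# Bracketed sub-paths: [25 14 13 53] for driver 0 and [24 11 43] for driver 1.
swapped = exchange_subpaths(routes, (0, 1, 5), (1, 3, 6))
print(swapped)
```

Because the ranges are half-open, a range with `begin == end` denotes an empty sub-path, so the same function can also move a sub-path from one route into another.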
  • 43. We can then exchange the bracketed sub-paths to find the following solution:

        45 [24 11 43] 34 14 1 15 76 [25 14 13 53] 78

    Now, this solution is most likely not feasible. Initially, we solved this problem by only exchanging sub-paths whose starting and ending dumpster states of the truck matched. The dumpster state of a truck is defined as the pair consisting of the dumpster size and whether the truck is full. So any sub-path that begins with an empty truck and ends with a full truck of size 6 can be exchanged with any other such sub-path. However, we found that this was too limiting, because some of the solutions generated had sub-paths that were visibly poor.

    To expand the ways we can change a solution, we started by only considering sub-paths that began and ended with a request. Then, when we exchanged sub-paths, we overwrote all the actions that took place at staging areas and landfills immediately before and after them. So, in our previous example, suppose that actions 45, 34, 15, 76, and 78 were at landfills or staging areas. The curly-bracketed actions are the actions before and after our sub-paths that occur at landfills or staging areas. The square-bracketed actions are sub-paths that begin and end with requests. An X means we temporarily ignore the action in that location. Then we could perform the following exchange:

        select sub-paths:        {45} [25 14 13 53] {34} 14 1 {15 76} [24 11 43] {78}
        only consider requests:  {X} [25 14 13 53] {X} 14 1 {X} [24 11 43] {X}
        swap:                    {X} [24 11 43] {X} 14 1 {X} [25 14 13 53] {X}
        fill in gaps:            38 [24 11 43] 54 14 1 17 [25 14 13 53] 32
  • 44. The 38, 54, 17, and 32 are new actions at staging areas or landfills that are used to fill in the gaps between requests. They are chosen so that the time between requests is minimized and so that the solution is once again feasible.

    Sub-paths can have length 0, so that we can “cut” a sub-path from one route and “insert” it into another. There is no need to make the sub-paths being exchanged have the same length, because this should be handled by the objective function. Also, the length of a sub-path does not determine the time it takes to complete it.

    The output consists of the operations applied to the initial solution to obtain the local minimum:

        ...
        exchange [driver:5,begin=2,end=10] with [driver:6,begin= 0,end=11]
        exchange [driver:6,begin=9,end=14] with [driver:7,begin= 6,end=11]
        exchange [driver:0,begin=1,end= 2] with [driver:1,begin=15,end=17]
        ...
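The request-only exchange described above (drop the neighbouring landfill and staging-area actions, swap, then fill the gaps) can be sketched as follows. This is an illustration under stated assumptions, not the solver's code: `exchange_requests`, `is_facility`, and `fill_gap` are hypothetical names, the example is assumed to split into two rows of seven actions, and the gap filler here is a lookup table that only replays the values 38, 54, 17, and 32 from the example. A real gap filler would search for the landfill and staging-area actions that minimize time and restore feasibility.

```python
def exchange_requests(routes, a, b, is_facility, fill_gap):
    """Exchange two request-only sub-paths and rebuild the facility visits.

    a, b:        (driver, begin, end), half-open, for two different drivers.
    is_facility: action -> bool, True for landfill or staging-area actions.
    fill_gap:    (prev_action, next_action) -> list of facility actions to
                 place between them; None marks the start or end of a route.
    """
    def split(driver, begin, end):
        route = routes[driver]
        lo, hi = begin, end
        # Drop the facility actions directly before and after the sub-path.
        while lo > 0 and is_facility(route[lo - 1]):
            lo -= 1
        while hi < len(route) and is_facility(route[hi]):
            hi += 1
        return route[:lo], route[begin:end], route[hi:]

    def rebuild(prefix, sub, suffix):
        prev_action = prefix[-1] if prefix else None
        next_action = suffix[0] if suffix else None
        if not sub:  # zero-length sub-path: only one gap remains
            return prefix + fill_gap(prev_action, next_action) + suffix
        return (prefix + fill_gap(prev_action, sub[0]) + sub
                + fill_gap(sub[-1], next_action) + suffix)

    (da, ba, ea), (db, bb, eb) = a, b
    pre_a, sub_a, suf_a = split(da, ba, ea)
    pre_b, sub_b, suf_b = split(db, bb, eb)
    new_routes = [list(r) for r in routes]
    new_routes[da] = rebuild(pre_a, sub_b, suf_a)
    new_routes[db] = rebuild(pre_b, sub_a, suf_b)
    return new_routes


# Actions the example places at landfills or staging areas.
FACILITIES = {45, 34, 15, 76, 78}

# Stand-in gap filler that replays the example's fill-in actions (hypothetical).
EXAMPLE_FILL = {(None, 24): [38], (43, 14): [54], (1, 25): [17], (53, None): [32]}


def lookup_fill(prev_action, next_action):
    return list(EXAMPLE_FILL.get((prev_action, next_action), []))


routes = [
    [45, 25, 14, 13, 53, 34, 14],   # assumed row split of the example
    [1, 15, 76, 24, 11, 43, 78],
]
result = exchange_requests(
    routes, (0, 1, 5), (1, 3, 6), lambda act: act in FACILITIES, lookup_fill
)
print(result)
```

Passing the gap filler in as a function keeps the exchange itself independent of how the landfill and staging-area actions are chosen.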
  • 45. Pictorially, one operation could take the first solution to the second. In this case a sub-path of the light green route was exchanged with a sub-path of the red route. With around 70 requests, 4 depots and 8 drivers (like the one shown) this process could converge in around 300 operations.

    This process of picking a random city and applying a local search to it was then repeated several times. As a visual indication of how well this worked, it would take random cities like the one on the left and produce ones like the one on the right.

    We minimized the number of drivers squared times the maximum time of any driver, plus the sum of the times of each driver.
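The weighted objective above can be written in a few lines. This sketch reads it as n² times the longest driver time plus the sum of all driver times, with lower values treated as better; both the reading and the name `objective` are assumptions, and the driver times in the example are made up for illustration.

```python
def objective(driver_times):
    """n^2 * (longest driver time) + (sum of all driver times); lower is better."""
    n = len(driver_times)
    return n ** 2 * max(driver_times) + sum(driver_times)


# Hypothetical driver times (in hours) for two candidate solutions.
balanced = [4, 4, 4]    # total 12, longest 4
lopsided = [10, 0, 0]   # total 10, longest 10

print(objective(balanced))  # 9 * 4 + 12 = 48
print(objective(lopsided))  # 9 * 10 + 10 = 100
```

In this example the balanced solution scores better even though its total time is larger, because the n² weight makes the longest driver time the dominant term.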
  • 46. This weighting meant that the most important feature was the maximum time of any driver, with all ties broken by the sum of the times. Considering an overtime metric did not work as well, because it limited how many operations could be done to improve the solution. It would typically converge after a very small number of operations to a visibly non-optimal solution. (It looked similar to the initial random seed.) The sum of times did not work well either. This is an image of the solution to the previous city, where only the sum of times is considered. All requests are given to the same driver.

    We considered several different ways of improving this algorithm, but did not finish these other ideas. One was another local search, but in a representation of the city that made inventory constraints easier to check. Right now the inventories are shared across several paths based on their times, so they are harder to represent. The solver we used held them in a red-black tree, sorted by time. The other was to maintain a list of optimal sub-paths from depot to depot, without assigning them a time. This representation seemed very promising.

    Path relinking is also easy to implement for our local search, because an arbitrary metric can be used: even one that just compares the similarity between two solutions.
  • 47. 6 Conclusion

    Simulation is an important part of solving a modeling problem because it gives the best guarantee of accuracy. Our team provided a simulator that we believe simulates the most important aspects of the problem Sam's Hauling gave us. We did this early in the semester so that others could use it, and quickly fixed any shortcomings we found. The code is also accompanied by a mathematical model that tries to describe what the code does as accurately as possible.

    When we completed the simulator we moved on to other interesting questions. Although we were not able to find a difference in solution robustness with respect to dynamic requests, we were able to find a difference in solutions based on whether dumpster size constraints were relaxed.

    Through this process we also learned how to collaborate as a team, and with the other teams, to produce the best results. This project also taught us that real-world problems are very complicated, and usually do not reduce to a textbook problem description. However, we were able to use what we know of well-studied problems to build a simulator, model, and solver for the problem Sam's Hauling proposed.

    Given more time, we have plenty more ideas we could try. These include writing a better solver and a more thorough investigation of how initial inventory distributions affect the route times.
  • 48. A City Statistics

    This table provides the percentages of operations, dumpster sizes, and time windows by date.

                      10/06  10/07  10/08  10/09  10/10
    Operation
      D                28.3   43.1   37.3   34.7     41
      P                  66   48.3     49   42.9   52.4
      R                 5.7    8.6   13.7   22.4    6.6
    Dumpster size
      6                28.3   27.6   25.5   22.4   26.2
      9                35.8   32.8   37.2   18.4   29.5
      12               20.8   22.4   31.4   40.8   21.3
      16               15.1   17.2    5.9   18.4     23
    Time window
      AM               22.2     10    7.7   10.7   24.1
      PM                  0    3.3      0    7.1   10.4
      OPEN             77.8   86.7   92.3   82.2   65.5
  • 49.
                      10/14  10/15  10/16  10/17
    Operation
      D                35.5     50     37   37.7
      P                56.4     40     52   52.2
      R                 8.1     10     11   10.1
    Dumpster size
      6                33.9   32.8     13   24.6
      9                30.6   36.2   28.3   26.1
      12               19.4   13.8   52.2   39.1
      16               16.1   17.2    6.5   10.2
    Time window
      AM               11.1   11.4   18.2    6.1
      PM                  0      0      0      3
      OPEN             88.9   88.6   81.8   90.9