Today’s author, Diego Oppenheimer, a Program Manager on the Excel team, delves into some of the more advanced functionality of Excel to solve a challenging problem. The file used in this blog post can be found in the attachments at the bottom of this post.
Throughout this next blog post I will be using a tool that comes with Excel called Solver. For more information on solver please visit the Office Online site (http://office.microsoft.com/enus/excel/HP051983681033.aspx?pid=CH010004571033).
Background:
I am currently planning a vacation drive exploring the state of Washington. Ideally in the short amount of time I have off I would like to do as many hikes as possible. I will be starting from Seattle and finishing at Kettle Falls in eastern WA. Along the way I have identified 4 possible stopping places that I am interested in. From stop to stop I have researched how many hikes I can do along the way between each one of my stops. With a tool like Microsoft Live Maps I could easily plot the shortest distance between Seattle and Kettle Falls, but I am more interested in optimizing my trip to include as many hikes as possible.
For this purpose I have decided to use simple principles of linear programming and create a network model with Excel that would maximize my hiking time on my upcoming vacation. This problem in graph theory is defined as a shortest path problem. Shortest path algorithms are used in web mapping software such as Live Maps.
For a better graphical understanding of my journey I created the following map:
Without ever “moving backwards” I have routed all possible combinations from stop to stop from Seattle, WA to Kettle Falls, WA. I have also numbered my stops from left to right for easier referencing in my workbook.
I built the table below for a quick reference on driving time between the different stopping points. You will notice that I have put a number in brackets (X) representing the number of hikes on that particular part of the journey:
Ex: Seattle to Darrington – Time: 1.4 hrs with 2 hikes that I would be interesting in doing.

Seattle 
Longmire 
Darrington 
Yakima 
Wenatchee 
Kettle Falls 
Seattle 
X 
4 (5) 
1.4 (2) 
2.2 (3) 
2.5 (5) 
5.5 (6) 
Longmire 
4 (5) 
X 
4.2 (2) 
2.3 (2) 
4(2) 
6.3 (4) 
Darrington 
1.4(2) 
4.2 (2) 
X 
3.4 (3) 
3.3 (5) 
7 (3) 
Yakima 
2.2 (3) 
2.3 (2) 
3.4 (3) 
X 
2(1) 
4.4 (3) 
Wenatchee 
2.5 (5) 
4(2) 
3.3 (5) 
2(1) 
X 
4.2 (2) 
Kettle Falls 
5.5 (6) 
6.3 (4) 
7 (3) 
4.4 (3) 
4.2 (2) 
x 
The Setup:
Given that I love hiking I wanted to set up this optimizer so that it could be reused for any amount of stops I wanted. I’ve only included 6 in this version to maintain simplicity.
First let’s create a table that numbers and labels our stop so we create the following table starting at O3:
Stop 
Name 
1 
Seattle 
2 
Longmire 
3 
Darrington 
4 
Yakima 
5 
Wenatchee 
6 
Kettle Falls 
Great now let’s create our possible routes table. This table will represent the paths I drew out on my map and will also contain information on estimate time (in hours) and number of hikes.
We can see that now the table has all the possible routes I can take and also the estimated time between each path as well as the number of hikes on each path. We can now on B3 add our route selector which will be a binary switch (1 for True, 0 for False) that will represent if a path was chosen or not. For now let’s initialize all these values to 0.
Solver will change our boolean values in the Route Select column so an easy way to find out how many totals hours and how many totals hikes I can do in my optimal path I can use the function SUMPRODUCT.
SUMPRODUCT will return the sum of the products of the two ranges I give it in this case (Route Select x Number of Hikes).
So let’s label C17 as “Number of Hikes” and insert the following formula in C18:
=SUMPRODUCT(Route Selector X Number of Hikes)
Or in this example
Where Table1 is our possible route tables, Route Selector is the range from B3:B12 and Number of Hikes is the range from D4:D12.
Ok so to recap we have our possible routes table and we still have on O3 our stop numbers and stop names.
Next step is to create the constraint system that is necessary for solving this problem. In simple English we need to create constraints so that solver will only find solutions that follow these rules:
 I always leave Seattle,WA.
 I always arrive at Kettle Falls,WA.
 If I arrive at a certain node the next step has to be leaving that node. (With the exception of Seattle and Kettle Falls).
The easiest way to do this is to create once again a binary selector where +1 indicates arriving at a location and 1 indicates leaving a location. Our new network map would be something like this:
And last we will define our flow column by using SUMIFs. SUMIF will add the cells specific by a given set of conditions. In our case the condition is that all our stops “net flow” are 0 except our starting point and ending point.
=SUMIF(ToStopRange, Stop , Route Select Range) – SUMIF (FromStopRange, Stop, RouteSelect Range).
So on Q3 we will start a range titled “Flow”. We will define our flow column by using SUMIFs. SUMIF will add the cells specific by a given set of conditions. In our case the condition is that all our stops “net flow” are 0 except our starting point and ending point.
And we copy this formula down from Q3 to Q6.
Since the flow range is determined by the values in the route selector we still need to give solver a reference range to match the flow I indicated in my map diagram. So adjacent to Flow I will create my constraint column and manually input the values 1 for Seattle, 0 for all my middle nodes and +1 for Kettle Falls,WA.
Setting Up Solver.Finally
To enable solver: http://blogs.msdn.com/excel/archive/2006/09/06/743902.aspx
We have 3 tasks to do when setting up solver:
Set Target Cell: Here we will select the cell right next to number of hikes since we want to maximize this number we select “Max”.
By Changing Variable Cells:
The cells to change are our Route Selector Range.
Constraints:
Here we will specify the constraints that we discussed before:
 The Flow column must be equal to the constraints column. Select Flow Range = Constraint Range
 The Route Select Range has to be >= 0 (we only want positive number).
 Route Selector must contain binary values.
And we are finally done with setting up our solver. No we just click “Solve” and done. we now have an optimized hiking travel plan.
If you copied all the number I used you should be getting a maximum of 14 hikes and your ideal travel path is:
 Seattle to Longmire
 Longmire to Darrington
 Darrington to Wenatchee
 Wenatchee to Kettle Falls
Conclusion:
So now we have built a path optimizer that an easily be expanded (add more nodes) or changed around to optimize other variables like Estimated travel time. Even though this example is fairly simple (a map and some simple arithmetic would have provided faster results) the value of these methods comes into play when analyzing large amount of nodes. Similar principles to those discussed in this post are used for optimizing all sorts of manufacturing, transportation and distribution systems.