Todayâ€™s author: Chris Rae, a program manager on the Excel team. Chris is going to discuss iteration in Excel.
Iteration is a powerful feature in Excel, but not one that is familiar to a large number of users. Iâ€™m going to use iteration in Excel to create something which will be familiar to many people â€“ Conwayâ€™s Game of Life.
Much the same as iteration in any programming language, thereâ€™s a small leap to be made to understand the basic principle of interation in a spreadsheet. An iterative spreadsheet is one which no longer simply produces the answer given a set of inputs – it is an evolving process. Each time you calculate the spreadsheet, it will produce a new answer based upon the current data it contains. Iteration has been in spreadsheets for a long time â€“ it was in Lotus 1-2-3 and was a feature in Excel’s predecessor, Multiplan. To use iteration and retain all of your hair you need to have a much finer control over when Excel calculates, so letâ€™s go ahead and switch into manual calculation mode by clicking the Office Button, then “Excel Options” and in the “Formulas” section under “Workbook Calculation”, check the â€œManualâ€ flag. And while weâ€™re here, letâ€™s switch on iteration by checking â€œEnable ierative calculationâ€, and setting â€œMaximum Iterationsâ€ to 1.
Now, into cell A1 enter the formula =A1+1.
This, of course, is a circular reference, but thereâ€™s no warning message. Excel turns the message off because circular references are the bread and butter of an iterative spreadsheet – each time we calculate it we want to base the results on the previous values, so circular references are a necessity. The result of that formula should show up as 1, because A1 was empty before the formula was entered, and empty cells evaluate to zero. Press the F9 key to recalculate the spreadsheet again â€“ A1 will become 2, and so on. If you went back into the options and turned the number of iterations up, the workook would be iterating several times per recalc and the value of A1 would jump in larger steps.
The next problem is how to restart the counter â€“ this is most easily done with a â€œresetâ€ flag somewhere on the sheet. So into A2 enter TRUE, hit Ctrl-F3 to bring up the Name Manager and give that cell the name â€œresetâ€. Change A1 to read =IF(reset,0,A1+1). Recalculate the spreadsheet once, and A1 will become zero. Type FALSE into A2 and recalculate â€“ now the count has started again.
The Game of Life
Conwayâ€™s Game of Life is a simulation involving very simple â€œlife formsâ€ on an infinite grid. There is an excellent Wikipedia article about it with some history, a detailed explanation and some examples. Essentially, each cell in the grid is determined to be either be alive or dead based on a particular set of criteria related to its immediate neighbours â€“ these being the three cells above it, the ones to the left and right and the three below.
The criteria for life or death are:
- A live cell with zero or one neighbours will die from loneliness
- A live cell with more than three neighbours will die from overcrowding
- A live cell with two or three neighbours will remain alive
- A dead cell with three live neighbours will spring into life
Itâ€™s quite possible to simulate this game using a pen and some paper, but itâ€™s a little time-consuming.
Weâ€™re going to create an iterative spreadsheet with each cell representing one, erm, cell in Conwayâ€™s Game of Life. Inside each will be a formula which determines whether it is alive or dead at the end of the iteration, and each time we calculate the spreadsheet weâ€™ll perform one more iteration in the game.
To keep our â€œalive or deadâ€ formula simple, letâ€™s assume that the named range â€œnborsâ€ represents the number of neighbours a cell has. And letâ€™s assume that TRUE and FALSE are used to represent the living state of each cell. Based on the above table, we now have the logic:
|Cell State||Number of Neighbours||Resulting State|
|TRUE||1 or fewer||FALSE|
|TRUE||4 or more||FALSE|
|FALSE||anything except 3||FALSE|
Letâ€™s deal first with what to do in the case where the current cell is alive. Excel has the CHOOSE function, which will pick the nth item from an array â€“ so =CHOOSE(1,â€pigâ€,â€dogâ€) returns â€œpigâ€, and =CHOOSE(2,â€pigâ€,â€dogâ€) returns â€œdogâ€. If the cell is alive, we can use CHOOSE to pick its resulting state using the number of neighbours as an index. Each cell has a minimum of zero and a maximum eight possible neighbours, so our new cell value could look something like:
|the CHOOSE function starts its indexing at 1, but the lowest number of nbors is 0||Value if 0||Value if 1||Value if 2||Value if 3||Value if 4||Value if 5||Value if 6||Value if 7||Value if 8||Value if 9|
Thereâ€™s a bit of wastage in there because the CHOOSE function doesnâ€™t have any ability to understand â€œ4 or moreâ€. To get around this, we can use the MIN function to cap the number if it was 4 or more. So we can shorten the above formula to the following (remember weâ€™re using nbors+1):
|The number of neighbours, capped at 4 and increased by 1 for the CHOOSE function||Value if 0||Value if 1||Value if 2||Value if 3||Value if 4 or more|
So thatâ€™s the case dealt with where the cell is alive. If itâ€™s dead, we need to bring it alive if it has exactly three neighbours. All the formulas here will relate to the spreadsheet cell B2 â€“ the reason being that the very top-left cell becomes a bit more complex when trying to count its neighbours, which is something Iâ€™ll deal with later. We can use a circular reference to check the current state â€“ if the cell is alive weâ€™ll use the CHOOSE() formula above to set our state; if itâ€™s dead itâ€™ll become alive if nbors is exactly 3. We could enter:
In actual fact the formula â€œnbors=3â€ is a statement which itself evaluates to TRUE or FALSE. Whenever you want the results of an IF() to be TRUE or FALSE, you can just write the condition in instead of the whole IF(x,TRUE,FALSE) thing, so:
Itâ€™s mostly for this reason that TRUE and FALSE are better things to use for spreadsheets like this than, say, 1 and 0, or â€œxâ€ and â€œâ€.
So we now have a formula for one iteration of the game, but weâ€™ve no real way to tell the simulator which values to actually use to start off with. So let’s create a new worksheet called â€œtemplateâ€, and type some values to start the process. The sample object here is whatâ€™s called a â€œgliderâ€ â€“ itâ€™s a cyclical formation which will move forever.
Letâ€™s add a reset cell to our spreadsheet using Ctrl-F3 as before. We can now extend our cell formula in B2 on the main grid sheet to read the starter values from the template if the reset flag is set:
We now have the complete formula for one cell in our game board. Right now itâ€™s #NAME? though, because we donâ€™t have â€œnborsâ€ defined. As I mentioned earlier, a cellâ€™s neighbours are all the cells which border it. For B2 this would be A1, B1, C1 above, then A2 and C2 to the sides, and then A3, B3 and C3 below. If you use TRUE and FALSE as if they were numbers they will evaluate to 1 and 0, so adding TRUE/FALSE cells together using â€œ+â€ will return the number of TRUEs in the given cells. So a tempting way to define â€œnborsâ€ would be to select cell B2, then create new name which evaluated to:
Note the lack of $ signsâ€“ this name contains relative references, and will point to a different range when a different cell is selected. If you choose cell C3 and then bring up the name manager, youâ€™ll see:
Anyway â€“ at any particular point in the grid, this will return how many neighbours the cell has. Thereâ€™s a problem, though. The Game of Life is played on a board which flips state in one single move, and unfortunately what weâ€™ve made is a board which will change as we iterate through the cells establishing their new values. The way the game is supposed to work, no changes are made as you run through the cells â€“ you just work out what the new values will be and then set them all at once at the end of the pass. Fiddlesticks.
We need to find some way of saying to Excel that the number of neighbours the cell has should come from the board as it stood at the beginning of the iteration, not from the board as it stands now with us in the middle of modifying it. To do this we can take advantage of one of the powerful foibles of iterative calculation mode â€“ in iterative mode, Excel will calculate worksheets one by one in alphabetical order.
So – to get around this problem, we can create a new sheet containing the number of neighbours for each cell and we know that this sheet wonâ€™t be calculated until the whole of the main grid has finished. Letâ€™s go ahead and rename our main sheet to â€œ1.runâ€ and create a new â€œ2.nborsâ€ worksheet. Iâ€™ve used the numeric prefixes to remind myself that these sheets need to be calculated in that order. In actual fact in this instance it doesnâ€™t matter in which order theyâ€™re calculated, but for other iterative sheets it might well, and I find it makes debugging easier.
Into B2 on the new â€œ2.nborsâ€ sheet we can now enter a formula to count the number of neighbours that â€˜1.runâ€™!B2 has. B2 is going to be the top left of our board and as such some of the surrounding cells are going to be empty â€“ letâ€™s ignore this for the moment and fix it later. The formula I was intending using above now becomes:
This isnâ€™t the prettiest formula. Because iterative calc allows circular references, we can simplify it by just including B2 itself, and taking account of that in our original single-cell formula. So we could change that formula to:
Well, no. Unfortunately SUM doesnâ€™t treat boolean TRUE/FALSE values quite the same as â€œ+â€ does â€“ SUM only sums proper numeric values, and so we get zero from summing TRUEs. A handy trick here is to use Excelâ€™s SIGN() function. This returns 1 if a number is positive and 0 if itâ€™s zero â€“ it is quite happy taking booleans, so =SIGN(TRUE) is 1 and =SIGN(FALSE) is 0. We can array-enter this to make it act upon a range of booleans, and then SUM the results. If youâ€™re new to array formulas I donâ€™t have room to go into too much detail here â€“ esentially they are formulas which can act upon an array of cells sequentially, and also return more than one result. If you search for â€œarray formulasâ€ in Excelâ€™s help there are some good examples.
Still working on cell â€˜2.nborsâ€™!B2 we can array-enter (Ctrl-Shift-Enter) the following formula to sum all of the TRUE values in A1:C3 on the 1.run sheet:
A handy tip (especially handy for array formulas) is to use F9 in the formula bar to evaluate parts of your formula â€“ if you highlight just the SIGN() portion of that formula:
And hit F9:
It shows the results of the SIGN() function. If you then highlight the SUM segment:
And F9 again:
You can see it evaluated the SUM portion too. Hit escape now to cancel editing the formula.
Now that â€œnborsâ€ isnâ€™t exactly the number of neighbours any more (remember weâ€™re including the cell itself now) weâ€™ll have to go back to our original formula and fix it up. Thereâ€™s no change needed if the cell is dead, but now if the cell is alive the number of neighbours is going to be greater by 1. We can change our original cell formula (on â€˜1.runâ€™!B2) to stop adding 1 to nbors in the â€œif aliveâ€ case:
We should now be able to populate some more cells and run the sheet. Extend the formulas on 1.run and 2.nbors right and downwards to make them into a bigger board, change the reset flag to TRUE and hit F9 to reset the sheet to the starting values. Let’s make the live cells a bit clearer to spot – bring up the conditional formatting dialog (Home..Styles..Conditional Formatting..New Rule..Format only cells that contain) and format B2 to have a dark background if its value is TRUE. You can now copy/paste this format onto the rest of the board.
Now change the reset flag to FALSE and hit F9 a few times â€“ successive recalculates will show the iteration sequence in the game.
Thereâ€™s only one problem left â€“ the Game of Life is supposed to be played on an infinite board, but what weâ€™ve created is one with edges â€“ any shape which slips off the edge of the board will vanish forever, whereas what we really ought to do is have it come on the other side. We can effect this by modifying the 2.nbors sheet â€“ the way I found to do this isnâ€™t overly pretty so Iâ€™d appreciate other suggestions. On the 2.nbors sheet, B2 is at the top left so the cells above and to the left of it arenâ€™t actually on the board at all. We really want the cells above it to come from the bottom row and the cells to the left of it to come from the rightmost column. We can do this by simply switching back to using â€œ+â€ for the corner cells â€“ in my somewhat arbitrary 40×35 board, B2 becomes:
I know, I know. Itâ€™s not pretty. Perhaps something nicer could be made with INDEX() and ROW()/COLUMN(). We need to do a similar wraparound at the other corners and a less complicated one just using two SUM(SIGN())s for the top, left, right and bottom rows. Now when 1.run!B2 needs to know how many neighbours it has, weâ€™ll accurately be taking into account the ones from the opposite edge of our â€œinfiniteâ€ board.
Now that there are more rows and columns in Excel 2007, we can run a bigger simulation â€“ the one below is 500×500. The row and column headers look a little odd as Iâ€™ve had to make the rows heights and column widths rather small.
And there we have it. There are a few websites with interesting Game of Life models on them â€“ The Internet Encyclopedia of Science has some good simple ones, and there are more on the Wikipedia page. Here is the workbook I created during this post â€“ itâ€™s in XLS format and will work fine in Excel versions 97 and above. If you extend this at all, I’d be interested to see the results.