Back
Excel

Building a Basic, Understandable Sudoku Solver Using Excel Iterative Calculation – Part 2/2

Today’s author, Charlie Ellis, continues discussing the spreadsheet he built to solve Sudoku puzzles.

In my previous post, I walked through a number of formulas for setting up the valid values and solution board. In this post we’ll cover using iteration and other formula tricks to help solve the puzzle.

Using valid values to drive solutions

Looking at cells P22:R24 below, we can clearly see that the only solution here is 7:

image

Let’s reflect this in the solution board.

We’re going to modify the “else” argument of the IF function here to make it check to see if the val_bigcell_from_sol has only a single answer, and if so, return that. First we need to make val_bigcell_from_sol:

=INDEX(val_board, ROW(Main!A1)*3-2, COLUMN(Main!A1)*3-2):INDEX(val_board, ROW(Main!A1)*3, COLUMN(Main!A1)*3)

Making sure to enter this from the first cell in the solution board (cell D16).  Next we use this to make the else clause:

=IF(in_cell_from_sol,in_cell_from_sol,IF(COUNT(val_bigcell_from_sol)=1, SUM(val_bigcell_from_sol), “”))

Note the trick here to get a value from a 3×3 big cell; we can’t just get its value, but since we know there’s only once cell with a value, we can just use SUM to get the value of the one cell within the big cell that has a value.

And then we enter it and. whoops, we get a big scary warning:

image

Click Cancel, and then go to the tools options and make sure that “Enable iterative calculation” is checked:

image

Now, when you go back to the spreadsheet, it’s worked, and we’ve completely solved this (really easy) Sudoku:

image

To see what it’s doing, you can dial down the maximum iterations to 1 and add a reset to both the solution board and the valid values board. The way I like to do this is with a cell that tracks state, and either has a string that either says “reset” or “solve”. Adding this to a cell (I chose D21) and naming it state, you can modify the valid values board formula and the solution board formula to be:

=IF(state=”reset”,onetonine,IF(sol_cell_from_val<>”",IF(sol_cell_from_val=onetonine, onetonine,”"), IF(solution_in_rcb, “”,onetonine)))

And

=IF(in_cell_from_sol, in_cell_from_sol, IF(state=”Reset”, “”, IF(COUNT(val_bigcell_from_sol)=1, SUM(val_bigcell_from_sol), “”)))

Respectively.

Finding the only cell in a row that can take a given value

Let’s take a slightly tougher puzzle now, and see how we can extend this very rudimentary solver.

image

When you try to solve this puzzle with our existing solver, it gets some answers, but then it bogs down. To solve this puzzle, we need to make use of another rule of solving Sudoku puzzles which is that there must be at least one of each number in every row, column and big box. This rule, which is almost the first rule flipped around helps you get to a solution when there is only one possible place in a whole row, column or big box that a particular number can go.

Before we get to how to actually check for this, let’s talk about the right place to put this logic. Because it is helping us to find a solution as opposed to eliminate a possibility, it should go in the solution board. And because it covers a case that is the non-reset-state case where a solution can’t be found (and the empty string is returned), the right place is where that normally would be. Furthermore, we have to both find a solution and return that solution, so it makes sense to collapse both of those into one name instead of testing for a solution then doing the same work we did to find the solution in order to return it. That means we’re going to change our formula to something like:

=IF(in_cell_from_sol, in_cell_from_sol, IF(state=”Reset”, “”,IF(COUNT(val_bigcell_from_sol)=1, SUM(val_bigcell_from_sol), single_sol_in_rcb)))

Where single_sol_in_rcb is either a solution, if one exists, or the empty string.

To implement this, we’ll need val_bigcell_from_sol, but also val_bigrow_from_sol, val_bigcol_from_sol, and val_bigbox_from_sol. These are all fairly straightforward extensions on the concepts from sol_row_from_val and val_bigrow_from_sol, so I won’t go into detail here, but here’s the formula for val_bigbox_from_sol as entered/viewed from cell D16:

=INDEX(val_board, INT((ROW(Main!A1)- 1) / 3) * 9 + 1, INT((COLUMN(Main!A1)- 1 ) / 3)*9+1):INDEX(val_board, INT((ROW(Main!A1)-1) / 3) * 9 + 9, INT((COLUMN(Main!A1)-1) / 3) * 9 + 9 )

For now, we’re going to start out just checking the rows for cases where there’s only one cell that can contain a given value. Thus we’ll make single_sol_in_rcb just be:

=single_sol_in_row

And then eventually make it so that we can detect a single solution in any of rows, columns, or boxes. BTW, it’s fine to create names that make use of these as-yet-undefined names, but if you put these into the sheet, Excel annoyingly resizes your columns to accommodate the #NAME errors that inevitably result.

Here’s what this looks like on the valid value board:

image

Here the five in the big cell highlighted in red is the only five in the entire row, so the corresponding solution cell must be five. We can break this down into two pieces:

  1. Checking that a value exists within the valid values board big cell corresponding to the current solution cell
  2. Checking that there’s only one of that value in the valid value board big row corresponding to the current solution cell

The most efficient way I’ve gotten to work, is to do these checks and comparisons by creating an array of the values in the current big cell and multiplying that array by which of those values only occur once in a given row. Here’s the formula for this:

=MAX((COUNTIF(val_bigrow_from_sol,array1to9)=1)*(COUNTIF(val_bigcell_from_sol,array1to9)=1)*array1to9)

Where array1to9 is the array {1;2;3;4;5;6;7;8;9}, which is easiest to get using =ROW($A$1:$A$9). The first COUNTIF and equality operator check whether, for each number 1-9, there’s only one of the number in the row. In the above example, this becomes {0;1;0;0;1;1;0;0;0} because each of 2, 5, and 6 occur only once in the highlighted row. The second COUNTIF and equality operator check which values are present in the current cell. In the case of the highlighted cell, this is {0;0;0;1;1;0;0;0;0}. Multiplying them together gets the intersection of those two conditions ({0;0;0;0;1;0;0;0;0}), and multiplying this by array1to9 gives you {0;0;0;0;5;0;0;0;0}, which can be reduced to just the desired value by taking the MAX.

Then, in single_sol_in_rcb, in order to not have to call single_sol_in_row more than once to both return an answer and turn the 0 condition into an empty string, you can use the CHOOSE function like so:

=CHOOSE(single_sol_in_row+1, “”, 1, 2, 3, 4, 5, 6, 7, 8, 9)

This is enough to solve the second sample puzzle, but extending this to columns and boxes is actually really easy at this point. All you have to do is make a single_sol_in_col and single_sol_in_box, which are identical to single_sol_in_row except for the first named range. Then you have to adjust single_sol_in_rcb to include the solutions for column and box like so:

=CHOOSE(MAX(single_sol_in_row, single_sol_in_col, single_sol_in_box)+1, “”, 1, 2, 3, 4, 5, 6, 7, 8, 9)

Well, that’s it. The actual workbook itself is attached to my original post. I think it’s neat in that there are only really three different formulas on the actual sheet and another 18 in the names. If you think about it as lines of code, there aren’t a lot of places outside of obfuscated code contests where you get 21 lines of code to do as much complicated stuff. I think you’ll have to be the judge of how clear the formulas are, but by using the commenting on names and further cleaning up naming conventions, I think this could be very understandable. Certainly there’s a lot more I could walk through in a future post – primarily how to add in more complicated strategies that I’ve gotten to work in this framework such as pairs, triples, hidden pairs, hidden triples, and box-and-line, but also debugging strategies, pitfalls for performance or maintainability, etc. – but that’s the basic idea. I’d love to hear thoughts on this approach, other approaches, suggestions for new strategies or additional tricks to make existing stuff simpler or faster. Hope you found it interesting.