Fun With Tables

A few weeks ago when I asked readers what they wanted to see in the way of content over the next few months, a few people suggested some discussion about everyday use.  Accordingly, I am going to write today about some things I was doing with Tables in Excel 2007, and tomorrow I will try to write a bit about conditional formatting.

Specifically, what I want to walk through with regard to tables was using calculated columns and structured referencing.  (For those that are new to the blog or who need a refresher, a post from last year on these items is located here.)  In this case, I was working with two tables of data (imported from another data source) and I wanted to add some additional calculated columns and then add a lookup from one column to another – a pretty standard activity.  Let’s take a look at how this can work in Excel 2007.

Here is one of the tables (note, the data is completely fabricated for the purposes of this blog entry).  It contains a list of blogs I was looking up and their traffic via both RSS feeds and normal page views for the month of June.  The other table, on the second sheet, is exactly the same, except that the data is for the month of May.


(Click to enlarge)

What I want to do is the following:

  • Figure out the ratio of RSS:Page views
  • Add a percentile column
  • Figure out the rank for each month
  • Figure out how the rank changed between the two months

The first thing I did it was to turn the table into a Table (new for Excel 2007, newer readers can find a series of posts on Tables here) using CTRL+T.  I did the same thing for the table on the second tab, and named them “June_2006” and “May_2006”.


(Click to enlarge)

The next thing I did was added another column to the end of the table by typing “Ratio” beside the other table headers and pressing Enter.  Excel added a column to the table and took care of the formatting.


(Click to enlarge)

To add the calculation, I simply needed to enter the names of the columns (either by typing or by using the arrow keys or mouse) thus: =[RSS Views]/[Web Views]


(Click to enlarge)

When I press Enter, Excel fills down the calculation across all the rows in the Table.  So that’s it – a few keystrokes to add a calculated column to the table, formatting and all, and you can sede that Excel uses Table column names and not references in the forumula.


(Click to enlarge)

Now those of you that clicked on the above picture probably noticed that there was a #DIV/0! Error down on line 20.  To address that problem, I went back and updated the formula that I just entered to use the new IFERROR function that was added in Excel 2007.` 


(Click to enlarge)

Which simply says that if there is an error, replace the error value with the string “0 Web Views”.

Next up on the agenda is adding a percentile and a rank column.  Since these are both pretty similar, I will only go over one, adding a percentile column.  Again, the first thing to do is to add a new column in the same way I added one above.  The next thing to do is to start typing the name of the function I want to use … in this case, PERCENTILERANK.  Notice that as soon as I type “=”, a list of everything that I might want to enter at this point – functions, Tables, Defined Names – is brought up in a list (Formula Autocomplete – see this blog post for details).  As I type more, the list is narrowed down based on what I type.  If I type “=p” for example, I see this:


(Click to enlarge)

I think simply have to select the function I want (mouse, keyboard, further typing) and I can start entering arguments.  In this case, I want the PERCENTRANK of the Total Views column, so I simply enter that for both the array of values as well as the specific value in question (the x value).


(Click to enlarge)

What’s going on here?  Excel is clever enough to figure out the implicit intersection of the x value.  When I press enter, again, the column is filled down.  Adding a column to rank values is exactly the same, except I use the RANK function.  Here is a picture of the table after I have added the two new columns and done a bit of formatting.


(Click to enlarge)

The final thing that I want to do with the table is figure out how the rank has changed from the previous month (I added the same to calculated columns to the other table).  In this case, I am going to use a VLOOKUP function and a MATCH function to get this data, like this:


(Click to enlarge)

Note, however, that instead of using cell references, I am using references to the Tables, Table columns, and Table headers I am using.  Specifically, I am looking for the value in the “Blog” column, I am looking for that value in the “May_2006” table, and I am subtracting the “Rank” column at the end of the formula to figure out the difference between the two months.  In addition, I am using the MATCH function to find the position of the “Rank” header in the “May_2006” Table header (that’s what “May_2006[#Headers] returns – a reference to the headers in the specified table … you can also return references to #Totals, #Data, #All, or #ThisRow). 

The net effect of all this is that my formula is a lot easier to read/audit, and a lot less fragile, since I don’t have to worry about whether more data shows up on refresh, headers being moved around, etc.

After another IFERROR function and some conditional formatting, I am pretty much done.


(Click to enlarge)

Hopefully that was somewhat interesting.  Tomorrow, I will try and put together another conditional formatting example.

Office Blogs Comments

Comments: (10) Collapse

  • That's really cool, although your last function (for comparing the two tables) gets a little confusing. But that might be because I don't know how the vlookup function works.

    Still, this is an excellent example of I might end up using this stuff.

    I noticed you included a delta character; is that new to Excel 2007, or have I just not noticed it before? Does \delta autocorrect to it?

    Shiny. I can't wait.

  • Hi,

    i'm trying to do it, but i can't get thelast part work, may you send me the file?

    betancur[AT]gmail[DOTcomm

    Thanks.

  • How do we know what the name of a table is? Is May_2006 both a worksheet and a table name? What if a worksheet has multiple tables? How would the VLOOKUP formula know which to choose?

    Also, the implicit Xn in range X0-Xz, as in the PERCENTRANK and RANK functions above, leaves me wondering. Can I use them on data sets such as the following?

    x0 a b c d

    x1 e f g h

    x2 i j k l

    y0 a b c d

    y1 e f g h

    y2 i j k l

    z0 a b c d...

    (x, y, and z are separate cases; a-l are variables for each case)

    In the past, I have created functions that operate on, say, variable k (by only referencing every third row.) Is this possible with Excel's implicit logic (PERCENTILERANK and RANK above) or will I still need to specify cells/ranges by coordinates?

  • To Kdbertel

    You can insert the delta sign from Insert/Symbol

    To Jose Alejandro Betancur

    Try the function bellow. If it doesn't work I will send you the file.

    =IFERROR(VLOOKUP([Blog],May_2006,MATCH("Rank",May_2006[#Headers],0),FALSE)-[Rank],"No May Value")

    To Francis

    Just put your cursor in any cell of that table. For example on the cell with "Blog 14" and then press CTRL+T. In 2003 it was in Insert/Name/Define Name (not sure, it's by memory). You can also define a name of some selected cells by selecting them (the range of cells) and then in the upper left cell where usually you can find the current active coordinates (F9, B3 etc) and write there the desired name. And then when you will select that excat cells you'll see their name in that upper left cell. Hope it helps.

  • First off, autocompletion for structured refs has a bug when refering to specified tables. I created two tables named new and old, then started writing a column formula in new in which I wanted to refer to old[ID]. I started off typing old[. At that point Excel displayed a drop-down in which one item was [Rank]. I typed r, so my reference then appeared as old[r. Then I pressed [Tab], and Excel munged this into old[[Rank] -- note the *2* left square brackets. Will that be fixed in the beta refresh?

    Next, AND A MUCH BIGGER PROBLEM, with the new table in Sheet1 and the old table in Sheet2, with both tables spanning A1:B11 [#All], the following formula entered in Sheet1!E9,

    =MID(CELL("Address",(Sheet2!A1:B11,Sheet2!A1:B11)),FIND(",",CELL("Address",(Sheet2!A1:B11,Sheet2!A1:B11)))+1,64)

    correctly returns $A$1:$B$11. However, the following formula in Sheet1!E10,

    =MID(CELL("Address",(old[#All],old[#All])),FIND(",",CELL("Address",(old[#All],old[#All])))+1,64)

    returns a #REF! error. Further, the 3 formulas

    =COUNTA((Sheet2!A1:B11,Sheet2!A1:B11))

    =COUNTA(old[#All],old[#All])

    =COUNTA((old[#All],old[#All]))

    return 44, 44 and 1, respectively, when entered in Sheet1, but all three evaluate to 44 when entered in Sheet2. Are structured references resolved as range references only within the worksheet containing the specified table and to arrays in other worksheets? But the formula

    =ISREF(old)

    evaluates to TRUE in both Sheet1 and Sheet2.

    So what gives about using structured references to tables in other worksheets?

  • Good god, when can I get rid of the insipid A, B, C... and stop wasting the first row of my spreadsheet to labels? Lotus Improv had the ability to custom name column and rows, something that Excel has yet to attain.

    Instead you would rather provide this strange hack instead of fixing the real problem. Let's move Excel out of 1980 and into 2000's.

  • Thanks everyone for the comments.

    Jose, you can do this with any two tables – if you have the beta, you can just create them, name them what you want, and they experiment.

    Francis, I should have named the tabs something different, sorry.  Tables can be given names using the table contextual ribbon (so can PivotTables).  They have to be unique in a workbook.  I don’t fully understand your second question – if you would like to send me a workbook with the problem you are trying to solve, I can take a look.

    Vitalie, thanks for helping out.  One clarification – to name a table, as I mentioned in the response to Francis.  If you name the cells that make up the table using the name box, you are creating a defined name, which is not exactly the same thing – the table name can be used in structured referencing to access different parts of the table (data, headers, all), while the same is not true if you just create a defined name.  This is a subtle but interesting usability point I didn’t think about before, so thanks for bringing it up.

    Harlan, I am happy to report those are all bugs in the beta that have been fixed.  WRT the first bug, you will now see “old[Rank”.  WRT the other items, they all now return the appropriate result ($A$1:$B$11 and 44) … structured references should resolve to range references anywhere in the workbook (or cross workbooks).

    BRG, I am not quite clear what you would like to see, but I am happy to hear specific feedback.

  • Yes, I saw that after posting when I tried with another example and with a "named" table.

    He asked about table name and I quickly wrote how he can "name" a table or a range of cells.

    Thanks for your replay, I'm eager to read today's article :)

  • I know that this example is primarily designed to show off the Table features of Excel 2007 but I'd be careful about following this example when you have lots of different months for which you want to carry out the same analysis

    Where possible, I try to build one model with the necessary formulas and say use the INDEX function to bring the data for different months into the model - this leads to smaller, easier-to-maintain spreadsheets and keeps the data separate from the model

  • yep,

    Thanks, I finally get it to work, the problem was that the second table wasn't completely sorted.

Comments

Comments: (loading) Collapse