Back
Excel

Tables Part 3: Using Formulas with Tables

One of our goals with tables was to create a set of features that reduce the overall maintenance required to keep a spreadsheet functioning well over time.  This involves making spreadsheets less prone to error, as well as making them more understandable days, months, and years after the spreadsheet was created.  Rethinking the interaction between tables and formulas proved to be an important part of meeting that goal.

As many readers have presaged, Excel 12 provides some new ways to reference tables and parts of tables.  We refer to our work in this area as “structured referencing†(that is a working title, so it may be called something else when we ship the product).  In a nutshell, the structured referencing feature allows you to reference a table and/or subsets of the table directly by name as opposed to by cell coordinates.  The feature is similar in concept to named ranges with a few crucial differences.  First, the names that can be referenced are automatically generated when the table is created.  Specifically, this includes the name of the table itself (which by default is something like “Table1â€), and the names of all the columns.  Also, the names are automatically removed as columns are deleted or the entire table is deleted.  Finally, and perhaps most importantly, the names automatically adjust as the table grows and shrinks.  As a result, the majority of the headaches of maintaining named ranges go away with structured referencing.

So how is all this manifested in the product?  Structured referencing represents an addition to the syntax for formulas in Excel.  Here are the basics of how it works.

  • A reference to a table looks like this: =Table1, so if you wanted to sum the values in a table, you could use =SUM(Table1).  Note that =Table1 returns all of the data in Table1 without the headers – this is because many of the common functions that work on ranges, like VLOOKUP, assume no headers. 
  • A reference to a column looks like this: =Table1[Column1].  Again, this reference returns just the data.  So, for example, if you wanted to SUM a column, you could type =SUM(Sales[2004]).


(Click to enlarge)

There’s more to the syntax than that, but first I want to talk about something I personally love – integration between structured referencing and Formula AutoComplete.  The Formula AutoComplete feature I talked about a week ago is fully integrated with structured references – meaning Formula AutoComplete for tables names as well as columns within tables is possible.  For example, here is what it would look like to build the structured reference pictured above.  First, let’s start with just a table.


(Click to enlarge)

Next, I type “=SUM(Sâ€.  You will note this lists everything starting with “S†in the Formula AutoComplete drop-down, including formulas, named ranges, and my table, “Salesâ€.


(Click to enlarge)

After I select my table “Salesâ€, I can now see all the columns in the table.


(Click to enlarge)

And with a single click or keystroke, Excel 12 completes the reference.


(Click to enlarge)

All I need to do is close the parenthesis and press enter and I am done.  We believe this will improve accuracy and efficiency.

Structured references can be used inside tables as well, and here Excel has even more new behaviors.  Say I’m looking at some sales figures and I want to calculate the percentage change in sales over the last few years for each row in my table.  My table looks something like the screenshot below.


(Click to enlarge)

The formula I want is something like: =(FY05-FY02)/FY05.  The first thing I want to do is add a column, so I start typing “Percent Growth†after the last column in my table.


(Click to enlarge)

When I press enter to commit the change, Excel adds another column to my table – the appropriate formatting is applied to the rows, the border around the table adjusts, etc.  So far so good.


(Click to enlarge)

The next step is to write the formula.  I might do something like type “=(” and then use the keyboard to move selection over to the cell I want to calculate – in this case FY05.  Excel gives me the following reference: [FY05].


(Click to enlarge)

As you can see, when referencing a table from within the table itself it is not necessary to prefix the table name.  As I use other columns in my formula, they get similar references.


(Click to enlarge)

When I am finished with my formula and press ENTER, Excel automatically fills that formula down for all rows in the Percent Growth column. 


(Click to enlarge)

This is another new feature of tables called calculated columns.  Any time a formula is entered into an empty table column it will automatically fill.  We think this will help reduce errors introduced by manual filling or copy/paste.  Not only does it fill, but it continues to fill down as you add or delete rows in the table.  This is another one of those “sticky†properties I mentioned in my previous post.  As with other table features, the user does have control over its behavior – at the time you enter a formula in a cell in a calculated column, the auto-fill can be undone using on-object UI (which you can see in the previous screen shot).  Further, if you really do not want columns filling down ever, the feature itself can be turned off.

If the calculated column ever needs to be updated, it is only necessary to edit one copy of the formula and the change will propagate to all rows.  In addition, it is possible to change any single row (e.g. enter a static value or custom formula) so that it is not consistent with the calculated column.  Such cells will be flagged visually so that inconsistencies will be easy to spot.  For example, if I enter =RAND() in the middle of my column and choose not to fill down, I see a green triangle in the upper left-hand corner of the cell.


(Click to enlarge)

Furthermore, once a calculated column contains inconsistencies, subsequent edits to cells in the calculated column do not propagate because Excel does not want to overwrite custom values.  However, the same UI shown above will appear allowing the user to opt-in to the behavior.

The last feature I want to talk about is the table totals row.  (For those that use Excel 2003 Lists, you will recognize the totals row, but we have made some nifty improvements since 2003, so please read on.)  The totals row is another special area of the table, like the header row.  It lives at the bottom of the table and its purpose is to calculate totals for the columns in a table.


(Click to enlarge)

The total row can be enabled via the table tab in the ribbon, right-click on the table, or by using the existing AutoSum functionality.  The nice thing about having a special area for totals in the table is that it participates in the table’s activities.  It grows and shrinks properly with the table, it gets special “total row†formatting when you apply a style to the table (more on table styles in a later post), and the formatting behaves appropriately as more columns are added to the table.

The total row cells can contain pretty much any kind of formula (the formulas don’t have to reference the table at all but we don’t expect that to be the common case) as well as text labels.  The cells in the totals row also contain a dropdown that shows you some of the most commonly used functions, lowering the bar for the new Excel user to write formulas. 


(Click to enlarge)

The total row can be toggled on and off, and any functions that exist in the total row will be “remembered†until the next time you turn it back on.

So now let’s finish our discussion of syntax, since some of you are probably asking questions like “how do I reference the entire table, headers and all?† Referencing the entire table can be done this way: =Table1[#All].  Or if you just want to reference the headers: =Table1[#Headers].  To reference the entire column: =Table1[[#All], [Column1]].  To reference just the header value of a column: =Table1[[#Headers], [Column1]].  The special keywords can also be combined: =Table1[[#Headers],[#Data],[Column1]].  At this point I think you get the idea. 

Some other points about structured references: table names are globally unique to a workbook and do not require a sheet reference like Sheet1!Table1.  Table column names must be unique within the table itself.  Table names co-exist in the same namespace as named ranges, meaning you can’t have a defined name with the same name as a table, and vice-versa.  If the table name or any of the column names change then any formulas that reference those names will automatically update as well.  Structured references can be created using selection with the mouse (and yes, there is an option to turn this off).  In fact, structured selection (I talked about it in my last post) is one way to very quickly generate structured references when writing formulas. 

That’s it for now.  For my next post I will delve into the work we’ve done around the area of autofilter.  Multi-select anyone?