You can use your favorite social network to register or link an existing account:
Or use your email address to register without a social network:
Sign in with these social networks:
Or enter your username and password
Forgot your password?
Yes, please link my existing account with for quick, secure access.
No, I would like to create a new account with my profile information.
One of our goals with tables was to create a set of features that reduce the overall maintenance required to keep a spreadsheet functioning well over time. This involves making spreadsheets less prone to error, as well as making them more understandable days, months, and years after the spreadsheet was created. Rethinking the interaction between tables and formulas proved to be an important part of meeting that goal.
As many readers have presaged, Excel 12 provides some new ways to reference tables and parts of tables. We refer to our work in this area as “structured referencing” (that is a working title, so it may be called something else when we ship the product). In a nutshell, the structured referencing feature allows you to reference a table and/or subsets of the table directly by name as opposed to by cell coordinates. The feature is similar in concept to named ranges with a few crucial differences. First, the names that can be referenced are automatically generated when the table is created. Specifically, this includes the name of the table itself (which by default is something like “Table1”), and the names of all the columns. Also, the names are automatically removed as columns are deleted or the entire table is deleted. Finally, and perhaps most importantly, the names automatically adjust as the table grows and shrinks. As a result, the majority of the headaches of maintaining named ranges go away with structured referencing.
So how is all this manifested in the product? Structured referencing represents an addition to the syntax for formulas in Excel. Here are the basics of how it works.
(Click to enlarge)
There’s more to the syntax than that, but first I want to talk about something I personally love – integration between structured referencing and Formula AutoComplete. The Formula AutoComplete feature I talked about a week ago is fully integrated with structured references – meaning Formula AutoComplete for tables names as well as columns within tables is possible. For example, here is what it would look like to build the structured reference pictured above. First, let’s start with just a table.
Next, I type “=SUM(S”. You will note this lists everything starting with “S” in the Formula AutoComplete drop-down, including formulas, named ranges, and my table, “Sales”.
After I select my table “Sales”, I can now see all the columns in the table.
And with a single click or keystroke, Excel 12 completes the reference.
All I need to do is close the parenthesis and press enter and I am done. We believe this will improve accuracy and efficiency.
Structured references can be used inside tables as well, and here Excel has even more new behaviors. Say I’m looking at some sales figures and I want to calculate the percentage change in sales over the last few years for each row in my table. My table looks something like the screenshot below.
The formula I want is something like: =(FY05-FY02)/FY05. The first thing I want to do is add a column, so I start typing “Percent Growth” after the last column in my table.
When I press enter to commit the change, Excel adds another column to my table – the appropriate formatting is applied to the rows, the border around the table adjusts, etc. So far so good.
The next step is to write the formula. I might do something like type "=(" and then use the keyboard to move selection over to the cell I want to calculate – in this case FY05. Excel gives me the following reference: [FY05].
As you can see, when referencing a table from within the table itself it is not necessary to prefix the table name. As I use other columns in my formula, they get similar references.
When I am finished with my formula and press ENTER, Excel automatically fills that formula down for all rows in the Percent Growth column.
This is another new feature of tables called calculated columns. Any time a formula is entered into an empty table column it will automatically fill. We think this will help reduce errors introduced by manual filling or copy/paste. Not only does it fill, but it continues to fill down as you add or delete rows in the table. This is another one of those “sticky” properties I mentioned in my previous post. As with other table features, the user does have control over its behavior – at the time you enter a formula in a cell in a calculated column, the auto-fill can be undone using on-object UI (which you can see in the previous screen shot). Further, if you really do not want columns filling down ever, the feature itself can be turned off.
If the calculated column ever needs to be updated, it is only necessary to edit one copy of the formula and the change will propagate to all rows. In addition, it is possible to change any single row (e.g. enter a static value or custom formula) so that it is not consistent with the calculated column. Such cells will be flagged visually so that inconsistencies will be easy to spot. For example, if I enter =RAND() in the middle of my column and choose not to fill down, I see a green triangle in the upper left-hand corner of the cell.
Furthermore, once a calculated column contains inconsistencies, subsequent edits to cells in the calculated column do not propagate because Excel does not want to overwrite custom values. However, the same UI shown above will appear allowing the user to opt-in to the behavior.
The last feature I want to talk about is the table totals row. (For those that use Excel 2003 Lists, you will recognize the totals row, but we have made some nifty improvements since 2003, so please read on.) The totals row is another special area of the table, like the header row. It lives at the bottom of the table and its purpose is to calculate totals for the columns in a table.
The total row can be enabled via the table tab in the ribbon, right-click on the table, or by using the existing AutoSum functionality. The nice thing about having a special area for totals in the table is that it participates in the table’s activities. It grows and shrinks properly with the table, it gets special “total row” formatting when you apply a style to the table (more on table styles in a later post), and the formatting behaves appropriately as more columns are added to the table.
The total row cells can contain pretty much any kind of formula (the formulas don’t have to reference the table at all but we don’t expect that to be the common case) as well as text labels. The cells in the totals row also contain a dropdown that shows you some of the most commonly used functions, lowering the bar for the new Excel user to write formulas.
The total row can be toggled on and off, and any functions that exist in the total row will be “remembered” until the next time you turn it back on.
So now let’s finish our discussion of syntax, since some of you are probably asking questions like “how do I reference the entire table, headers and all?” Referencing the entire table can be done this way: =Table1[#All]. Or if you just want to reference the headers: =Table1[#Headers]. To reference the entire column: =Table1[[#All], [Column1]]. To reference just the header value of a column: =Table1[[#Headers], [Column1]]. The special keywords can also be combined: =Table1[[#Headers],[#Data],[Column1]]. At this point I think you get the idea.
Some other points about structured references: table names are globally unique to a workbook and do not require a sheet reference like Sheet1!Table1. Table column names must be unique within the table itself. Table names co-exist in the same namespace as named ranges, meaning you can’t have a defined name with the same name as a table, and vice-versa. If the table name or any of the column names change then any formulas that reference those names will automatically update as well. Structured references can be created using selection with the mouse (and yes, there is an option to turn this off). In fact, structured selection (I talked about it in my last post) is one way to very quickly generate structured references when writing formulas.
That’s it for now. For my next post I will delve into the work we’ve done around the area of autofilter. Multi-select anyone?
RE: Excel Art - Whimsical thought
So who will be the first person to write a macro to read jpegs in with each cell representing 1 pixel.
It's already been done with only 256 columns.
With the new Excel though we can do some serious image analysis - a competitor for PhotoShop?
I've got the link somewhere for jpegs: this one works for BMPs.
I'd take Multi-select anytime! Thanks for the wonderful work!
Quick Q on stucture naming. How does Excel determin the height of the table if there are missing values (blank cells) scattered around the column? This is the biggest problem in the past when using COUNTA($A:$A)-1 methods. Same question applies to auto fill formula...
Will this all be backwards compatible? What if I create a spreadsheet full of table references and names rather than the current sytem of the alphanumeric cell references, and then share it with a client who's not yet running Excel 12? Can they get the formulas to work? Can they make changes? When they send the file back to me, will it work seamlessly?
Salutations … thanks for the comments; let me try and answer some of the questions posed.
Rob – Yes, data validation is copied down (any ‘property’ you can set on a cell is). No, data validation hasn’t changed. What specifically are you looking for? Dynamic referencing like Table[2000+A1] or even Table[A1] is not supported. There is no new facility to tell by formula to tell if a column is in a non-consistent state, but that is an interesting idea.
Hazz – Table name is on the dialog. Thanks for the feedback on the create table dialog. Currently, you can’t reference a table by name, but that is good feedback. Yes, you can use different arguments in SUBTOTAL to determine whether to include visible rows or not. I am not sure I understand your question about formulas filling in rows like they do for columns, but I think you will be able to do what you suggest. The auto-fill behaivour, though, is only for rows. Yes, you can have alternate by shading more than one row at a time.
Charles, Jean Marc, Orion – The only row referencing available is a [#This Row] tag, which allows you to scope a function to a specific row in a table, mainly for the purposes of implicit intersection. Will try and get in an example this week.
Orion – There is a SharePoint blog. http://blogs.msdn.com/pjhough/. VLOOKUP will accept structured references in the first two arguments but not the “index” in Excel 12 – we would need a new function to enable what you typed above, since the “index” argument expects an integer, not a reference. Thanks for the feedback on datatypes.
Jean – Just one Total Row (and Header Row) at the current time, though have heard the question before. You can type any function into the total row, so yes MEDIAN and PERCENTILE will work just fine.
Harlan – Yes, structured references do evaluate to range references, so : will work. Table[:] also works. No [#Missing] or [#Blank], this time out, but great ideas. [#All] is different from [#Headers],[#Data] because [#All] includes the totals row. Totals rows are indicated by [#Totals]. No structured reference can have [#Headers] and [#Totals] in it, so [#All] is the only way to get the entire table. See VLOOKUP comment above. INDIRECT will work with structured references. PivotTables can indeed be fed by tables – you may have noticed the UI in the Table Ribbon that said “Summarize With Pivot”. No, PivotTables cannot be defined as tables – as you could imagine, you would need a much richer referencing scheme to handle multiple dimensions and hierarchy. Structured references into closed files will calc to an error value if you attempt to refresh – much like GETPIVOTDATA does in Excel 2003.
Murray - PivotTables can indeed be fed by tables – you may have noticed the UI in the Table Ribbon that said “Summarize With Pivot”.
Charles – There is a [#Totals] keyword, correct.
Tianwei – The difference between a table and Excel’s existing logic to try and determine table height is that the user declares the region that makes up the table, and from then forward rows and columns added by the user (whether manually by the user or automatically through things like query table refresh) are always included in the table unless the user decides not to. Accordingly, there is no determination to be made, since the user has identified the range.
David confirmed in a response under a different topic that INDEX(Table,ROWS(Table),COLUMNS(Table)) would resolve to the bottom-right cell in the table named Table. That so, ROWS(Table[#All]) should presumably return the height of the table.
Another question. If table size can change automatically upon, e.g., query table refresh, would pivot tables fed from automatically refreshed tables also refresh automatically or will user inteaction still be needed?
Anonymous - Backwards compat of features is a topic that will require a few posts to talk through properly.
Harlan - the PivotTable does not auto-refresh, so user interaction will still be needed. Curious - would you like that sort of "auto" refresh behaviour?
| . . . the PivotTable does not auto-refresh, so
|user interaction will still be needed. Curious -
|would you like that sort of "auto" refresh
Automatic recalculation is the key distinguishing feature of spreadsheets vs functional programming languages and databases. Anything in spreadsheets that fails to provide automatic recalculation is generally BAD. That includes pivot tables.
I understand why it'd be a bad idea refreshing pivot tables from outside sources, like databases, but on modern hardware refreshing pivot tables based on worksheet ranges shouldn't much worse than recalculating complex formulas that return the same results. So, yes, automaticly refreshing pivot tables based on worksheet ranges would be a GOOD thing.
I agree with Harlan on the automatic refresh, with a sidenote:
Why not include an option in the Pivot table options: "Refresh Pivot on recalc".
"Rob – Yes, data validation is copied down (any ‘property’ you can set on a cell is). No, data validation hasn’t changed. What specifically are you looking for? "
Not trying to answer for Rob. I do have a few requests on validation:
1. Make "List" reference not just on current sheet (I'm still using v2000, not sure about subsequent versions)
2. Add a "Like" option, then a box to specify the format "YR*" or "###-##-####", or a rule dialog box to specify user input must contain "@", "http://" etc.
3. Provide another tab on "auto-scrubbing" feature, such as auto converting to upper case, formatting to a specified format (SSN, phone number etc based on numbers only input)
Since Excel now hosts 1mm rows, I can see Excel more and more used as a pseudo database (DBAs will freak out but I can tell you that will be the realty), more robust data validation is very important.
|1. Make "List" reference not just on current
|sheet (I'm still using v2000, not sure about
Define a name for the list, and use the defined name as the source for the drop-down. This works in XL10 (2002) and (I believe) previous versions that provide Data Validation.
This seems to prove that there's no semantic reason why data validation drop down lists can't use ranges anywhere in any open workbook. (Just like there appears to be no semantic reason for the 7 nested function call limit in current versions.) It looks like another shortsighted syntactic restriction serving no clear purpose.
|2. Add a "Like" option, then a box to specify
|the format "YR*" or "###-##-####", or a rule
|dialog box to specify user input must contain
|"@", "http://" etc.
And if Excel would ever provide a LIKE worksheet function (or regular expression support in SEARCH, as OpenOffice optionally provides in its SEARCH function in Calc), then it'd be possible to write data validation formulas using such new functions that would be able to check almost any text pattern that could be defined in prose.
|3. Provide another tab on "auto-scrubbing"
|feature, such as auto converting to upper
|case, formatting to a specified format (SSN,
|phone number etc based on numbers only input)
Adding ever more complexity in the dialogs. Formula (that nasty word!) filters could be added to convert entries (other than formulas) into some specified format. E.g., converting entries to upper case,
converting to standard US SSN format,
The former should be simple enough for most users to figure out.
Data entry masks that filtered entries and put them into the given format would be OK too, but not as powerful.
- List reference names does not work for XL2000. Good to know it works for XL2002. So I guess it will carry through 12.
- LIKE function will be highly useful. My examples are only a small faction of the wish list I have accumulated
- The auto-scrubbing function I suggested is not to limit users but rather seemlessly convert user inputs without giving annoying error messages all the time. Think of an application using a designed sheet as input source for high-school level temps. They need to input things quickly, not messing with any formula. Also again, my examples is only a small fraction of the wish list. How about auto-triming spaces? auto-add fixed strings to input (and into any position of the input)? auto-deleting any characters? auto-set hi-lo limits for numerics? auto-calculation based on certain formula (a/s/m/d, sqrt, or whatever)? In total, I hope data validation should be more of an "enabler" than just a "disabler".
For Data Validation that uses criteria from a List, I'd like to see an autocomplete feature.
The feature doesn't even need to be a part of Data Validation - it just might be a convenient place to put it.
The standard in-cell autocomplete (Options: Enable AutoComplete for cell values) doesn't always activate so that's why I'd like an autocomplete on Data Validations.
On Data Validation:
I'd like the width and height of the dropdown box to be configurable.
Since we're opening up data validation here, the one modification I'd like to see is logic included to prevent data validation from being completely defeated by pasting data into cells with data validation rules.
Now back to your scheduled program....
|Since we're opening up data validation here,
|the one modification I'd like to see is logic
|included to prevent data validation from
|being completely defeated by pasting data
|into cells with data validation rules.
Let's take that a bit further. Data Validation would be a LOT more useful if it were part of the recalculation infrastructure. Users can also enter formulas which initially return valid values but later return invalid values. It'd be very useful if Excel would return a new error value, #INVALID!, in such situations. Even better if #INVALID! errors couldn't be trapped by IFERROR.
[Which gets back to IFERROR and selectively trapping error values. IMO, there are three classes of error: the expected errors #N/A, #VALUE!, #NUM!, #DIV/0!, the ambiguous error #REF! and the unexpected errors #NULL! and #NAME?. The first set are often returned by built-in functions in fairly unexceptional circumstances (lookup value not found, substring not found in longer string, square root of a negative number, average of a range containing no numbers). #REF! could be returned by INDEX, INDIRECT or OFFSET passed invalid parameters (begging the question whether #VALUE! would have been more appropriate), but could also indicate model corruption (destroyed references). The last set almost always indicate model corruption or incompleteness. I can't see any good reason ever to trap #NULL! or #NAME? errors.]
Back to the issue of pasting, it does seem odd that Data Validation isn't triggered by all the actions that would trigger the Change event handler.
I suspect that back in XL8 (97) someone foresaw that pasting in large amounts of data into multiple cells at once would require a design decision about whether to handle Data validation rules separately in all affected cells (which could take a LONG TIME if lots of cells were affected), prompt to paste without applying rules or cancel pasting (something like issuing a warning and prompt like 'The destination range contains cells with Data Validation rules. Those rules can't be processed all at the same time. Proceed without applying those rules or abort the operation?' but that was before Microsoft came to believe that the user should be in control), or just ignore the rules in silence. So guess which one Microsoft chose?
Myself, being a cynic, I note that if cell A2 has custom validation rule =$A$2