You can use your favorite social network to register or link an existing account:
Or use your email address to register without a social network:
Sign in with these social networks:
Or enter your username and password
Forgot your password?
Yes, please link my existing account with for quick, secure access.
No, I would like to create a new account with my profile information.
Tips
How-to
News
Videos
Stories
One of our goals with tables was to create a set of features that reduce the overall maintenance required to keep a spreadsheet functioning well over time. This involves making spreadsheets less prone to error, as well as making them more understandable days, months, and years after the spreadsheet was created. Rethinking the interaction between tables and formulas proved to be an important part of meeting that goal.
As many readers have presaged, Excel 12 provides some new ways to reference tables and parts of tables. We refer to our work in this area as “structured referencing” (that is a working title, so it may be called something else when we ship the product). In a nutshell, the structured referencing feature allows you to reference a table and/or subsets of the table directly by name as opposed to by cell coordinates. The feature is similar in concept to named ranges with a few crucial differences. First, the names that can be referenced are automatically generated when the table is created. Specifically, this includes the name of the table itself (which by default is something like “Table1”), and the names of all the columns. Also, the names are automatically removed as columns are deleted or the entire table is deleted. Finally, and perhaps most importantly, the names automatically adjust as the table grows and shrinks. As a result, the majority of the headaches of maintaining named ranges go away with structured referencing.
So how is all this manifested in the product? Structured referencing represents an addition to the syntax for formulas in Excel. Here are the basics of how it works.
(Click to enlarge)
There’s more to the syntax than that, but first I want to talk about something I personally love – integration between structured referencing and Formula AutoComplete. The Formula AutoComplete feature I talked about a week ago is fully integrated with structured references – meaning Formula AutoComplete for tables names as well as columns within tables is possible. For example, here is what it would look like to build the structured reference pictured above. First, let’s start with just a table.
Next, I type “=SUM(S”. You will note this lists everything starting with “S” in the Formula AutoComplete drop-down, including formulas, named ranges, and my table, “Sales”.
After I select my table “Sales”, I can now see all the columns in the table.
And with a single click or keystroke, Excel 12 completes the reference.
All I need to do is close the parenthesis and press enter and I am done. We believe this will improve accuracy and efficiency.
Structured references can be used inside tables as well, and here Excel has even more new behaviors. Say I’m looking at some sales figures and I want to calculate the percentage change in sales over the last few years for each row in my table. My table looks something like the screenshot below.
The formula I want is something like: =(FY05-FY02)/FY05. The first thing I want to do is add a column, so I start typing “Percent Growth” after the last column in my table.
When I press enter to commit the change, Excel adds another column to my table – the appropriate formatting is applied to the rows, the border around the table adjusts, etc. So far so good.
The next step is to write the formula. I might do something like type "=(" and then use the keyboard to move selection over to the cell I want to calculate – in this case FY05. Excel gives me the following reference: [FY05].
As you can see, when referencing a table from within the table itself it is not necessary to prefix the table name. As I use other columns in my formula, they get similar references.
When I am finished with my formula and press ENTER, Excel automatically fills that formula down for all rows in the Percent Growth column.
This is another new feature of tables called calculated columns. Any time a formula is entered into an empty table column it will automatically fill. We think this will help reduce errors introduced by manual filling or copy/paste. Not only does it fill, but it continues to fill down as you add or delete rows in the table. This is another one of those “sticky” properties I mentioned in my previous post. As with other table features, the user does have control over its behavior – at the time you enter a formula in a cell in a calculated column, the auto-fill can be undone using on-object UI (which you can see in the previous screen shot). Further, if you really do not want columns filling down ever, the feature itself can be turned off.
If the calculated column ever needs to be updated, it is only necessary to edit one copy of the formula and the change will propagate to all rows. In addition, it is possible to change any single row (e.g. enter a static value or custom formula) so that it is not consistent with the calculated column. Such cells will be flagged visually so that inconsistencies will be easy to spot. For example, if I enter =RAND() in the middle of my column and choose not to fill down, I see a green triangle in the upper left-hand corner of the cell.
Furthermore, once a calculated column contains inconsistencies, subsequent edits to cells in the calculated column do not propagate because Excel does not want to overwrite custom values. However, the same UI shown above will appear allowing the user to opt-in to the behavior.
The last feature I want to talk about is the table totals row. (For those that use Excel 2003 Lists, you will recognize the totals row, but we have made some nifty improvements since 2003, so please read on.) The totals row is another special area of the table, like the header row. It lives at the bottom of the table and its purpose is to calculate totals for the columns in a table.
The total row can be enabled via the table tab in the ribbon, right-click on the table, or by using the existing AutoSum functionality. The nice thing about having a special area for totals in the table is that it participates in the table’s activities. It grows and shrinks properly with the table, it gets special “total row” formatting when you apply a style to the table (more on table styles in a later post), and the formatting behaves appropriately as more columns are added to the table.
The total row cells can contain pretty much any kind of formula (the formulas don’t have to reference the table at all but we don’t expect that to be the common case) as well as text labels. The cells in the totals row also contain a dropdown that shows you some of the most commonly used functions, lowering the bar for the new Excel user to write formulas.
The total row can be toggled on and off, and any functions that exist in the total row will be “remembered” until the next time you turn it back on.
So now let’s finish our discussion of syntax, since some of you are probably asking questions like “how do I reference the entire table, headers and all?” Referencing the entire table can be done this way: =Table1[#All]. Or if you just want to reference the headers: =Table1[#Headers]. To reference the entire column: =Table1[[#All], [Column1]]. To reference just the header value of a column: =Table1[[#Headers], [Column1]]. The special keywords can also be combined: =Table1[[#Headers],[#Data],[Column1]]. At this point I think you get the idea.
Some other points about structured references: table names are globally unique to a workbook and do not require a sheet reference like Sheet1!Table1. Table column names must be unique within the table itself. Table names co-exist in the same namespace as named ranges, meaning you can’t have a defined name with the same name as a table, and vice-versa. If the table name or any of the column names change then any formulas that reference those names will automatically update as well. Structured references can be created using selection with the mouse (and yes, there is an option to turn this off). In fact, structured selection (I talked about it in my last post) is one way to very quickly generate structured references when writing formulas.
That’s it for now. For my next post I will delve into the work we’ve done around the area of autofilter. Multi-select anyone?
Comments: (32) Collapse
David,
I would likely upgrade for this feature alone.
A big "well-done" to the Excel team!
A few questions:
1. When a new row is appended to the Table and formulas/formatting is copied down, is data validation also copied down?
On that topic:
1a. Has the Data Validation dropdown had any work in Excel 12?
2. Is it possible to have the column reference dynamic?
By that I mean to sum Table1's 2004 data and A1 contains 4
=SUM(Table1[2000 + A1])
Is that possible?
3. Is it possible to detect (by formula) whether a Table's formula column is in a non-consistent state? eg. Someone put =RAND() in the middle cell! :)
I'd allow overwriting formulas but in some situations a non-consistent formula column is dangerous - usually when a user forgets to replace the original formula once their "exceptional problem" is dealt with.
I'd like the option of more highlighting (eg. a VERY BIG Red cell) than just the little green triangle.
This sounds great! This WILL reduce the user support required for inconsistent formulas, and newbie’s trying to use formulas with tabular data. In short it should reduce the time I spend helping Excel users by about two thirds, so I can spend more of my time on more productive tasks. It means that a newbie will be able to do (with dynamic ranges) what it currently takes an expert to do with defined names and offset formulas and so on.
Questions.
Where is the table name specified? In screenshot one the table name is Sales, but the name does not appear on the canvas, and the screenshot of the create table dialog from you blog 25 Oct. does not have a place to specify the table name. Oh wait, I’ve spotted it in the Table Tools/Properties Chunk in the new UI (5th screenshot in your 25 Oct. post on Tables Part 1). Could it also be included in the create table dialog?
Can the table name be referenced by a formula? For example in the first screenshot the name of the table is Sales but it is not exposed on the canvas. Could I put a formula in say, cell B3 something like (using .NET paradigm) =Sales.toString, or = Sales.name or Sales[name] or some such and have the name of the table appear. This would help with table names on the canvas actually matching the tables 'structured reference' name and staying that way if it is changed.
Is there a way in structured reference formulas to control whether the formula will return the complete sum of a column or only the sum of the visible rows when the table is filtered? (I have read all of your posts but not every comment from every post so this may have already come up). If so then a conditional format based on a formula that compared the cell value to the complete column sum could be used to highlight values that are/aren't the full set.
In rows other than the summary row will formulas auto fill the row like they do for columns? I’m thinking of subtotals here (but not particularly grouped by some value in a given column, so that the automatic subtotals feature can’t be used). For example a balance sheet, where columns A contains the line items for the sheet and you want subtotals for the Fixed assets, Current assets, Total Assets but you cannot subtotal by Column A ‘cause the Fixed assets are line items like ‘Land’, ‘Buildings’, ‘Plant’ etc. (and using a hidden column with grouping values is a kludge!)
Finally, not related to the current topic, is there a gallery table format that alternates row shading by 2, 3, 4 or 5 rows at a time. Or will this still require some (easy to be sure) custom conditional formatting? Alternate row shading is great on small table, but once the table goes over about 12 rows, blocking the alternate shading by groups of 3, 4 or 5 rows improves readability, especially when the report/table is printed. We use this type of shading very frequently at work (probably by 3’s the most, but I’m partial to 4’s for larger data sets!!!).
Thanks for the great post and screenshots, but even more thanks for the excellent work the Excel dev team is doing. I can’t wait for 12!
Mmmm… multiselect autofilter mmmmm!
Hazz
Business Analyst.
that looks like the biggest single advance in Excel in the last 10 years.
Cant wait to get my hands on it to try out all the wrinkles.
Have you made any extensions to row referencing?
Charles
Awesome! This is really great work.
I second Charles' question: what about row referecing?
Jean-Marc
Never sure where to post off topic ideas. Forgive me
With the dramatic increase in columns and rows, and now proper cell colouring rather than just index colours, I could imagine some quite beautiful pictures appearing if you can zoom out far enough (ie really small percentages).
So who will be the first person to write a macro to read jpegs in with each cell representing 1 pixel.
A few questions? Is Excel in charge of the List portion of SharePoint? If not, who can I direct my questions on SharePoint to?
Now a few points:
1) I am loving all the new advancements, but it still beings up my question about the differences between Excel and Access, though I have no particular need to see them kept separate.
2) Can we get some subtle changes to VLookup or a new function that will work with tables more efficiently? Like:
=VLOOKUP("My Value", Table1, Column1, 0)
Currently, where Column1 is you have to put an index, but that's hardly fun. And if you really want to impress, make it so "My Value" is a simple query.
3) And, lastly, if you do introduce datatypes don't introduce things like 64-bit int or float versus double. Don't make people think about stuff like that since we know Excel can handle that all by itself.
Again what would be cool is when I type in a date in a table column, it would make that cell a date picker and copy that datatype to all the lower cells, making them date pickers too. You would of course have to add a smart tag, or whatever you're calling them these days, for undoing the action.
Just had another thought after re-reading row referencing.
What would be better than explicit row referencing in my mind is row referencing based on criteria. That's what people are really looking for anyway. Something that would return the highest profitability. VLookup for rows essentially.
What would a running total or any other formula that looks at other rows look like in this system? I take it, it would still use cell references.
Right on David. Table concept is going to help everybody using Excel.
|When I am finished with my formula and press
|ENTER, Excel automatically fills that formula
|down for all rows
That is exactly what I wanted when I was speaking about "Header Formula".
Will it be possible to have more than one Table totals row ? This is practical if you want Total and Average and ... at the same time.
|More Functions... (for the Table totals row)
Will these include Median, Percentile,...
I am also interesting to have more details on Orion point 2), especially is "My Value" argument. Right now, if I have more than one column as key fields of my "table" (let saw [Product] and [Region]), I typically create a calculated column ([Link] = [Product]&[Region]). When I am referencing to get a value from that "table", I use [Link] to find the right row. I would not have to do all this if "My Value" could be define as ([Product]="ProductValue", [Region]="RegionValue").
Looking forward to read on autofilter,
Jean
This should be useful.
Finally a new bit of formula syntax. The example showing how to sum a column is helpful. How do you sum a row? Given the sample table in the screen images, how would one sum sales in all years for each product? Presumably
=[2001]+[2002]+[2003]+[2004]+[2005]
would work, but hard to see what would happen when the user adds 2006 as a new column in the table.
My point here is that SQL.REQUEST run against a dynamic named range as a database table would provide the same functionality as XL12 tables with respect to formulas, though perhaps not as simply for most users.
Now I'll give you the benefit of the doubt and anticipate that your next posting will show us that you've included intelligent multiple row and multiple column referencing syntax like
=SUM([2001]:[2005]) [inside the table]
=SUM(Sales[2001]:Sales[2005]) [outside the table]
to give the same result as the previous formula. If table references resolve to range references, this should be the case since : is an operator on ranges that returns the smallest single area range containing all its operand ranges (e.g., (A1:B8):(C6:F6) resolves to A1:F8). So, will table references resolve to range references?
Next, additions to the name space: #All, #Data, #Headers. High time. If only there were new #Missing or #Blank tokens to represent the same value as blank cells (which appears to be the same value as VBA's variant value Empty), but they'd require that Excel propogate such values. Anyway, if [#All] gives the same result as [#Headers],[#Data], what's the point of allowing [#Headers],[#Data] if these are the only new keywords since [#All] plus either or both of the other two would be pointless. Which begs the question whether there are other new #-tokens.
One operation outside tables that would be natural against tables is VLOOKUP. Given the sample table, would
=VLOOKUP("Product10",Sales,[2005],FALSE)
or
=VLOOKUP("Product10",Sales,Sales[2005],FALSE)
or even
=VLOOKUP("Product10",Sales,COLUMN(Sales[2005]),FALSE)
do the right thing? If not, then it's time to add a 4th argument to LOOKUP similar to the 4th argument to VLOOKUP so that
=LOOKUP("Product10",Sales[Product],Sales[2005],FALSE)
returns the expected result.
With regard to COLUMN(Sales[2005]), would that return the column number relative to the table or relative to the worksheet? Actually, if Sales[2005] returns a range reference, then it'd seem it must return the column index relative to the worksheet.
If table references return range references, will INDIRECT be able to generate dynamic table references? For example, if the defined name YR referred to 2004, would
INDIRECT("Sales["&YR&"]")
be interchangeable with
Sales[2004]
(net of function nesting and formula length limits)?
Jean Martineau...
...
| . . . Right now, if I have more than one
|column as key fields of my "table" (let saw
|[Product] and [Region]), I typically create a
|calculated column ([Link]=[Product]&[Region]).
|When I am referencing to get a value from that
|"table", I use [Link] to find the right row. I
|would not have to do all this if "My Value"
|could be define as ([Product]="ProductValue",
|[Region]="RegionValue").
Which would get us most of the way towards Lotus 123's @DSUM's criteria expressions. Note that if all you're doing is summing, counting or averaging fields in tables with multiple criteria, the new SUMIFS, COUNTIFS and AVERAGEIFs functions will handle this, so
=SUMIFS(Sales[2005],Sales[Product],Prod,Sales[Region],Reg)
should return total sales in 2005 for product Prod in region Reg. But that begs the question how to produce the same conditional sum for the range of years 2001 through 2005. The formula
=SUMIFS(Sales[2001],Sales[Product],Prod,Sales[Region],Reg)
+SUMIFS(Sales[2002],Sales[Product],Prod,Sales[Region],Reg)
+SUMIFS(Sales[2003],Sales[Product],Prod,Sales[Region],Reg)
+SUMIFS(Sales[2004],Sales[Product],Prod,Sales[Region],Reg)
+SUMIFS(Sales[2005],Sales[Product],Prod,Sales[Region],Reg)
is one way, which hopefully could be condensed to
=SUMPRODUCT(SUMIFS(INDIRECT("Sales["&{2001,2002,2003,2004,2005}&"]"),Sales[Product],Prod,Sales[Region],Reg))
but it'd be very nice indeed if there were new useful new syntax like
=SUMIFS(Sales[2001]:Sales[2005],Sales[Product],Prod,Sales[Region],Reg)
This leads to obvious questions about interaction between tables and pivot tables: can tables feed pivot tables, and can pivot tables be defined as tables (presumably under the restriction that there be only one header row)? Then data in pivot tables could be referenced like XL12 tables.
Will tables work in external references into closed files? That is, if C:\foo\bar.xls contains MyTable which in turn contains the column MyField, will
=SUM('C:\foo\bar.xls'!MyTable[MyField])
return the expected result?
Further, dynamic referencing into closed workbooks is, to put it as charitably as possible, a PITA in Excel. I came up with a udf that uses another Excel application instance that calls ExecuteExcel4Macro with a derived external reference to return values from closed workbooks, but I suspect y'all ain't gonna rewrite ExecuteExcel4Macro to accomodate XL12 table references (believe me, I'd **VERY** **MUCH** like to be wrong about this).
So if the formula above does work, I'd need to rewrite my udf to do the following in the other Excel application instance: open a new worksheet, enter the derived external reference formula into a range, return that range's resulting .Value property to the calling formula. I'd also need to run the isolated pathname through VBA's Dir function to check whether or not it exists. If the file exists but the table doesn't, would
rngref.FormulaArray("='d:\foo\bar.xls'!MyTable[Myfield]")
return a useful #REF! error, or would it trigger a runtime error because XL12 wouldn't accept the formula? That begs the question how XL12 works interactively when tokens resembling table references are used in formulas, but there's either no such table or no such field within the table matching the reference in the formula. #REF! error or syntactically invalid?
My point was for returning a single VALUE, but Harlan's point
|=SUMIFS(Sales[2001]:Sales[2005],...
is also interesting
Thank you for these new features.
Will I be able to import the TABLE data to a Pivot Table?
is there a #Totals keyword, or how do you reference the Totals row from outside the table?
Comments: (loading) Collapse