Performance Improvements in Excel 2010

Thanks to Chad Rothschiller for putting together these posts on performance.

In this release of Excel we spent dedicated research and development resources on improving performance. In addition to the usual optimizations and tuning efforts we make post “code complete”, we spent time during the development milestones researching, designing, and implementing performance features beyond algorithmic tuning. For this release, these are the areas where we focused our performance improvement efforts:

  • Customer Feedback
  • Large Data Set scenarios
  • Calculation
  • Multi-Core Opportunities

As you’ve heard us say before, the Excel team takes performance very seriously. We know that our customers build intensive solutions that help businesses make key decisions, and performance of Excel and those solutions is paramount. In this 2-part series I’m going to talk about the challenges and opportunities we embraced when planning our work for Excel 2010 performance.

In this post I’m going to talk about the first one: what we’ve been doing to address customer feedback around the performance of Excel 2007.

Improving Performance Based on Customer Feedback

Office 2007 was a big release for us, introducing some major architectural investments that resulted in significant new capabilities. We scaled up the size of the grid (now 1,048,576 rows by 16,384 columns), increased a bunch of limits across the product, introduced a shiny new graphics platform and invested in upgrading our charting abilities, implemented a new user interface, greatly expanded the concept and usage of styles, and shipped Excel Services, which shares source code with Excel client in order to ensure fidelity. We knew that performance of the new architecture was very important, so we spent a lot of time working on performance in Excel 2007. When we released Excel 2007 we had addressed the key performance issues we found during development, but we also knew that as customers started using Excel 2007, we would get additional feedback around performance. And we did.

A major initiative for us this release was to seek out that feedback, get engaged with the issues, and work to tackle them. Our goal was to seriously and significantly improve performance, but we knew that was going to be hard, given the big architectural changes that came with the Excel 2007 release. However, we dug in hard, got to work, and made good progress against our goals.

We gathered up as many customer examples as we could get our hands on (you might recall a blog post I made in January of this year to that effect) to identify the problems that people were seeing. I also searched the web for people making comments about Excel 2007 performance and sought out their examples and feedback. Thanks again to all of you who provided and continue to provide us with your workbooks. All of your files have been collected, run, analyzed, re-run (several times over) helping us to prioritize and fix the right set of problems.

In the “customer feedback” category, here are the key areas where we made performance investments for Excel 2010:

Printer & Page Layout View

Page Layout View relies on printer-specific information and calculates a lot more layout information in order to render the view as the printer would. Unfortunately, Excel 2007 spent lots of time talking to the printer, re-talking to the printer, and calculating these metrics. As a result, basic tasks like entering data in the sheet could potentially be noticeably slower.

If the default printer happened to be offline, Excel 2007 ended up waiting for a very long printer timeout before becoming responsive again.

To address these issues, Excel 2010 caches the printer settings and we’ve further optimized the render calculations. We’ve also made connecting to the printer cancelable so that you as the user remain in control.

Shapes

We saw a good number of performance issues with Excel 2007 related to having lots of shapes on a worksheet (often they’re hidden shapes, the result of copy/pasting from a web page). The Office graphics team has taken a good look at those scenarios and done a lot of work to speed up performance in this area.

Additionally we took a look at Shape object model “*.select” performance and made good improvements there. The chart below shows how long it takes Excel 2007 vs. a beta version of Excel 2010 to run the following code, updating 300 text boxes with the text “chad” (call me a narcissist):

For i = 1 To 300
   ActiveSheet.Shapes(i).Select
   Selection.Characters.Text = "chad"
Next i

image

As you can see, we’re seeing huge gains in this area, compared to Excel 2007.

Charting

Ben Rampson outlines the significant charting performance improvements in a recent Excel blog post.

VBA Solutions

Another area of customer feedback was around performance of Excel 2007 when driven by VBA object model. Some of the ways that the OM was driving Excel exposed performance issues that simply weren’t noticeable when just booting up and interacting with Excel. For example, typing values in a cell felt responsive and fast. But running VBA code to update 100K cells with new values (one at a time) showed that Excel 2007 had significantly slowed down.

After receiving a good number of examples from you folks, we were able to run, analyze, fix, re-run and continue iterative performance fixing in this area. Here’s a summary of our gains:

image

This graphic shows that nearly 70% of the workbooks in our VBA solutions test suite (the workbooks you all sent us) will see significant performance gains when run using Excel 2010, with about 30% of those solutions at least cutting the Excel 2007 times in half, if not more (some of those solutions clock in around 5% of the Excel 2007 time).

Closing Excel

Another, slightly smaller performance issue with Excel 2007 is the amount of time it takes to close down and free memory up to work with other programs, especially closing down after working with really large workbooks. With Excel 2010 we implemented a fast shutdown mechanism, and I hope you enjoy the snappiness.

In the next post I’ll talk about performance improvements we made around Large Data Set scenarios, Calculation, and Multi-Core investments.

Office Blogs Comments

Comments: (22) Collapse

  • Hi

    Interesting stuff. It would also be good to hear about the improvements in accessing Analysis Services data that was alluded to in the Gemini video on Channel 9.

    Charlie

  • Hi,

    Recently I've developed a workbook in 2003 that creates up to 100 shapes along with 110 series in a chart via VBA.  This workbook's time to update performance in 2007 is terrible, 25 seconds to update versus less than one in 2003.  I can turn the option to display the shapes off, and performance improves in 2007.  I didn't have this available to submit in January for your testing, would you still care to receive a copy of it to test?

    Jerry

  • It's really good to hear performance is taken seriously for Excel 2010. Performance in Excel 2007 is so bad that we are telling our customers to stick to Excel 2003. Simply saving a workbook takes more than 10 times as long as it did in Excel 2003. I hope Excel 2010 will make us regain confidence in Excel, something we have lost with Excel 2007. Performance is not the only issue, stability is a big issue as well. We use VBA to access the Excel features, could be that stability was not as good using VBA. Anyway, I don't know how many crash reports I have been submitting to Microsoft due to Excel 2007 crashing.

    @Jerry Betz:

    Do you use the Excel Application ScreenUpdating option to disable Excel screen updates while adding your shapes? It makes a huge difference in performance when you temporarily set it to false.

  • Chad,

    Is the VBA performance comparison comparing with Excel 2007 SP2 or earlier?

    (Excel 2007 SP2 speeds up VBA significantly)

  • Charlie: have a look at the Gemini team blog which just went up recently.  http://blogs.msdn.com/gemini/

  • YES! I do appreciate and read your blogs, mail and posts about EXCEL! I do not have always time to respond, but I thank you for your detailed and precise examinations of your free RSS feeds.

    I live in Amsterdam and fly back and forth, never-the-less I use EXCEL extensively; I still have the original EXCEL Microsoft tome (ca 1998) which is still a great help in hunting down set-ups.

    AJU, as the say here in Heineken land...

  • Charles: Yes, the test was done with Excel 2007 SP2

    Jerry: Yes, we're still interested in looking at your workbooks. Just send to xlfiles@microsoft.com and we'll be in touch. Thanks!

    Jean-Marc: if you have a workbook that consistently crashes, feel free to send it to us so we can take a look (xlfiles@microsoft.com).

    --Chad

  • Good work folks

    Did you do anything to address file opening times across a network?

    This post seems to address many of the issues I was dealing with in a recent 2007 implementation.

    I'm looking forward to the next post on multi core/HPC.

  • Unfortunately, a feature that was removed from the Shapes model is the use of the parameter UserInterfaceOnly:

    Since 2007 SP2, whether or not you protect a sheet with UserInterfaceOnly=True, you still have to unprotect the sheet to change the text of a shape (or some other properties of a Shape).

    Good job on the speed improvement though.

    Sebastien

  • Fix bugs - don't add features

    I'd like to see the EXCEL 2010 team deal with the long standing calculation idiocy of treating unary minus with a higher order of precedence than an ordinary minus sign.  Although i have trained myself to prepend a zero in front of every minus sign following a delimiting character, i am sure this is by no means something everyone even knows about.

    In excel --> 0-x^2 (is not equal to) -x^2 and it should be!!!

  • Hi

    Good to see that this page from me and Jim Rech was useful to make this better in 2010.

    www.rondebruin.nl/shape.htm

  • 1. Please increase the itteration limit to something much larger than 32767.  I occasionally build spreadsheets which require many hundreds of thousands of itterations to solve.  It's ridiculous to have to babysit my computer for 20 hours just to make sure it is always busy.

    2. It seems that for the most part, excel always calculates as a cascade from the top of the sheet to the bottom.  You should be able to reverse this or make it go from left to right, etc.  Furthermore, you should be able to tell excel in what order to calculate arbitrary regions of the spreadsheet.  

    3. Excel always uses the newest calculated value to solve a formula.  There should be an option where excel always uses the values from the previous itteration to calculate new values.  This is critical for solving problems which are not steady-state.

    4. Implement charting of streamlines and tensorial properties (charting vectors over space).  The conditional formatting feature does this effectively for scalars, but not vectors.

    5. Make external referencing run faster (external references of large worksheets at very high row numbers is phenomenally slow).

    6. Implement "dynamic" referencing.  That is, allow the results of a calculation dictate the location of a cell reference.

    7. Implement a reverse/mirrored referencing option when copying cells--that is, when I copy-drag a cell left, the cell reference location converges right instead of left and similarly for vertical copy-dragging.

    8. Allow more color gradients for conditional formatting.  Right now you can only use 3 colors.

    9. Integration.  Integration in excel isn't directly possible without coding or add-on packages.  For instance, the error function can be used with the analytical tools (?) add-in.  I would like to integrate formulas which are defined within the cell for some defined limit for one of the values.

    10. Better control of graphing multiple series--if the suite of data series are not consistent with each other and with the y axis values you cannot simply click on the chart and see the source of the data highlighted.  This is extremely annoying.

    11. Better control of graph marker/line formatting options.  It shouldn't require 2 hours of work to make each line to be formatted as colors on a gradient and make them all dotted or have the dots increase in size for different data series, etc.  Instead, this should be as easy as the conditional formatting features.

    This is all I can think of right now.  Hopefully excel developers will see this post.  Most of these should be relatively simple to implement.

  • Could you show speed comparisons with 2003 rather than 2007

    It does not make sense to compare a limited beta version(2010) with a public beta version(2007)

  • @Chris Grose -

    Can't you use a macro that calls Application.Calculate multiple times? OTOH, if you have lots of circular calculations, you'd likely be much better off using simulation software that can work with transfer functions.

    Actually, given most of your other points, it sure seems you'd be better off using MatLAB,  one of the FOSS work-alikes (GNU Octave or SciLab), or APL2.

  • @Harlan Grove

    The fact of the matter is that many of these changes are relatively simple and will expand the capabilities of excel massively.  You can bypass much of this by using matlab, as you say, but this would require significant knowledge of coding whereas Excel does not.

1 2  Next >
Comments

Comments: (loading) Collapse