Function Improvements in Excel 2010

Thanks to Jessica Liu for putting together the next few posts on function improvements.

In Excel 2010, we made many improvements to Excel's function library. Excel 2010 will feature an accurate and consistent function library while remaining compatible with previous versions of Excel. In this first blog post, I will be giving an overview of the work we did in this area as well as talk about the function accuracy improvements. Subsequent posts will go into the details of the consistency improvements as well as the backward compatibility story.

The first area we invested in was to improve the accuracy of functions. Over the years there have been various academic papers detailing issues in Excel's worksheet functions. In Excel 2003, we started the work to address the most serious of the issues reported in these papers and in Excel 2010 we have addressed even more of these issues. Our goal for Excel 2010 was to address the most significant function accuracy issues reported. For any function we modified, we corrected all known bugs relating to that function.

We implemented new algorithms in order to improve the accuracy of our statistical, financial and math functions. We worked very closely with industry experts to determine which algorithms to use as well as to validate these new algorithms. Our hope is that Excel 2010 users will be able to utilize functions in our library with confidence knowing that they have comparable accuracy to those of other statistical packages.

The other area we invested in was making our function library more consistent. This was in response to the other set of concerns voiced in these academic papers as well as by our users. Users have noted that there were consistency issues with Excel's function names and definitions. In Excel 2010, we will offer users a set of consistently and accurately named functions as well as function definitions that are consistent with user expectations. We have introduced over 50 new functions in order to do this.

Finally, the last piece of work we did in this area was to update the functions user interface. We have improved the function auto complete feature, and we have also made changes to support the new function set.

Improved Function Accuracy

For Excel 2010, we overhauled the function library and implemented completely new algorithms for many of our statistical, financial and math functions. The Excel team partnered with Frontline Systems, the Numerical Algorithms Group, and ScienceOps to select, implement and validate these algorithms.

The algorithms for calculating the follow statistical distribution functions have been modified or redesigned completely for better accuracy:

Binomial distribution

BINOMDIST, CRITBINOM

Chi squared distribution

CHIDIST, CHIINV

Exponential distribution

EXPONDIST

F distribution

FDIST, FINV

Gamma distribution

GAMMADIST, GAMMAINV

Hypergeometric distribution

HYPGEOMDIST

Lognormal distribution

LOGNORMDIST, LOGINV

Negative Binomial distribution

NEGBINOMDIST

Normal distribution

NORMDIST, NORMINV

Standard Normal distribution

NORMSDIST, NORMSINV

Poisson distribution

POISSON

Student's t distribution

TDIST,TINV

Weibull distribution

WEIBULL

The following financial functions have improved accuracy:

Cumulative interest paid on a loan

CUMIPMT

Cumulative principal paid on a loan

CUMPRINC

Interest payment for an investment

IPMT

Internal rate of return for a series of cash flows

IRR

Payment for a loan

PMT

Payment on principal for an investment

PPMT

The accuracy of these additional functions has been improved:

Hyperbolic arcsine

ASINH

Ceiling function

CEILING

Convert function

CONVERT

Error function

ERF

Complementary error function

ERFC

Floor function

FLOOR

Natural logarithm of the gamma function

GAMMALN

Geometric mean

GEOMEAN

MOD function

MOD

Random number function

RAND

Sample standard deviation

STDEVS

Sample variation

VARS

As part of the accuracy improvements, we will also accept a larger range of input values and as a result will be returning a wider range of results for certain functions. For example, the ERF and ERFC functions will now take in negative input values, and the MOD function will be able to take larger input values.

In the next post, I will talk about the changes we have made in Excel 2010 to improve the consistency of the function library.

 

try_office_today_418x80

Office Blogs Comments

Comments: (17) Collapse

  • Really please to see these improvements, and the whitepaper does add some great detail.

    http://wp.me/p2I5L-44

  • [Forgive me if this is another duplicate.  It will be my last try for now.]

    Has the following Excel 2003 problem been fixed in 2010 (or 2007)?

    INT(123456789 - 0.0000004) returns 123456789 instead of 123456788.

    This causes a problem in formulas like the following:  if A1 is =123456789-0.0000004, =A1-INT(A1) is negative unexpectedly, about -4.917E-07 when formatted as General.

    In contrast, INT(123456789 - 0.0000005) returns 123456788 as expected, and myInt(123456789 - 0.0000004) returns 123456788, where myInt() is the following UDF:

    Function myInt(x as Double) as Double

    myInt = Int(x)

    End Function

    Note that 123456789 - 0.0000004 is represented internally as about 123456788.999999,598, whereas 123456789 - 0.0000005 is about 123456788.999999,493 internally.  (The comma demarcates the first 15 significant digits.)

    So I suspect that the Excel INT algorithm is effectively, albeit perhaps unintentionally, rounding its argument to 15 significant digits before truncating to an integer.  It shouldn't.

    Indeed, the largest expression involving 123456789 that returns an incorrect INT value is 123456789 - 33*2^-26, which is represented internally as about 123456788.999999,508, whereas 123456789 - 34*2^-26 is about 123456789.999999,493 internally.

    As you might imagine, the problem is not limited to 123456789 and 0.0000004.  And the problem will not appear with some combinations that you might think are similar, e.g. 100000000 and 0.0000004.

    You need to look at the exact conversion of the internal binary representation -- that is, beyond the first 15 significant digits -- to determine whether or not to expect a problem.  I have VBA code that will help with that, if you need it.

Comments

Comments: (loading) Collapse