You can use your favorite social network to register or link an existing account:
Or use your email address to register without a social network:
Sign in with these social networks:
Or enter your username and password
Forgot your password?
Yes, please link my existing account with for quick, secure access.
No, I would like to create a new account with my profile information.
Today's author, Dany Hoter, a Product Planner on the Excel team, talks about some performance characteristics he recently discovered using various methods to manipulate large ranges using VBA.
You have a large range in Excel with data. Let's say it contains 100,000 rows and 50 columns for each row (Yes you are using Excel 2007 of course). So altogether you have 5,000,000 cells. Columns A to F have some alphanumeric data that you need analyze and based on the combination of values for each row you need to use the numeric values in G to H to do some calculations and store the results in columns I and J. You could place 200,000 formulas in I and J but you see that a spreadsheet with such a volume of formulas gets very sow and consumes huge amounts of memory.
You decide to try and solve it in a piece of VBA code. The question is how to implement such a task in the most efficient way?
How can you scan a range in Excel, read the values in some cells, and change some others?
Let's assume that the range you want to read starts at A1
The code looks something like this:
Dim DataRange as Range ' Could also be Dim DataRange as Object Dim Irow as Long Dim MaxRows as Long Dim Icol as Integer Dim MaxCols as Long Dim MyVar as Double Set DataRange=Range("A1").CurrentRegion MaxRows= Range("A1").CurrentRegion.Rows.Count MaxCols= Range("A1").CurrentRegion.Columns.Count For Irow=1 to MaxRows For icol=1 to MaxCols MyVar=DataRange(Irow,Icol) If MyVar > 0 then MyVar=MyVar*Myvar ' Change the value DataRange(Irow,Icol)=MyVar End If Next Icol Next Irow
Many VBA developers learned VBA techniques from macro recording.
When using relative reference the generated VBA code creates statements like:
ActiveCell.Offset(0, -1).Range("A1").Select
As a consequence many developers adopt this technique and use the ActiveCell or selection ranges to move from cell to cell in code and read or write the cell values. The code will look like this:
Dim Irow As Long Dim MaxRows As Long Dim Icol As Integer Dim MaxCols As Long Dim MyVar As Double Range("A1").Select MaxRows = Range("A1").CurrentRegion.Rows.Count MaxCols = Range("A1").CurrentRegion.Columns.Count For Irow = 1 To MaxRows For Icol = 1 To MaxCols MyVar = ActiveCell.Value If MyVar > 0 Then MyVar=MyVar*Myvar ' Change the value ActiveCell.Value = MyVar End If ActiveCell.Offset(0, 1).Select ' Move one column to the right Next Icol ActiveCell.Offset(1, -MaxCols).Select ' Move one rows down and back to first column Next Irow
This technique copies the values from all cells in the range into a variable in memory, manipulates the values inside this variable and if needed moves the values back to the range after manipulation.
Here is the code this time:
Dim DataRange As Variant Dim Irow As Long Dim MaxRows As Long Dim Icol As Integer Dim MaxCols As Long Dim MyVar As Double DataRange = Range("A1").CurrentRegion.Value ' Not using set MaxRows = Range("A1").CurrentRegion.Rows.Count MaxCols = Range("A1").CurrentRegion.Columns.Count For Irow = 1 To MaxRows For Icol = 1 To MaxCols MyVar = DataRange(Irow, Icol) If MyVar > 0 Then MyVar=MyVar*Myvar ' Change the value DataRange(Irow, Icol) = MyVar End If Next Icol Next Irow Range("A1").CurrentRegion = DataRange ' writes back the result to the range
Another difference is that this method is blazing fast compared to the two others.
I compared the three methods on relatively large ranges and here are the results:
Method
Operation
Cells/Sec
Variant
Read
1,225,490
Write
714,286
Read/Write
263,158
Range
250,000
1818
1,852
Offset
206
200
203
As you can see using a variant variable is much faster especially when changing cells. Even if the calculation can be done with Excel formulas, in some cases this method is the only one acceptable because using a very large number of formulas can become very slow.
Obviously the one method to avoid is moving the ActiveCell using Offset.
Comments: (6) Collapse
Great article Joseph, more like this here please.
Not sure I understand the timings though. How can read/write end up faster than just write in the last two? Overall though timing differences look largely as I would expect.
Neal
Good article indeed. One minor issue though: is there any point in declaring Icol as an integer?. Should be of type Long I think. In fact, its being argued anyway what the point is of using an Integer these days. Memory space is not limiting and there's no calculation speed difference in using Integers or Longs.
Cheers,
Rembo
Nice to see that other people jump into the same problem.
In a database Search, I do pretty complicated calculations on the values found in that database, line by line, and return the result to the database.
And according to your study I use exactly the wrong method: the offset method.
Depending on the PC it takes inbetween 7 and 30 minutes to do the 58000 lines, and that's too long for a What-If kind of study
I build a small testfile to discuss this. Is there a way to share this file, and discuss?
Hi Dany
Thanks for posting.
I think most devs would manage state if reading /writing a big block.
If you set
app.calc = false
app.screenupdating = false
app.enableevents = false
Then, whilst you still get the same overall result (variant is the fastest) the margin is much reduced
on read/write range is 5x variant, select is 10x slower.
If you read using an xll call you get down to 2x the variant, but you maintain closer integration with the grid.
I think you demonstrate a good use of the technique, sadly I have seen people recreating Excel functions in VBA to do all the calculations and avoid the calling overhead. At which point its hard to understand why they used Excel at all instead of .net + grid control.
cheers
Simon
Hi Dany,
Its faster for numeric values to use Range.Value2 rather than Range.Value when assigning to a Variant.
Also using .Value2 does not force an implicit conversion of cells formatted as currency to VBA currency data type (and dates to VBA date type). And the implicit conversion to currency can cause loss of accuracy.
Just a thought, perhaps this entry could go one step further in explaining why the Variant approach is so much faster? I do the exact same thing here:
mrexcel.com/.../showthread.php
And it's not by accident. What you're doing is coercing the Range of Values to an intrinsic data type, a Variant Array. Intrinsic data types are far lighter weight in terms of memory than sophisticated Objects, like Range Objects. They don't support the number of properties, etc... That a Range has to commit to memory.
Arrays are extremely fast to iterate through, even compared to even simpler (than a Range Collection) Objects like a Recordset.
It's an important additional concept in addition to the findings posted here.
Best,
Nate Oliver
Microsoft Excel MVP
Comments: (loading) Collapse