Offloading UDF’s to a Windows HPC Cluster

Thanks to Gabhan Berry for putting together this series on high performance computing.

In my previous post I discussed how we have enabled workbooks to be calculated on a Windows HPC cluster. In this post, I’ll describe the other piece of HPC integration we have developed in Excel 2010: offloading the calculation of UDFs to a Windows HPC cluster.


User defined functions are one of the most popular pieces of Excel extensibility. UDFs enable customers to create their own worksheet functions and use them in formulas just like they use Excel’s built-in functions. Over the years, customers have built a wide variety of UDFs to solve different problems.

Prior to Excel 2007, Excel always calculated UDFs in a single threaded manner. In 2007, we added the ability for UDFs to participate in multi-threaded recalculation (MTR). This meant that Excel was able to evaluate UDFs simultaneously on different threads.

Now in Excel 2010 we have further evolved the capabilities of UDFs. Excel is now able to delegate the calculation of particular UDFs to a compatible compute cluster. This means that instead of the UDF being calculated on the desktop PC, the UDF is actually calculated remotely on a compute cluster. When Excel delegates the evaluation of the UDF to the cluster it does so asynchronously. This means that Excel does not have to wait for the UDF to return a result before Excel continues calculating other parts of the workbook. When the UDF completes, a callback is made into Excel to let Excel know that the value is available. Excel will merge the value into the calculation tree when it is ready.

Because these UDFs are evaluated asynchronously, Excel can make multiple requests for UDF evaluation to the compute cluster within small periods of time. This is where the resources of the compute cluster come into play. In an earlier post I described the parallelization capabilities of compute clusters. These capabilities enable the compute cluster to evaluate multiple UDFs at the same time. This has the effect of lowering the overall time needed to perform multiple UDF evaluations which in turn can reduce the time needed to calculate a workbook containing long running UDFs.

One point I’d like to make is that offloading UDFs is not a silver bullet for slow workbooks or slow UDFs. There are multiple factors that affect whether offloading UDFs to a compute cluster will actually result in a workbook being calculated faster than it would have been had the UDFs been evaluated on the desktop. I won’t go into the details of that topic in this post. I just wanted to raise my hand and make you aware of this at the outset.

Connectors, Containers and XLLs

The first point to cover is that only UDFs implemented in an XLL can be offloaded to a cluster. UDFs implemented in VBA or in a COM automation add-in cannot be offloaded to the cluster. We’ll discuss how you make an XLL UDF cluster safe a little later in the post.

In order for Excel to evaluate a UDF on the cluster, a component called a cluster connector must be installed on the desktop PC running Excel. This component acts as the proxy between Excel and the cluster. Excel doesn’t actually understand how to perform the remote call to the cluster, all it understands is how to interface with a connector. It is the connector that knows how to perform the remote call to the cluster and how to pass the result back to Excel.

The UDF is evaluated on the cluster using another component, called a container. The container is responsible for loading the XLL, invoking the UDF and returning the result back to the connector on the desktop.

The XLL that implements the UDF must be installed on both the cluster and on the desktop PC. This is because Excel needs to load the UDF prototypes into memory and also because the user is able to configure whether cluster safe UDFs are calculated on the cluster or on the desktop (we’ll discuss this a little later in this post).


The connector and container are components that customers and/or ISVs will develop. The Windows HPC team will be shipping a connector and container for the upcoming release of Windows HPC.

Enabling UDF Offloading

The user can specify whether UDFs registered as cluster safe are actually calculated locally or offloaded to the cluster. In the Formulas section of the Advanced Excel Options window there is a new setting called Allow user-defined XLL functions to run on a compute cluster. This option is enabled only if there is at least one connector installed. The user can also select which connector Excel should use by selecting the connector’s name in the Cluster Type dropdown.


By turning on this option the user is telling Excel that the UDFs supported by the specified connector should be offloaded to the cluster rather than calculated locally.

Cluster Safe UDFs

As mentioned before, cluster safe UDFs must be implemented in an XLL. You cannot create a cluster safe UDF in VBA or in a COM automation add-in. Also, cluster safe UDFs cannot interact with Excel in any way except to return their value. This means that a UDF that (for example) queries workbook properties cannot execute on the cluster. All of the UDF’s input data must be passed as parameters to the UDF and all output data must be passed to Excel as the return value of the UDF.

We also need to be careful about how we store and use data (such as state) across multiple UDF invocations. Due to the nature of how the UDF will be evaluated on the cluster, any data stored as global variables inside the XLL will not be persisted across UDF invocations. A better option is to store state (if we must have state) in another manner, such as in a database.

To inform Excel that the UDF is cluster safe, we append a & character at the end of the type text parameter to the xlfRegister API. For example, if our cluster safe UDF returned an int and took an int as a parameter, the type text to register the UDF as cluster safe would be: “JJ&”.

Connector and Container

Together, the connector and the container implement the offloading mechanism. The connector is installed on the desktop computer and understands how to interface with Excel and with the particular cluster that it is connected to. Because the specifics of communicating with the cluster are implemented in the connector component and not inside Excel, Excel is able offload UDFs to any type of cluster so long as there is an appropriate connector and container for that type of cluster.

Whenever Excel needs to calculate a UDF that is marked as cluster safe (and the user has turned on the option to offload UDFs to the cluster), Excel needs to invoke certain functions on the connector to instruct the connector to perform the calculation on the cluster.

How the container is implemented is very much vendor specific. The container does not interface with Excel at all and is free to execute the UDFs in any way. However, because cluster safe UDFs must be also implemented in an XLL on the desktop (even if they are just stub implementations) it is convenient to make the container able to load and invoke UDFs in an XLL. That way the same XLL can be installed on the desktop and on the cluster. This is what the Windows HPC container does. But other cluster vendors may choose to build their container differently.

For the sake of brevity, I have skipped over the details of the connector API. I really just want to give you a high level understanding of how offloading UDFs works and what would be involved in rolling your own solution. The new Excel 2010 XLL SDK contains all the detailed documentation you would need. And remember that the Windows HPC team will be shipping a connector and container for Windows HPC. So if you have a Windows HPC cluster you will be able to use those components without having to care too much about how they work internally.

Summarizing …

Let’s end the post with a summary of the main points.

Excel 2010 is able to offload the calculation of cluster safe UDFs to a compatible cluster. By doing this, multiple long running UDFs can be calculated simultaneously and the overall time needed to calculate a workbook can be reduced.

Excel doesn’t connect directly to the cluster; it uses a proxy component called a connector. The connector understands how to interface with both Excel and the cluster. The connector communicates with Excel using a new API which will be documented in the Excel 2010 XLL SDK. How the connector interfaces with the cluster is vendor specific and the Windows HPC team will be shipping a working connector and container with their upcoming release.