Word Automation Services: How it Works

In my first two posts in the series on Word Automation Services, I talked about what it is and what it does – in this post, I wanted to drill in on how the service works from an architectural standpoint, and what that means for solutions built on top of it.

Word Power on the Server

The most important component of Word Automation Services is getting a core engine with 100% fidelity to desktop Word running on the server – accordingly, much of our effort was focused on this task. If you’ve ever tried to use desktop Word on the server, you’re acutely aware of the work that went into this – we needed to “unlearn” many of the assumptions of the desktop, e.g.:

  • Access to the local disk / registry / network
  • Assumption of running in user session / with an associated user profile
  • Ability to show UI
  • Ability to perform functions on “idle”

This means architecture changes that run the gamut from huge, obvious ones (e.g. ensuring that we never write to the hard disk, in order to avoid I/O contention when running several processes in parallel) to small, unexpected ones (e.g. ensuring that we never recalculate the AUTHOR field, since there’s no “user” associated with the server conversion).

What this means for you: we’ve built an engine that’s truly optimized for the server. It’s faster than the client in terms of raw speed, and it scales up to multiple cores and out across server farms through load balancing. Multi-core scaling works because we eliminated both resource contention and the cases where the app assumed it lived “alone” (access to normal.dotm is one example that’s familiar to folks who’ve tried to do this before).

Plugging into SharePoint Server 2010

Having this engine is one step, but we also needed to integrate it into SharePoint Server 2010, enabling us to work within a server ecosystem with other Office services.

To do this, we needed an architecture that enabled us to both:

  1. Have low operational overhead when configured, leaving CPU free to perform actual conversions (“maximum throughput”)
  2. Prevent our service from eating all the resources on an application server whenever new work was provided (“good citizenship”)

The result is a system that’s asynchronous in nature (something I’ve alluded to in previous posts). Essentially, the system works like this:

[Figure: flowchart of how the service works]

  1. You submit a list of file(s) to be converted via the ConversionJob object in the API
  2. That list of files is written into a persisted queue (stored as a SQL database)
  3. On regular (customizable) intervals, the service polls for new work that needs to be done and dispenses this work to instances of the server engine
  4. As the engine completes these tasks, it updates the information in the queue (i.e. marks success/failure) and places the output files in the specified location
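The four steps above can be modeled with a small, self-contained sketch. This is a toy in-memory model, not the service's implementation: the class names, the status strings, and the `convert` stub (which "fails" any file with "corrupt" in its name) are all assumptions made up for illustration, and a plain list stands in for the SQL-backed queue.

```python
from dataclasses import dataclass, field


@dataclass
class QueueItem:
    source: str
    destination: str
    status: str = "NotStarted"  # NotStarted -> InProgress -> Succeeded / Failed


@dataclass
class ConversionQueue:
    """In-memory stand-in for the persisted (SQL-backed) queue."""
    items: list = field(default_factory=list)

    def submit(self, files):
        # Steps 1-2: record (source, destination) pairs and return immediately.
        for source, destination in files:
            self.items.append(QueueItem(source, destination))

    def take_pending(self, limit):
        # Step 3: on each polling interval, pull at most `limit` unstarted items.
        batch = [i for i in self.items if i.status == "NotStarted"][:limit]
        for item in batch:
            item.status = "InProgress"
        return batch


def convert(item):
    # Hypothetical engine stub: any file named "corrupt..." fails to convert.
    return "corrupt" not in item.source


def poll_once(queue, items_per_poll):
    # Steps 3-4: dispatch one batch and write each outcome back to the queue.
    for item in queue.take_pending(items_per_poll):
        item.status = "Succeeded" if convert(item) else "Failed"


queue = ConversionQueue()
queue.submit([("a.docx", "a.pdf"), ("corrupt.docx", "c.pdf"), ("b.docx", "b.pdf")])

poll_once(queue, items_per_poll=2)  # first timer tick handles two items
statuses_after_first_poll = [i.status for i in queue.items]

poll_once(queue, items_per_poll=2)  # second tick picks up the remainder
statuses_after_second_poll = [i.status for i in queue.items]
```

Note that `submit` returns before any conversion happens, and the third file waits for the second polling interval. Those two properties are what the rest of this post is about.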

What that Means

That has two important consequences for solutions:

  • First, it means that you don’t know immediately when a conversion has completed – the Start() call for a ConversionJob returns as soon as the job is submitted into the queue. You must monitor the job’s status (via the ConversionJobStatus object) or use list-level events if you want to know when the conversion is complete and/or perform actions post-conversion.
  • Second, it means that maximum throughput is defined by the frequency with which the queue is polled for work, and the amount of new work requested on each polling interval.

Dissecting those consequences a little further:


The asynchronous nature of the service means you need to set up your solutions to use either list events or the job status API to find out when a conversion is complete. For example, if I wanted to delete the original file once the converted one was written, as commenter Flynn suggested, I would need to do something like this:

Now, clearly using Thread.Sleep isn’t something you’d want to do if this is going to happen on many threads simultaneously on the server, but you get the idea – a workflow with a Delay activity is another example of a solution to this situation.
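As a language-neutral illustration of that wait-then-act pattern, here is a minimal sketch. It does not call the SharePoint object model: `get_status` is a hypothetical callable standing in for a job status lookup, the dictionary keys are invented for this example, and the "delete" step is just a placeholder list.

```python
import time


def wait_for_job(get_status, poll_seconds=5.0, timeout_seconds=600.0, sleep=time.sleep):
    """Poll until every item in the job has either succeeded or failed.

    `get_status` is a hypothetical callable returning a dict with
    'count', 'succeeded' and 'failed' keys.
    """
    waited = 0.0
    while True:
        status = get_status()
        if status["succeeded"] + status["failed"] >= status["count"]:
            return status
        if waited >= timeout_seconds:
            raise TimeoutError("conversion job did not finish in time")
        sleep(poll_seconds)  # blocks this thread, which is the cost noted above
        waited += poll_seconds


# Usage with a fake job that finishes on the third status check.
snapshots = iter([
    {"count": 1, "succeeded": 0, "failed": 0},
    {"count": 1, "succeeded": 0, "failed": 0},
    {"count": 1, "succeeded": 1, "failed": 0},
])
naps = []  # records each sleep instead of really sleeping
final = wait_for_job(lambda: next(snapshots), poll_seconds=5.0, sleep=naps.append)

deleted = []
if final["failed"] == 0:
    deleted.append("original.docx")  # placeholder for deleting the source file
```

The timeout matters here: because the queue is only polled at intervals, a job can sit untouched for a while, and an unbounded loop would tie up a thread for that whole time.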

Maximum Throughput

The maximum throughput of the service is essentially mathematically defined at configuration time:

By default, these values are:

You can tune the frequency to as low as one minute, or increase the number of files or the number of worker processes, to raise this number. The right setting depends on how you want to trade off throughput against CPU utilization: you might keep it low if the conversion process is low-priority and the server is used for many other tasks, or crank it up if throughput is paramount and the server is dedicated to Word Automation Services.
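To make the arithmetic concrete, here is a rough sketch. The formula is an assumption inferred from the description in this post (work requested per polling interval, multiplied by the number of worker processes, divided by the polling frequency), and the numbers are illustrative only, not the service defaults.

```python
def max_throughput_per_minute(files_per_process, worker_processes, frequency_minutes):
    """Upper bound on conversions per minute under the assumed formula."""
    if frequency_minutes < 1:
        raise ValueError("frequency cannot be tuned below one minute")
    return files_per_process * worker_processes / frequency_minutes


# Illustrative settings only; these are not the service defaults.
baseline = max_throughput_per_minute(
    files_per_process=100, worker_processes=2, frequency_minutes=5)
faster_polling = max_throughput_per_minute(
    files_per_process=100, worker_processes=2, frequency_minutes=1)
more_workers = max_throughput_per_minute(
    files_per_process=100, worker_processes=4, frequency_minutes=5)
```

Under these assumptions the baseline tops out at 40 files per minute, polling every minute raises that to 200, and doubling the worker processes raises it to 80. These are ceilings set by configuration; actual throughput also depends on how long each document takes to convert.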

For server health, we recommend following two constraints in this equation:

  • Number of worker processes <= # of CPUs - 1
  • # of items / frequency <= 90

Of course, by adding CPU cores and/or application servers, this still allows for an unbounded maximum throughput.
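The two recommended limits are easy to check before you change a configuration. The helper below is a hypothetical sketch (its function and parameter names are not part of any SharePoint API) that simply encodes the two inequalities listed above.

```python
def check_recommended_limits(worker_processes, cpu_count, items_per_interval, frequency_minutes):
    """Return a list of warnings for settings outside the recommended limits."""
    warnings = []
    # Constraint 1: leave at least one CPU free for everything else.
    if worker_processes > cpu_count - 1:
        warnings.append(
            f"{worker_processes} worker processes exceeds {cpu_count} CPUs - 1")
    # Constraint 2: items requested per minute of polling interval stays at or below 90.
    rate = items_per_interval / frequency_minutes
    if rate > 90:
        warnings.append(f"{rate:g} items per minute exceeds 90")
    return warnings


ok = check_recommended_limits(
    worker_processes=3, cpu_count=4, items_per_interval=300, frequency_minutes=15)
too_aggressive = check_recommended_limits(
    worker_processes=4, cpu_count=4, items_per_interval=100, frequency_minutes=1)
```

Here `ok` comes back empty, while `too_aggressive` reports both problems: four worker processes on a four-CPU machine, and 100 items requested every minute.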

That’s a high-level overview of how the system works – in the next post, I’ll drill into a couple of scenarios that illustrate typical uses of the service.

- Tristan