Continuing my series of posts about Word Automation Services, I wanted to talk specifically about the things we did around two of our focus areas of the first release: performance and scale.
One of the overall goals for Word 2010, both clients and servers, was ensuring that we built a version of Word that was better and faster than anything we've previously released. When we started building a server-based solution for manipulating Word documents, we took that message around performance to heart – it was clear that one of our primary objectives had to be ensuring that the service could scale to "server-like" loads; something that the previous "solution" of just running the client was ill-equipped to do, as it was optimized to be run on an interactive desktop by a single user. That goal meant answering a few important questions:
The answers to those questions resulted in work that fell into three distinct buckets: raw performance improvements, reducing resource contention, and the creation of a persistent queue.
The first set of improvements for Word Automation Services focused on its "raw speed" – how fast the service could process a single file. Our plan here primarily focused on answering the question: What does the desktop version of Word need to do that we don't need to do on the server? Each answer to that question gave us something to focus on removing from the service, improving its performance characteristics.
This meant doing an inventory of Word, of sorts, and realizing that we didn't need things that ranged from the incredibly obvious (Ribbon and other UI-related code) to the obscure (querying the registry for the friendly name of embedded objects, which we do so you can see it in the status bar when the object gets focus). It also meant revisiting assumptions as basic as the idea that we need to try to update every field in the document; given that a server process operates in a restricted-rights environment without access to remote files, the registry, or a user identity, we can eke out small gains by not updating INCLUDETEXT/AUTHOR/etc. fields at all.
In the end, we were able to create an engine that is between 10% (DOCX->DOCX) and 30% (DOCX->PDF) faster than the desktop application on similar hardware when performing the actions supported by the service (document conversion). Our focus on a few core scenarios enabled us to optimize our engine for those tasks.
If you've ever tried to use the desktop version of Word to do server-side automation, I'm sure you've run into the traditional resource contention problems: error dialogs saying that "normal.dot is in use", severe slowdowns when multiple processes are running, etc.
When we set out to build a server-ready version of Word, it was clear that this class of issues was something we had to tackle – the service needed to be able to scale efficiently to machines with 8 cores of processing power (high-end today, widely-available in the not-so-distant future).
This meant a long process of measurement and analysis in which we looked at our scale barriers (GDI contention, disk contention, etc.) and worked through them one-by-one – doing things like making sure we never depended on a disk-based resource (temporary files, etc. needed to be memory based), as well as optimizing our use of system-wide resources like GDI locks.
This work didn't make Word faster, but it did result in a service that scales linearly up to four simultaneous conversions on a single machine, and which can be scaled out among many machines – a significant improvement over desktop Word, and one we'll continue to build on in future versions.
Even with all of those improvements in place, it was obvious that our service would often be unable to keep up with incoming requests – if you ask to convert 10,000 Word documents to PDF, even the fastest engine needs some time to process that workload.
To handle this, we built the service to keep a queue, enabling us to receive peaks of work and process them as resources allowed; knowing that we're processing arbitrary input documents, we then went a step further and made this queue persistent, so that a single rogue document, machine hiccup, etc. didn't cause a job of thousands of items to stop mid-processing with no clear indication of what was completed and what was not.
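To make the idea of a persistent queue concrete, here is a minimal Python sketch. It is not how Word Automation Services is implemented; the SQLite table, the `PersistentQueue` class, the status names, and the `convert` callback are all assumptions made purely for illustration. The point it tries to show is that each item's state is written to durable storage as it changes, so a failure partway through still leaves a clear record of what finished and what did not.

```python
import sqlite3


class PersistentQueue:
    """Hypothetical job queue whose item states survive a process restart."""

    def __init__(self, path):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS items ("
            " id INTEGER PRIMARY KEY,"
            " source TEXT NOT NULL,"
            " status TEXT NOT NULL DEFAULT 'queued')"
        )
        # Assumption: items left 'in_progress' by an interrupted run are retried.
        self.db.execute(
            "UPDATE items SET status = 'queued' WHERE status = 'in_progress'"
        )
        self.db.commit()

    def add(self, source):
        self.db.execute("INSERT INTO items (source) VALUES (?)", (source,))
        self.db.commit()

    def claim(self):
        """Return the oldest queued (id, source) pair, or None if none remain."""
        row = self.db.execute(
            "SELECT id, source FROM items WHERE status = 'queued' "
            "ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        self.db.execute(
            "UPDATE items SET status = 'in_progress' WHERE id = ?", (row[0],)
        )
        self.db.commit()
        return row

    def finish(self, item_id, ok):
        self.db.execute(
            "UPDATE items SET status = ? WHERE id = ?",
            ("succeeded" if ok else "failed", item_id),
        )
        self.db.commit()

    def summary(self):
        """Count items per status, e.g. to report what completed and what did not."""
        rows = self.db.execute("SELECT status, COUNT(*) FROM items GROUP BY status")
        return dict(rows.fetchall())


def run(queue, convert):
    """Drain the queue; a bad document fails on its own instead of stopping the job."""
    while (item := queue.claim()) is not None:
        item_id, source = item
        try:
            convert(source)  # 'convert' stands in for the real conversion step
            queue.finish(item_id, ok=True)
        except Exception:
            queue.finish(item_id, ok=False)
```

Calling `run(queue, convert)` and then `queue.summary()` gives a per-status count for the whole job. One simplification to be aware of: this sketch retries interrupted items unconditionally on restart, so a document that reliably crashes the process would be retried forever; a real service would need a retry limit or similar safeguard.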
We'll be publishing more precise data on how the server scales both up and out as part of Capacity Planning guidance for SharePoint 2010; laying a solid foundation here was definitely one of our goals.
- Tristan
Comments (3):
I'm sorry, I don't understand. I'm a deaf woman, 56 years old, and I need a visual example of how to do it. I'm not good at reading or writing English, and it takes me a long time to understand. Thank you for your support; please email me in easy English words. Regards, Malka
I was wondering if MS has any plans to expand this great functionality to other office products such as Excel and PPT/Visio printing / Outlook MSG and EML? Regards,
Mark
@Mark: Thanks for the feedback - I'll definitely pass this on to those folks as a request to see this in more places. - Tristan