In the following video, we talk about one such prototype that we put together quickly for one of our customers. The requirements are typical of a large-scale document (correspondence) generation system: high-volume document generation, the ability to author dozens of templates, the ability to generate documents by binding templates to data from business systems, support for multiple document formats, and the ability to create workflows that support business processes.
Here we describe a solution for automated document generation using the Microsoft Office system. By combining out-of-the-box functionality such as Content Controls and the Open XML SDK with a little customization for your business rules, you can automate template creation, document generation and document conversion, and (using SharePoint) provide Web-based document management.
Read on for more about this solution…
Starting with Office 2007, Microsoft introduced the notion of Content Controls (a big improvement over the bookmarks used in Office 2003). Content Controls allow a document template to be built from pre-defined placeholders for content. These include text blocks, drop-down lists, combo boxes, date pickers, etc.
Content Controls can be bound to XML elements, effectively providing the ability to define the document template using a “semantic markup.” The screenshot below further illustrates this concept. The pane on the right shows an XML tree structure (this could be the schema returned by the underlying Policy service). Elements of the XML tree can be mapped to the content controls contained within the document.
Once the mapping is in place, new documents are then programmatically generated as copies of the template document. When the new document is opened, the placeholder content controls in the document are populated with business data from the underlying data feed.
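The copy-and-populate step can be sketched in a few lines. This is a minimal illustration rather than the prototype's implementation (which relies on the Open XML SDK): it treats the .docx file as the ZIP package it is and swaps the custom XML data part in a copy of the template. The part name `customXml/item1.xml` and the function name `generate_document` are assumptions; a real template should be inspected to find the part its content controls are bound to.

```python
import io
import zipfile

# Assumed part name: Word often stores the first custom XML part here,
# but the actual name must be confirmed against the real template.
DATA_PART = "customXml/item1.xml"


def generate_document(template_bytes: bytes, data_xml: str,
                      data_part: str = DATA_PART) -> bytes:
    """Return a copy of the template package with its data part replaced."""
    source = zipfile.ZipFile(io.BytesIO(template_bytes))
    if data_part not in source.namelist():
        raise KeyError(f"template has no part named {data_part!r}")

    output = io.BytesIO()
    with source, zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as target:
        for item in source.infolist():
            if item.filename == data_part:
                # Swap in the business data for the placeholder data.
                payload = data_xml.encode("utf-8")
            else:
                # Every other part is copied unchanged.
                payload = source.read(item.filename)
            target.writestr(item.filename, payload)
    return output.getvalue()
```

Because the template itself is never modified, the same template bytes can be reused for every generated document.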
Our recommended approach would be to provide a clean separation between template creation and document generation using a well-defined REST-based API. This API would support both interactive and batch-oriented document generation. Since the API will be self-describing using the REST/Hypermedia pattern, we expect it to be callable from any existing or new service application.
So what do we mean by a “self-describing” API? In addition to providing methods to generate documents by passing in the required data, the API will also provide a machine-readable description of all the parameters needed by a given template. The benefit of this approach is that service clients don’t need hard-coded template details. Additionally, the templates can be changed without impacting the clients.
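To make the idea concrete, here is a rough sketch of what such a machine-readable description might look like and how a client could use it. The descriptor layout, the template ID, the parameter names, and both function names are hypothetical; the actual API may describe its templates quite differently.

```python
import json

# Hypothetical descriptor for one template; all names are illustrative only.
TEMPLATE_DESCRIPTOR = {
    "template_id": "policy-renewal-letter",
    "parameters": [
        {"name": "policy_number", "type": "string", "required": True},
        {"name": "holder_name", "type": "string", "required": True},
        {"name": "renewal_date", "type": "date", "required": False},
    ],
}


def describe_template(descriptor: dict) -> str:
    """Serialize the descriptor as JSON, as a service might return it."""
    return json.dumps(descriptor, indent=2)


def missing_parameters(descriptor: dict, supplied: dict) -> list:
    """List the required parameter names that the client did not supply."""
    return [
        param["name"]
        for param in descriptor["parameters"]
        if param["required"] and param["name"] not in supplied
    ]
```

A client that reads the descriptor at run time can build its request, or report what is missing, without knowing anything about the template in advance. If a parameter is later added to the template, only the descriptor changes.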
Under the covers, this API will be dependent on two Office technologies:
1) Open XML SDK: This SDK allows the XML data mapped to the content controls to be replaced with data supplied by the underlying data feeds.
2) Word Automation Services: This service is part of SharePoint and offers a highly scalable, server-side approach for converting Microsoft Word documents into other formats such as PDF.
The REST API will hide these implementation details from the service client. For instance, the service client will simply call a document generation method that takes a template ID and the necessary parameters (self-described, as indicated earlier). Upon successful completion of the method, the URL of the generated document will be returned to the client. The URL could point to a generated document resource located in a database, a SharePoint document library or other storage, as needed. The service client application can then render the generated document as needed. While this example describes the interactive scenario, the batch scenario would work similarly.
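The call pattern described above can be illustrated with an in-memory stand-in for the service. Everything here is hypothetical (the `DocumentService` class, its method names, the URL layout); it only shows the shape of the interaction: pass a template ID and parameters, get a URL back, and reuse the same call for batches.

```python
import uuid


class DocumentService:
    """In-memory stand-in for the document generation API (hypothetical)."""

    def __init__(self, base_url: str, templates: dict):
        self.base_url = base_url.rstrip("/")
        self.templates = templates  # template ID -> set of required parameter names
        self.store = {}             # document ID -> generated content

    def generate(self, template_id: str, params: dict) -> str:
        """Generate one document and return the URL of the stored result."""
        required = self.templates[template_id]
        missing = sorted(required - set(params))
        if missing:
            raise ValueError(f"missing parameters: {missing}")
        doc_id = uuid.uuid4().hex
        # A real service would bind the data to the template at this point.
        self.store[doc_id] = {"template": template_id, "data": dict(params)}
        return f"{self.base_url}/documents/{doc_id}"

    def generate_batch(self, template_id: str, rows: list) -> list:
        """Batch generation reuses the interactive call, one row at a time."""
        return [self.generate(template_id, row) for row in rows]
```

The client only ever sees template IDs, parameters and URLs, so where the documents are stored can change without affecting the client.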
Web-Based Editing of Generated Documents
Generated documents can be edited within the browser using Microsoft Office Web Apps, the online companion to Microsoft Word. While lightweight editing is possible, a number of scenarios are not supported by Office Web Apps:
- Editing documents and using track changes to mark revisions.
- Editing Word objects, such as content controls.
- Using macros in Word, Excel and PowerPoint documents.
I would like to thank Sandeep Nahta from AIS for his help with building this prototype.