Designing Document-Oriented Services

Sean Brydon, Smitha Kangath, Sameer Tyagi

Problem Description

A Java[TM] 2 Platform, Enterprise Edition (J2EE[TM] platform) 1.4 application might need to pass an XML document using the Java API for XML-based RPC (JAX-RPC). For example, an application might have a service that accepts a PurchaseOrder.xml document or a service that accepts an Invoice.xml document. This solution focuses on document-based services, where the service models a document-processing problem, rather than on procedure-based services. A developer might need to create web services in which the data exchanged takes the form of XML documents conforming to pre-negotiated XML schemas. These schemas might be application specific or defined by vertical industry standards bodies. This solution does not distinguish between synchronous and asynchronous processing of the XML document; it focuses on passing the document, and it applies to either processing model. One of the key design choices is determining which type to use in the interface artifacts to represent the XML business document being exchanged.

When designing a web service interaction where a client and service exchange an XML document, developers need to choose a type to represent the XML document in the service interface. The type for the XML document being exchanged as a parameter or as a return value is expressed in the WSDL for the service, and in the Java interface and implementation classes of the service. For a service that receives an XML document such as PurchaseOrder.xml, several choices are available. The interface can use a string as the type for the purchase order document. It can use an object style and embed the description of all the document and schema details in the interface. It can use an XML fragment to represent the purchase order document by choosing a type such as xsd:anyType. Other choices exist as well. This is a key decision when designing the service.
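The difference between two of these choices can be sketched at the Java interface level. The following is a minimal illustration, not code from the catalog: the names PurchaseOrder, StringPurchaseOrderService, TypedPurchaseOrderService, and submitPO are hypothetical, and the PurchaseOrder class stands in for a type that a tool would normally generate from a schema. The interfaces extend java.rmi.Remote and declare RemoteException, as a JAX-RPC service endpoint interface does.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;

// Hypothetical value type standing in for a class generated from a
// purchase order schema. It follows the JavaBeans pattern (no-arg
// constructor, getters, setters).
class PurchaseOrder {
    private String orderId;
    private int quantity;

    PurchaseOrder() { }

    String getOrderId() { return orderId; }
    void setOrderId(String orderId) { this.orderId = orderId; }
    int getQuantity() { return quantity; }
    void setQuantity(int quantity) { this.quantity = quantity; }
}

// String strategy: the interface only says "a string goes in".
// Nothing in the signature describes the document that is expected.
interface StringPurchaseOrderService extends Remote {
    String submitPO(String purchaseOrderXml) throws RemoteException;
}

// Schema-defined type strategy: the parameter type itself describes
// the structure of the document being exchanged.
interface TypedPurchaseOrderService extends Remote {
    String submitPO(PurchaseOrder order) throws RemoteException;
}

// In-memory stand-in used only to exercise the typed interface.
// It is not a deployed endpoint.
class TypedPurchaseOrderServiceImpl implements TypedPurchaseOrderService {
    public String submitPO(PurchaseOrder order) {
        return "ACCEPTED:" + order.getOrderId();
    }
}
```

With the typed interface, client code builds a PurchaseOrder object and the runtime is responsible for the XML on the wire. With the string interface, both sides must produce and parse the XML text themselves.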

The type chosen to represent the XML document in a service interface has many impacts. On the service side, it changes the types of the parameters (and return values) in the WSDL file, the types of the parameters in the Java interface and implementation class, and the handling of the received document, such as validation of the document and binding from XML to Java objects. The client of the service is also affected by the type chosen to represent the XML document in the interface and WSDL of the service. On the client side, the choice changes the client code, because the client needs to construct the parameters for the XML document being exchanged. It also affects the characteristics of the serialized document on the wire and, for documents being returned, how the received document is handled. Those are some of the issues developers need to consider when choosing a type in the service interface to represent an XML document being exchanged.

Another consideration is that some environments might plan to use management or monitoring tools for services, especially at the SOAP layer, and some types are more opaque to monitoring tools than others. For example, strings might be hard for tools to handle because their representation on the wire contains some extra information in the encoding. The wire-level view of a signature can therefore affect the monitoring capabilities. Interoperability can also be affected by the choice of exposed parameter types, so this is a concern as well.
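One plausible reading of the "extra information" in a string's wire representation is markup escaping: when an XML document travels as a string value, its angle brackets and ampersands have to be escaped so that they are not mistaken for SOAP markup. The sketch below illustrates that assumption only. The class name WireEscaping and the element name poXml are hypothetical, and the method mimics what an XML serializer does for element text rather than calling any JAX-RPC API.

```java
class WireEscaping {
    // Escapes the characters that would otherwise be read as markup
    // when a string is written as the text content of an element.
    static String escapeXmlText(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String po = "<PurchaseOrder><id>PO-1</id></PurchaseOrder>";
        // Prints the document as it would look inside a string-typed element.
        System.out.println("<poXml>" + escapeXmlText(po) + "</poXml>");
    }
}
```

A tool inspecting the message sees a single text node full of entity references instead of a PurchaseOrder element tree, which is why a string-typed parameter can be harder to monitor than a schema-defined one.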

This solution is applicable in the following circumstances:

Solution

When designing a web service interaction where a client and service exchange an XML document, developers need to choose a type to represent the XML document in the service interface. The type chosen to represent the XML document in a service interface has many impacts.

This solution briefly summarizes and compares the different strategies, and then offers some guidelines to choose a strategy. The focus will be on the interaction layer and not on the business processing of the XML document. Each of these strategies is defined in detail in other related entries in the catalog, so it is highly recommended that you check out each of these related entries and their applications. The following strategies along with example applications are covered:

Some other strategies might be included later, such as using base64Encoded or raw bytes in the SOAP body, and using no data binding.

Briefly some of the participants in this problem and solution are:
The choice of type to represent the XML documents being exchanged between a client and a web service can affect all of those participants. For example, the type chosen needs to be included in the WSDL file of the service, so any Java classes generated for the service implementation are affected, and this can affect the existing object model of an application that is adding a web service layer. The client applications accessing the service are affected because they need to generate Java classes for the types in the service WSDL file and need to transform their own client data model to the service types. The choice of type can affect the interoperability of the service. It can also affect whether the JAX-RPC runtime can do the binding to Java objects or whether your application code needs to do this binding instead.
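When the chosen type leaves binding to the application, the service code has to turn the received XML into its own objects. The following is a minimal sketch of that step using only the standard JAXP DOM API. The class names OrderSummary and ManualBinding and the element names orderId and quantity are hypothetical, chosen for illustration, and the document is assumed to arrive as a string with unprefixed element names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical application object that the document is bound to.
class OrderSummary {
    final String orderId;
    final int quantity;

    OrderSummary(String orderId, int quantity) {
        this.orderId = orderId;
        this.quantity = quantity;
    }
}

class ManualBinding {
    // Parses the document text and copies the fields of interest into
    // an application object. Any parsing problem is reported as an
    // IllegalArgumentException.
    static OrderSummary bind(String documentXml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc =
                builder.parse(new InputSource(new StringReader(documentXml)));
            Element root = doc.getDocumentElement();
            String orderId = textOf(root, "orderId");
            int quantity = Integer.parseInt(textOf(root, "quantity"));
            return new OrderSummary(orderId, quantity);
        } catch (IllegalArgumentException e) {
            throw e;
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot bind document", e);
        }
    }

    // Returns the trimmed text of the first descendant with the given name.
    private static String textOf(Element parent, String name) {
        NodeList nodes = parent.getElementsByTagName(name);
        if (nodes.getLength() == 0) {
            throw new IllegalArgumentException("Missing element: " + name);
        }
        return nodes.item(0).getTextContent().trim();
    }
}
```

This is the kind of code that a schema-defined type strategy lets the runtime generate and run on your behalf, and that the string, xsd:any, and xsd:anyType strategies leave to the application.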

Summary and Recommendations
When choosing a type for the service interface, we recommend the following:
Table 1 summarizes the various strategies for choosing a type to represent the XML documents being exchanged with a document-oriented web service. Other entries in the Java BluePrints Solutions Catalog cover these topics in more detail.

Strategy: Using schema-defined types (doc-literal style)

Advantages:
  • Interoperability.
  • Documents can be validated against the schema if XML documents are used.
  • Better performance than encoded formatting styles.
  • The service interface clearly describes the types of documents expected. This makes the WSDL file easier for clients to understand.

Disadvantages:
  • Cannot use custom bindings or binding frameworks directly.

Strategy: Using string

Advantages:
  • Simple; the same as writing a "hello world" application.

Disadvantages:
  • Schema validation offered by the runtime cannot be used, and errors in the document are not picked up until the service has read the document into memory and attempted to process it.
  • The service interface is not descriptive because the document type is just a general string.

Strategy: Using xsd:any

Advantages:
  • The mapping of xsd:any has been standardized to map to SOAPElement with JAX-RPC 1.1.
  • Even though an element is named in the WSDL (for example, BusinessDocumentRequest) and the business document passed appears inside this element on the wire, the web service and its client can still work with complete XML documents and maintain schema integrity without having to include the document content under this element (this is not the case with the anyType strategy).

Disadvantages:
  • Requires developers to work at the lower levels of XML because they have to create and manipulate SOAPElement objects.
  • There is no cohesiveness between the WSDL and the documents because the schemas defining the documents are not referenced directly.
  • Schemas need to be negotiated out of band. Both service provider and consumer need a priori knowledge of the contents of the payload because the WSDL file does not describe the schema of the expected documents.

Strategy: Using xsd:anyType

Advantages:
  • Allows the action and the payload to be passed together. This can be useful when creating a polymorphic processor that accepts multiple document types with the same actions, for example, a single service that performs a search action on a purchase order and an invoice, both of which conform to different schemas.

Disadvantages:
  • The JAX-RPC specification does not define a standard Java mapping for xsd:anyType, so not all implementations behave like the Java WSDP and map this type to a SOAPElement. In fact, support for xsd:anyType is optional for an implementation.
  • Because anyType actually defines the data type for a named element in the WSDL, the business document being passed in the SOAP body is located inside this element identified in the WSDL; for example, the PurchaseOrder is inside the BusinessDocumentRequest element. This means that the document being passed must either:
    • Have its root element identified in the WSDL
    • Be constructed appropriately or wrapped in the element on the fly

Strategy: Using attachments

Advantages:
  • Useful for documents that might conform to schemas expressed in languages not supported by the web service endpoint or that are prohibited from being present within a SOAP message infoset (such as the Document Type Declaration <!DOCTYPE> for a DTD-based schema).
  • Useful for large documents (can be compressed and decompressed).
  • Additional facilities can be built on the attachments using handlers, for example, a client-side handler that adds custom password-based encryption to the attachment and then compresses it using the GZIP algorithm. The corresponding server-side handler decompresses the content from the attachment and decrypts it using the same password-based encryption algorithm.

Disadvantages:
  • Interoperability: Not all vendors support attachments. For example, .NET uses its own mechanism through an extensions pack and cannot handle MIME attachments based on the WS-I attachment profile.

Table 1: Comparison of Various Strategies
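With the string strategy, the comparison notes that the runtime cannot validate the document, so the service has to do it after receiving the string. The sketch below shows one way application code might perform that check with the standard javax.xml.validation API (part of JAXP in later JDK releases than the J2EE 1.4 baseline). The class name StringDocumentValidator and the small inline schema are hypothetical stand-ins for a real, pre-negotiated purchase order schema.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class StringDocumentValidator {
    // Hypothetical minimal schema used only for this illustration.
    static final String PO_SCHEMA =
          "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
        + "<xsd:element name='PurchaseOrder'>"
        + "<xsd:complexType><xsd:sequence>"
        + "<xsd:element name='orderId' type='xsd:string'/>"
        + "<xsd:element name='quantity' type='xsd:positiveInteger'/>"
        + "</xsd:sequence></xsd:complexType>"
        + "</xsd:element>"
        + "</xsd:schema>";

    // Returns true if the document text is well formed and conforms to
    // PO_SCHEMA, false if the parser or validator rejects it.
    static boolean isValid(String documentXml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema =
                factory.newSchema(new StreamSource(new StringReader(PO_SCHEMA)));
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(new StringReader(documentXml)));
            return true;
        } catch (SAXException e) {
            // Raised for malformed XML as well as for schema violations.
            return false;
        } catch (IOException e) {
            throw new IllegalStateException("Unexpected I/O failure", e);
        }
    }
}
```

A service method that takes a string parameter could call isValid before any business processing, which moves error detection earlier but still happens only after the whole document has been received and read into memory.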

In addition to the strategies listed in Table 1, other variations and strategies could be used in some circumstances. For example, you might not want to use the binding mechanisms of JAX-RPC and might choose not to use any data binding. This is useful if you want to integrate with an API like JAXB and have it do the binding. The downside of such an approach is that the behavior is specific to the runtime. For example, the Java Web Services Developer Pack allows a -nodatabinding switch; other implementations might not support such capabilities. Another variation is to use base64 encoding. This might be useful when the XML contains characters or declarations that are not supported either by the SOAP message infoset or by the runtime implementation, such as DTD declarations and locale-specific character encodings. Note that this approach has interoperability issues. Although these and other options are available, it is generally best to follow the guidelines in the Summary and Recommendations section.
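The base64 variation can be illustrated with a small round trip. This is a sketch under stated assumptions, not a JAX-RPC example: it uses java.util.Base64, which is part of Java 8 and later rather than the J2EE 1.4 platform, and the class name Base64Document is hypothetical. It shows only the encoding step that would happen before the text is placed in a message and the decoding step on the receiving side.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class Base64Document {
    // Encodes the document text so that no XML markup characters remain.
    static String encode(String documentXml) {
        byte[] bytes = documentXml.getBytes(StandardCharsets.UTF_8);
        return Base64.getEncoder().encodeToString(bytes);
    }

    // Restores the original document text on the receiving side.
    static String decode(String base64Text) {
        byte[] bytes = Base64.getDecoder().decode(base64Text);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // A document with a DTD declaration, which cannot appear as-is
        // inside a SOAP message.
        String doc = "<!DOCTYPE PurchaseOrder SYSTEM \"po.dtd\">"
                   + "<PurchaseOrder><orderId>PO-1</orderId></PurchaseOrder>";
        String encoded = encode(doc);
        System.out.println(encoded);
        System.out.println(decode(encoded).equals(doc));
    }
}
```

The encoded form is opaque to both the runtime and any intermediaries, so the receiver must know out of band that the payload is base64-encoded XML, which is one source of the interoperability issues mentioned above.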

References

For more information about this topic, including detailed solutions and example applications for the various strategies discussed in this document, refer to the following:

© Sun Microsystems 2005. All of the material in The Java BluePrints Solutions Catalog is copyright-protected and may not be published in other works without express written permission from Sun Microsystems.