XQJ Part IX - Creating XDM instances

October 03, 2007 Data & AI

In the previous posts of the XQJ series, we learned how to handle XDM instances that result from query execution: iterating through sequences and accessing the items in them. But what if we want to create an XDM instance without executing a query? Can we?

XQJ offers functionality to create both XQSequence and XQItem objects, not as the result of a query execution but as standalone XDM instances. This functionality is offered through the XQDataFactory interface. An XQDataFactory creates the following types of objects:

  • XQItem
  • XQSequence
  • XQItemType
  • XQSequenceType

Every XQConnection must implement the XQDataFactory interface. In XQJ 1.0, connections are the only concrete XQDataFactory implementations; future versions might introduce different mechanisms to get access to an XQDataFactory.

Creating types

In the Typing post in this series, we have introduced the XQItemType and XQSequenceType interfaces. We have also learned how these objects are used to describe the static type of a query result and external variables. How do we create such type objects in our application?

Remember that XQJ defines a dozen XQITEMKIND_XXX constants. For each of those there is a matching createXXXType method:

  • createAtomicType
  • createAttributeType
  • createCommentType
  • createDocumentElementType
  • createDocumentType
  • createElementType
  • createItemType
  • createNodeType
  • createProcessingInstructionType
  • createSchemaAttributeType
  • createSchemaElementType
  • createTextType

Let's discuss some of the most commonly used methods in the above list.

The createAtomicType() method creates an XQItemType object representing an XQuery atomic type. It accepts a single argument, an integer which is one of the predefined XQBASETYPE_XXX constants.
The next example creates three XQItemType instances representing xs:integer, xs:string and xs:decimal,

[cc lang="java"]
...
XQItemType xsinteger = xqc.createAtomicType(XQItemType.XQBASETYPE_INTEGER);
XQItemType xsstring = xqc.createAtomicType(XQItemType.XQBASETYPE_STRING);
XQItemType xsdecimal = xqc.createAtomicType(XQItemType.XQBASETYPE_DECIMAL);
...
[/cc]

Remember that every XQConnection is an XQDataFactory; in the example we used our XQConnection, xqc, to create these XQItemType instances. However, the XQItemType objects are completely independent of the connection.

The above example shows how to create XQItemType objects representing built-in atomic XML Schema types; there is a second flavor of createAtomicType() for user-defined atomic types. Assume a user-defined atomic type hatsize, derived from xs:integer, in the http://www.hatsizes.com schema,

[cc lang="java"]
...
XQItemType hatsize;
hatsize = xqc.createAtomicType(
    XQItemType.XQBASETYPE_INTEGER,
    new QName("http://www.hatsizes.com", "hatsize"),
    new URI("http://www.hatsizes.com"));
...
[/cc]

Besides atomic types, element types are also frequently used. In the next example we create an XQItemType representing element(person),

[cc lang="java"]
...
XQItemType type;
type = xqc.createElementType(
    new QName("person"),
    XQItemType.XQBASETYPE_ANYTYPE);
...
[/cc]

The first argument to createElementType() is a QName. Where the previous example creates a person element type in no namespace, the next example creates an element type person in the namespace http://www.example.com. The second argument can be any of the predefined XQBASETYPE_XXX constants; besides xs:anyType, xs:untyped is also frequently used,

[cc lang="java"]
...
XQItemType type;
// QName takes the namespace URI first, then the local name
type = xqc.createElementType(
    new QName("http://www.example.com", "person"),
    XQItemType.XQBASETYPE_UNTYPED);
...
[/cc]

The first argument can also be null, which is treated as a wildcard. The following code snippet shows the creation of element(*, xs:untyped),

[cc lang="java"]
...
XQItemType type;
type = xqc.createElementType(
    null,
    XQItemType.XQBASETYPE_UNTYPED);
...
[/cc]

What about document-node() types? In the next example we create two XQItemType instances, the first representing any document and the second representing a well-formed untyped document, document-node(element(*, xs:untyped)),

[cc lang="java"]
...
XQItemType type1;
XQItemType type2;
type1 = xqc.createDocumentType();
type2 = xqc.createDocumentElementType(
    xqc.createElementType(
        null,
        XQItemType.XQBASETYPE_UNTYPED));
...
[/cc]

In addition to XQItemType objects, XQSequenceType objects can also be created. As explained in the Typing post, an XQSequenceType consists of

  • an XQItemType
  • the cardinality, which constrains the number of items; it is one of the OCC_XXX constants defined on XQSequenceType.

As such, creating an XQSequenceType is simple. The next example shows how to create an xs:string* sequence type,

[cc lang="java"]
...
XQItemType itemType;
XQSequenceType sequenceType;
itemType = xqc.createAtomicType(XQItemType.XQBASETYPE_STRING);
sequenceType = xqc.createSequenceType(
    itemType,
    XQSequenceType.OCC_ZERO_OR_MORE);
...
[/cc]
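To see the pairing of item type and occurrence in isolation, here is a minimal plain-Java sketch that runs without an XQJ driver. The class SequenceTypeSketch and its Occurrence enum are hypothetical names made up for this illustration, not XQJ classes; they only mimic what the OCC_XXX constants express.

```java
/** Illustration only: a stand-in for "item type + occurrence", not XQJ API. */
final class SequenceTypeSketch {

    /** Hypothetical counterpart of the OCC_XXX constants. */
    enum Occurrence {
        ZERO_OR_ONE("?", 0, 1),
        EXACTLY_ONE("", 1, 1),
        ZERO_OR_MORE("*", 0, Integer.MAX_VALUE),
        ONE_OR_MORE("+", 1, Integer.MAX_VALUE);

        final String indicator; // suffix used in XQuery sequence type syntax
        final int min;          // smallest number of items allowed
        final int max;          // largest number of items allowed

        Occurrence(String indicator, int min, int max) {
            this.indicator = indicator;
            this.min = min;
            this.max = max;
        }

        /** True if a sequence with itemCount items satisfies this cardinality. */
        boolean allows(int itemCount) {
            return itemCount >= min && itemCount <= max;
        }
    }

    /** Renders the sequence type the way XQuery writes it, e.g. xs:string*. */
    static String describe(String itemType, Occurrence occurrence) {
        return itemType + occurrence.indicator;
    }

    public static void main(String[] args) {
        System.out.println(describe("xs:string", Occurrence.ZERO_OR_MORE));
        System.out.println(Occurrence.ZERO_OR_MORE.allows(0));
        System.out.println(Occurrence.ONE_OR_MORE.allows(0));
    }
}
```

In this model an empty sequence satisfies xs:string* but not xs:string+, which is exactly the distinction the occurrence constant carries.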

Using types

So far so good, but why would one need to create all these types?

Assume we are iterating over the items of an XQSequence: if the item is a node, we retrieve it through the DOM; otherwise we get the atomic value as a String. This can be accomplished using the instanceOf() method, passing in an XQItemType object,

[cc lang="java"]
...
XQItemType nodeType = xqc.createNodeType();
XQSequence xqs = ...
...
while (xqs.next()) {
    if (xqs.instanceOf(nodeType)) {
        // the current item is a node: access it through the DOM
        org.w3c.dom.Node node = xqs.getNode();
        ...
    } else {
        // the current item is an atomic value: get its string representation
        String s = xqs.getAtomicValue();
        ...
    }
}
...
[/cc]

Some XQuery implementations support the Static Typing Feature as defined in XQuery. This requires implementations to detect and report type errors during the static analysis phase.
For expressions that depend on the context item, the application must specify the static type of the context item. Why? To perform static typing, the implementation has to know that type, and failing to provide it results in an error being reported during the static analysis phase.

As the static type of the context item is a static context component, the XQJ XQStaticContext interface allows it to be manipulated.
The next example shows how to set the static type of the initial context item to document-node(element(*, xs:untyped)),

[cc lang="java"]
...
XQItemType documentType;
documentType = xqc.createDocumentElementType(
    xqc.createElementType(
        null,
        XQItemType.XQBASETYPE_UNTYPED));
XQStaticContext xqsc = xqc.getStaticContext();
xqsc.setContextItemStaticType(documentType);
...
XQPreparedExpression xqp;
xqp = xqc.prepareExpression("//address", xqsc);
...
[/cc]

As a last use case of XQItemType, remember some of the examples from the previous post in this series, Binding external variables.
The bindXXX() methods defined on XQDynamicContext all have a third parameter, which allows overriding the default Java to XQuery data type mapping.

In the next example we bind a Java int to the external variable, but rather than using the default mapping to xs:int, we specify that it be mapped to xs:short,

[cc lang="java"]
...
XQItemType xsshort;
xsshort = xqc.createAtomicType(XQItemType.XQBASETYPE_SHORT);
XQPreparedExpression xqp;
xqp = xqc.prepareExpression(
    "declare variable $v as xs:short external; " +
    "$v + 1");
xqp.bindInt(new QName("v"), 22, xsshort);
...
[/cc]

Creating XDM instances

Having discussed the ability to create XQItemType and XQSequenceType instances, let's turn to XQItem and XQSequence instances, which XQDataFactory can also create.

There is basically nothing new under the sun. If you understand how binding to an XQDynamicContext works, as discussed in our previous post, you almost know how XQItem instances are created. For every bindXXX() method defined on XQDynamicContext, there is a corresponding createItemFromXXX() method.

Let's show a simple example, binding a java.math.BigDecimal to an external variable $d,

[cc lang="java"]
...
XQExpression xqe = ...
xqe.bindObject(new QName("d"), new BigDecimal("174"), null);
[/cc]

And creating an XQItem of type xs:decimal from the same java.math.BigDecimal,

[cc lang="java"]XQItem xqi = xqc.createItemFromObject(new BigDecimal("174"), null);[/cc]

Note that the XQItem objects created through XQDataFactory are independent of any connection.

Suppose you execute a query returning a single item, and subsequently close the connection but still require access to the XQItem. Closing the XQConnection invalidates the XQItem object resulting from the query execution. For this purpose XQDataFactory has an XQItem copy method: createItem() accepts a single XQItem argument and returns a (deep) copy of the specified item.
The following example shows how to keep a query result available after closing the XQSequence or XQConnection,

[cc lang="java"]
XQConnection xqc = ...
XQExpression xqe = xqc.createExpression();
XQSequence xqs;
xqs = xqe.executeQuery("(doc('book.xml')//paragraph)[1]");
xqs.next();
XQItem xqi = xqc.createItem(xqs.getItem());
xqc.close();
// although the connection is closed, xqi is still valid.
[/cc]

Suppose you have an XML document that needs to be queried multiple times, but you don't want to incur the XML parsing overhead each time it is queried. In the following example, two queries are executed and as such book.xml will be parsed twice,

[cc lang="java"]
...
XQExpression xqe = xqc.createExpression();
xqe.executeQuery("fn:doc('book.xml')//paragraph[contains(.,'XQuery')]");
xqe.executeQuery("fn:doc('book.xml')//paragraph[contains(.,'SQL')]");
...
[/cc]

Or suppose you receive a transient XML stream, for example in a servlet environment, and need to query the stream multiple times. Then one way or the other the data will need to be buffered in order to query it more than once.

How can we a) make sure an XML document is parsed only once and b) make a transient XML stream 'queryable' multiple times?

Suppose we have two XQPreparedExpression objects, xqp1 and xqp2. The next example first creates an XQItem representing the XML document, so that it is parsed only once. It then binds that item to the two different XQPreparedExpression objects,

[cc lang="java"]
...
InputStream input = ...
// in the final XQJ 1.0 API the arguments are: input stream,
// base URI (may be null), item type (null for the default)
XQItem doc = xqc.createItemFromDocument(input, null, null);
...
xqp1.bindItem(new QName("doc"), doc);
...
xqp2.bindItem(new QName("doc"), doc);
...
[/cc]

One of the disadvantages of such an approach, especially with large documents, is its scalability and memory consumption. For example, in the case of DataDirect XQuery, the streaming capabilities will not be of much use, as the complete XML document is instantiated in memory. We'll come back to the topic of processing large input documents in a future post of the XQJ series.
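The parse-once, query-many idea can also be tried without an XQJ driver. The sketch below is an analogy, not XQJ: it uses the JDK's DOM parser and XPath in place of XQItem and XQuery, and the class name ParseOnceSketch, its helper methods and the sample document are all made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

/** Illustration only: JDK DOM + XPath standing in for XQItem + XQuery. */
final class ParseOnceSketch {

    // Made-up sample data playing the role of book.xml.
    static final String SAMPLE =
        "<book>"
      + "<paragraph>XQuery basics</paragraph>"
      + "<paragraph>SQL and XQuery</paragraph>"
      + "<paragraph>Plain text</paragraph>"
      + "</book>";

    /** Consumes the (possibly transient) stream exactly once. */
    static Document parseOnce(InputStream in) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }

    /** Runs one XPath expression against the already parsed tree. */
    static int countMatches(Document doc, String xpathExpression) {
        try {
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(xpathExpression, doc, XPathConstants.NODESET);
            return hits.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate " + xpathExpression, e);
        }
    }

    public static void main(String[] args) {
        InputStream input = new ByteArrayInputStream(SAMPLE.getBytes(StandardCharsets.UTF_8));
        Document doc = parseOnce(input); // the stream is read here, once
        // both queries reuse the parsed tree; the stream is not touched again
        System.out.println(countMatches(doc, "//paragraph[contains(., 'XQuery')]"));
        System.out.println(countMatches(doc, "//paragraph[contains(., 'SQL')]"));
    }
}
```

Like the XQItem above, the parsed tree is held entirely in memory, so the same scalability caveat applies.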

Finally, XQDataFactory also allows creating XQSequence objects. There is a createSequence() copy operation, which takes a single XQSequence argument and returns a copy of it. Similar to the XQItem example above, it allows query results to outlive an XQConnection.

A second flavor of createSequence() accepts a java.util.Iterator, returning a sequence of items based on the objects returned by the Iterator. The objects are converted into XDM instances using the default object mapping defined in XQJ. For example, the following code snippet results in a sequence of xs:decimal instances,

[cc lang="java"]
...
// assume an ArrayList of BigDecimal objects
ArrayList<BigDecimal> list = ...
XQSequence s = xqc.createSequence(list.iterator());
...
[/cc]
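To make the default object mapping a bit more tangible, the following plain-Java sketch models it as a lookup table and walks an Iterator the way createSequence() conceptually does. It is not XQJ code: the class DefaultMappingSketch and its typesOf() method are hypothetical, and the table is a partial one based on our reading of the XQJ default mapping, so check the specification before relying on any row.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustration only: models the default object mapping, not XQJ API. */
final class DefaultMappingSketch {

    // Partial table (an assumption of this sketch): Java class -> default XQuery type.
    static final Map<Class<?>, String> DEFAULT_TYPES = new LinkedHashMap<>();
    static {
        DEFAULT_TYPES.put(BigDecimal.class, "xs:decimal");
        DEFAULT_TYPES.put(BigInteger.class, "xs:integer");
        DEFAULT_TYPES.put(Integer.class, "xs:int");
        DEFAULT_TYPES.put(Long.class, "xs:long");
        DEFAULT_TYPES.put(Short.class, "xs:short");
        DEFAULT_TYPES.put(Double.class, "xs:double");
        DEFAULT_TYPES.put(Float.class, "xs:float");
        DEFAULT_TYPES.put(Boolean.class, "xs:boolean");
        DEFAULT_TYPES.put(String.class, "xs:string");
    }

    /** Returns, per object delivered by the iterator, the type it would map to. */
    static List<String> typesOf(Iterator<?> objects) {
        List<String> types = new ArrayList<>();
        while (objects.hasNext()) {
            Object value = objects.next();
            String type = (value == null) ? null : DEFAULT_TYPES.get(value.getClass());
            if (type == null) {
                throw new IllegalArgumentException("no default mapping in this sketch for " + value);
            }
            types.add(type);
        }
        return types;
    }

    public static void main(String[] args) {
        List<BigDecimal> list = Arrays.asList(new BigDecimal("174"), new BigDecimal("2.5"));
        System.out.println(typesOf(list.iterator()));
    }
}
```

An iterator over BigDecimal objects yields only xs:decimal entries, matching the snippet above; an iterator over mixed objects would yield a sequence of mixed atomic types.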

Pipelines are the next topic we will discuss. How can one create a pipeline of XQuery queries, or pipeline an XQuery query with an XSLT transformation? Watch out for the next post.


XQJ

Marc Van Cappellen