Generating non-XML files from XQuery is easier than you think!

October 20, 2008 Data & AI

There is an ongoing discussion thread on a popular XQuery discussion list. The topic is about how to generate a non-XML result (CSV in the specific case discussed there) from an XQuery. Seems easy, but there are a lot of little issues in doing so, especially with the handling of whitespaces.
Ivan, who is Program Manager for the XML products here at DataDirect, has chimed in describing how easily DataDirect XQuery can leverage XML Converters to make XQuery generation of non-XML results easy. The XML document that needs to be transformed into CSV looks like this:[cc lang="xquery"]

A 33 1,5

A 33 0,5

B 22 0,5

[/cc]...the solution proposed by Ivan is indeed quite simple and elegant:[cc lang="xquery"]declare option ddtek:serialize "method=CSV";[/cc]...looks pretty easy, doesn't it? What the "method=CSV" option is telling DataDirect XQuery is to use the CSV XML Converter to serialize data; XML Converters transform a ...... structure into CSV; there are of course a number of options available to tune that behavior (like, for example, listing the field names in the first row), but the underlying mechanism is quite simple. XML Converters use the same approach for other conversions, and the same mechanism is available from XQuery: make XQuery generate an XML structure consistent with what XML Converters expect (and you can see that asking XML Converters to generate the XML schema for a specific conversion through tools or API), and then just specify the proper conversion method (and options) in the ddtek:serialize option. We have seen the same concept in other posts on this blog, like this one related to EDI, for example.
In the above mention discussion thread, the original poster goes on asking how to achieve something a bit more complicated: "I am not just copying row-type elements from the input file, but rather summing up the values found in the "quantity" children of the entries, after grouping them by the content of "property1" and
"property2" (see the first message in this thread)."
Easy enough; just reusing the example posted at the beginning of the email thread, you can do:[cc lang="xquery"] declare option ddtek:serialize "method=CSV:sep=;"; { let $entries := /list/entry for $property1 in distinct-values($entries/property1), $property2 in distinct-values($entries[property1 = $property1]/property2) let $selected_entries := $entries[property1 = $property1 and property2 = $property2] order by $property1, $property2 return

{$property1}, {$property2}, { replace(xs:string( sum(for $quantity in $selected_entries/quantity return xs:decimal(replace($quantity, ",", ".")))), ".", ",") }

} [/cc] The result, is what you would expect:
A;33;2
B;22;0,5

As already noted on this Blog, XQuery is a great language for querying, transforming, manipulating XML and other data sources (RDBMS, EDI, a combination of all of them); when used in conjunction with XML Converters, you are able to bridge between the XML and non-XML world very easily, quickly creating very powerful solutions.

Minollo