XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition (323 page)

) and then for tertiary differences (a versus ä). So you will usually want the sort algorithm to take all these differences into account.

Having said this, it's worth noting that XPath doesn't actually do sorting. If you want to sort data, you need XSLT or XQuery. XPath provides many functions for comparing strings, including comparing whether one string is less than another, but it can't actually sort a collection of strings into order.
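
As a rough illustration of this division of labor, the following XSLT 2.0 sketch uses the XPath compare() function to compare two strings and leaves the actual ordering to xsl:sort. The input element names (names, name) are hypothetical. The collation shown is the Unicode codepoint collation, which every processor must support; a real application would more likely want a language-aware collation, whose URI is implementation-defined.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- The Unicode codepoint collation defined for the XPath 2.0 function library -->
  <xsl:variable name="codepoint"
      select="'http://www.w3.org/2005/xpath-functions/collation/codepoint'"/>

  <!-- Assumed input (hypothetical): <names><name>...</name>...</names> -->
  <xsl:template match="/names">
    <result>
      <!-- XPath can compare two strings under a named collation... -->
      <comparison>
        <xsl:value-of select="compare('a', 'ä', $codepoint)"/>
      </comparison>
      <!-- ...but putting a whole sequence into order is the job of xsl:sort -->
      <sorted>
        <xsl:for-each select="name">
          <xsl:sort select="." collation="{$codepoint}"/>
          <xsl:copy-of select="."/>
        </xsl:for-each>
      </sorted>
    </result>
  </xsl:template>

</xsl:stylesheet>
```

Under the codepoint collation the comparison yields -1, because a has a lower codepoint than ä. A collation that ignores accents could instead treat the two strings as equal, which is why the choice of collation matters for the sorted output.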

It's also interesting to note that although XPath defines the set of collations as part of the static context, there's nothing in the XPath language definition that uses this information at compile time. Collations are used only at runtime, and requesting a collation that doesn't exist is defined as a dynamic error rather than a static error. The reason collations are in the static context is a carryover from XQuery. XQuery defines sorting of sequences using an order by clause in which the collation must be known at compile time. The reason for this restriction is that XQuery systems running on large databases need to make compile-time decisions about which indexes can be used to access the data, and this can only be done by comparing the sort order requested in the query against the collation that was used when constructing the index.

Base URI

When an XPath expression calls the doc() function to load a document, the argument is a URI identifying the document. This may either be an absolute URI (for example, http://www.w3.org/TR/doc.xml) or a relative URI such as index.xml. If it is a relative URI, the question arises, what is it relative to? And the answer is: it is relative to the base URI defined in the static context.
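
The resolution rule can be tried out with the XPath 2.0 resolve-uri() function, which turns a relative URI into an absolute one without fetching anything. This is a minimal XSLT 2.0 sketch. The template name main is an arbitrary choice, and the second result depends on where the stylesheet itself is stored.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- "main" is a hypothetical entry point for processors that can start
       at a named template -->
  <xsl:template name="main">
    <resolved>
      <!-- Explicit base URI: the result is http://www.w3.org/TR/index.xml -->
      <explicit>
        <xsl:value-of
            select="resolve-uri('index.xml', 'http://www.w3.org/TR/doc.xml')"/>
      </explicit>
      <!-- No second argument: the base URI from the static context is used,
           which in XSLT is normally the location of this stylesheet module -->
      <implicit>
        <xsl:value-of select="resolve-uri('index.xml')"/>
      </implicit>
    </resolved>
  </xsl:template>

</xsl:stylesheet>
```

If the stylesheet is compiled from a source with no known location, the static context has no base URI, and the single-argument call can be expected to fail with a dynamic error.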

Where XPath expressions are contained within an XML document, as happens with XSLT, it's fairly obvious what the base URI should be: it's essentially the URI of the document containing the XPath expression. (This isn't a completely clear-cut concept, because a document might be reachable by more than one URI. The thinking comes from the way URLs are used in a Web browser, where any relative URL in an HTML page is interpreted relative to the URL that was used to fetch the page that contains it. Generalizing this model has proved a fairly tortuous business.)

Where XPath expressions arise in other contexts, for example, if they are generated on the fly within a C++ program, it's far less clear what the base URI should be. So XPath delegates the problem: the base URI is whatever the host language says it is. The context dependency is made explicit by identifying the base URI as part of the static context, and as far as XPath is concerned, the problem disappears.

It's again worth noting that there is nothing in the XPath language semantics that causes the base URI to have any effect at compile time. It is used only at runtime, and then only when certain functions are used (including not only doc() but also collection() and static-base-uri()). The reason it's defined as part of the static context is the expectation that it will be a property of the document containing the text of the XPath expression.
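
In XSLT, the static base URI of an expression comes from the stylesheet element that contains it, so an xml:base attribute on that element changes what static-base-uri() reports. The sketch below assumes a hypothetical location, http://example.com/data/, and the arbitrary template name main. No document is actually fetched.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template name="main">
    <base-uris>
      <!-- Base URI inherited from the stylesheet module itself -->
      <default>
        <xsl:value-of select="static-base-uri()"/>
      </default>
      <!-- xml:base (hypothetical value) overrides the base URI for the
           expression on this instruction only -->
      <overridden>
        <xsl:value-of xml:base="http://example.com/data/"
            select="static-base-uri()"/>
      </overridden>
    </base-uris>
  </xsl:template>

</xsl:stylesheet>
```

With the override in place, a call such as doc('invoice.xml') on the same instruction would be resolved against http://example.com/data/ rather than against the location of the stylesheet.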

Statically Known Documents and Collections

Later in the chapter (see page 567) we'll be looking at how the available documents and collections form part of the dynamic context of an XPath expression. Normally, one might expect that nothing is known at compile time about the documents that the query might access when the time comes to execute it. However, this isn't always the case, especially in a database environment. This information in the static context acknowledges that in some environments, an XPath expression might be compiled specifically to execute against a particular source document or collection of source documents and that the system might be able to use this knowledge at the time it compiles the expression.

This is especially the case in a system that does static type checking. One of the difficulties with static type checking arises when the XPath expression contains a construct such as:

doc("invoice.xml")/invoice/line-item[value > 10.00]

To perform strict static type checking on this expression, the system needs to know what the data type of value is. If value were a date, for example, then the expression would be in error (you can't compare a date with a number), and the type checker would have to report this. But how can we know what the type of value is, if we don't know in advance what type of document invoice.xml is?
