XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition

Apart from this, xsi:nil behaves in XSLT just like any other attribute.

You do need to be a little careful if you want to put your stylesheets through a schema processor (which you might do, for example, if you store your stylesheets in an XML database). The schema processor attaches a special meaning to attributes such as xsi:nil, xsi:type, and xsi:schemaLocation, even though XSLT does not. It's therefore best to avoid using these attributes directly on literal result elements. Two possible ways round this problem are:

  • Generating these attributes using the <xsl:attribute> instruction instead.
  • Using a namespace alias for the xsi namespace: see the description of the <xsl:namespace-alias> declaration in Chapter 6 (page 394).
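
Both workarounds can be sketched in a single stylesheet. This is a minimal illustration rather than an example from the text: the element names price and discount, the alias prefix out-xsi, and its namespace URI urn:example:xsi-alias are all hypothetical.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:out-xsi="urn:example:xsi-alias">

  <!-- Attributes written in the (hypothetical) out-xsi namespace on literal
       result elements are emitted in the real xsi namespace. -->
  <xsl:namespace-alias stylesheet-prefix="out-xsi" result-prefix="xsi"/>

  <!-- Workaround 1: generate xsi:nil with an instruction, so no attribute
       named xsi:nil appears on any element of the stylesheet itself. -->
  <xsl:template match="price[not(node())]">
    <price>
      <xsl:attribute name="xsi:nil" select="'true'"/>
    </price>
  </xsl:template>

  <!-- Workaround 2: write the attribute under the alias prefix. -->
  <xsl:template match="discount[not(node())]">
    <discount out-xsi:nil="true"/>
  </xsl:template>

</xsl:stylesheet>
```

In both templates a schema processor reading the stylesheet sees no attribute in the xsi namespace, while the result document should still carry xsi:nil="true".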

Summary

Firstly, a reminder of something we said at the beginning of the chapter, and haven't touched on since: schema processing in XSLT 2.0 is optional. Some XSLT 2.0 processors won't support schema processing at all, and even if you are using a processor that is schema-aware, you can still use it to transform source documents that have no schema into result documents that have no schema.

We started this chapter with a very quick tour of the essentials of XML Schema, describing the main concepts of element and attribute declarations and simple and complex types, and discussing the role that they play in XSLT processing.

There are two main roles for schemas in XSLT, which are strongly related. Firstly, XML Schema provides the type system for XSLT and XPath, and as such, you can define the types of variables, functions, and templates in terms of types that are either built into XML Schema, or defined as user-defined types in a specific schema.

Secondly, you can use an XML Schema to validate your source documents, your result documents, or intermediate working data. This not only checks that your data is as you expected it, which helps debugging, but also annotates the nodes in the data model, which can be used to steer the way the nodes are processed, for example by defining template rules that match particular types of node.

The mechanism that binds a stylesheet to one or more schemas is the <xsl:import-schema> declaration, and we looked in some detail at the way this works.
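
As a reminder of how these pieces fit together, here is a minimal sketch. It assumes a schema-aware processor, and the namespace urn:example:purchase-order, the file purchase-order.xsd, and the type po:AddressType are hypothetical names invented for illustration.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:po="urn:example:purchase-order"
    exclude-result-prefixes="xs po">

  <!-- Bind the stylesheet to a schema (namespace and location are assumed). -->
  <xsl:import-schema namespace="urn:example:purchase-order"
                     schema-location="purchase-order.xsd"/>

  <!-- A variable declared in terms of a built-in schema type. -->
  <xsl:variable name="credit-limit" as="xs:decimal" select="1000.00"/>

  <!-- Validate a copy of the document element; validation annotates the
       copied nodes with the types assigned by the schema. -->
  <xsl:template match="/">
    <xsl:variable name="typed-input" as="element()">
      <xsl:copy-of select="*" validation="strict"/>
    </xsl:variable>
    <xsl:apply-templates select="$typed-input"/>
  </xsl:template>

  <!-- A template rule that matches by type annotation rather than by name. -->
  <xsl:template match="element(*, po:AddressType)">
    <address>
      <xsl:copy-of select="node()"/>
    </address>
  </xsl:template>

</xsl:stylesheet>
```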

In the next chapter we will look more closely at the way types are used in XSLT and XPath processing, including a survey of the built-in types that are available whether or not you use a schema.

Chapter 5

Types

This chapter looks in some detail at the XPath type system; that is, the types of the values that can be manipulated by XPath expressions and XSLT instructions.

XPath is an expression language. Every expression takes one or more values as its inputs, and produces a value as its output. The purpose of this chapter is to explain exactly what these values can be.

Chapter 2 presented the XDM tree model with its seven node kinds—that's part of the picture, because XPath expressions will often be handling nodes in a tree. The other half of the picture is concerned with atomic values (strings, numbers, booleans, and the like), and it's these values that we'll be studying in this chapter.

One of the things an expression language tries to achieve is that wherever you can use a value, you can replace it with an expression that is evaluated to produce that value. So if 2 + 2 is a valid expression, then (6 - 4) + (1 + 1) should also be a valid expression. This property is called composability: expressions can be used anywhere that values are permitted. One of the important features that make a language composable is that the possible results of an expression are the same as the possible inputs. This feature is called closure: every expression produces a result that is in the same space of possible values as the space from which the inputs are drawn.
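
Composability can be seen directly in a small stylesheet. The sketch below is illustrative only; it evaluates the same sum three ways, each time replacing operands with expressions that produce the same values.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Two literal operands. -->
    <xsl:value-of select="2 + 2"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Each operand replaced by an expression that evaluates to 2. -->
    <xsl:value-of select="(6 - 4) + (1 + 1)"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Operands can equally be function calls over sequences. -->
    <xsl:value-of select="sum((1, 1)) + count((1, 2))"/>
  </xsl:template>
</xsl:stylesheet>
```

Each of the three instructions writes 4, because every subexpression yields a value from the same space that the + operator accepts as input.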

The role of the data model is to describe this space of possible values, and the role of the type system is to define the rules for manipulating these values.

What Is a Type System?

Let's make sure that when we talk about a type system, we're talking the same language.

Every programming language has some kind of type system. A language manipulates values, and the values are of different types. At the simple level, they might be integers, booleans, and strings. Then the language might support various kinds of composite types; for example, arrays or records or lists. Most modern languages also allow users to define their own types, on top of the basic types provided “out of the box”.

So, types are used to classify the values that can be manipulated by expressions in the language, and the type system defines the basic types provided by the language as well as the facilities for defining new types by combining and refining existing types.

A type serves two main purposes. Firstly, it defines a set of permissible values. For example, if you say that a function expects a positive integer as its first argument, then the phrase “positive integer” tells you what the valid values for the first argument are.
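
In XSLT 2.0 this kind of constraint is written with the as attribute. The sketch below is an assumption-laden illustration: the function f:repeat and its namespace urn:example:functions are hypothetical, and it declares the argument as xs:integer rather than a narrower positive-integer type, because xs:integer should be available even on a processor without schema support.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:functions"
    exclude-result-prefixes="xs f">
  <xsl:output method="text"/>

  <!-- Hypothetical function: the as attributes state which values
       each argument, and the result, may take. -->
  <xsl:function name="f:repeat" as="xs:string">
    <xsl:param name="text" as="xs:string"/>
    <xsl:param name="count" as="xs:integer"/>
    <xsl:sequence
        select="string-join(for $i in 1 to $count return $text, '')"/>
  </xsl:function>

  <xsl:template match="/">
    <!-- Permitted: the second argument is an integer. -->
    <xsl:value-of select="f:repeat('ab', 3)"/>
    <!-- A call such as f:repeat('ab', 'three') would be rejected with a
         type error, because 'three' is not in the set of permitted values. -->
  </xsl:template>
</xsl:stylesheet>
```

The call shown writes ababab.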

Secondly, a type defines a set of possible operations. Integers can be added, lists can be concatenated, booleans can be combined using the operators and, or, and not.

Not only does the type tell you whether a particular operation is permitted on a value of that type, it determines how that operation will be performed. So integers, strings, dates, and high school grades can all be sorted into order, but the way they are sorted depends on their type. Operations that are performed in different ways depending on the type of their operands are called polymorphic operations (from Greek words meaning many shapes).
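
Sorting shows this dependence on type clearly. The following sketch, with made-up data, sorts the same three values twice: once with a string sort key and once with an integer sort key.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Sort key is a string: items are compared using a collation. -->
    <xsl:variable name="as-strings" as="xs:string*">
      <xsl:perform-sort select="('10', '9', '100')">
        <xsl:sort select="."/>
      </xsl:perform-sort>
    </xsl:variable>

    <!-- Sort key is an integer: items are compared numerically. -->
    <xsl:variable name="as-integers" as="xs:string*">
      <xsl:perform-sort select="('10', '9', '100')">
        <xsl:sort select="xs:integer(.)"/>
      </xsl:perform-sort>
    </xsl:variable>

    <xsl:value-of select="$as-strings" separator=" "/>
    <xsl:text>&#10;</xsl:text>
    <xsl:value-of select="$as-integers" separator=" "/>
  </xsl:template>
</xsl:stylesheet>
```

The integer sort yields 9 10 100. The string sort depends on the default collation, but would typically yield 10 100 9.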

Types are useful in programming languages for a number of reasons:

  • Types allow errors to be detected, including programming logic errors and data errors. Because a type defines a set of permissible values, the system can give you an error message when you try to use a value that is not permissible. And because a type defines a set of allowed operations, the system can also give you an error message if you try to apply an operation to the wrong kind of value.
  • Types allow polymorphic operations to be defined. At a simple level, this allows A < B to mean different things depending on whether A and B are numbers or dates or strings. At a more sophisticated level, it allows the kind of inheritance and method overriding which is such a powerful tool in object-oriented programming.
  • Types allow optimization. To make expressions in a language such as XPath run as fast as possible, the system does as much work as it can in advance, using information that is available at compile time from analysis of the expression itself and its context. A lot of the reasoning that can be done at this stage is based on analysis of the types of values that the expression will process. For example, XPath has a very powerful = operator, in which the operands can not only be any type of value (such as integers or strings) but can also be sequences. Handling the general case, where both operands are arbitrary sequences containing items of mixed types, can be very expensive. In most cases the operands are much simpler; for example, two integers or two strings. If the system can work out in advance that the operands will be simple (and it often can), then it can generate much more efficient code and save a lot of work at runtime.
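
A few XPath 2.0 comparisons make these points concrete. The sketch below uses invented values. The value comparison operator lt behaves differently for numbers, strings, and dates, and the general = operator accepts whole sequences.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Numeric comparison: false. -->
    <xsl:value-of select="10 lt 9"/>
    <xsl:text>&#10;</xsl:text>
    <!-- String comparison under the default collation: typically true. -->
    <xsl:value-of select="'10' lt '9'"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Date comparison, chronological: false. -->
    <xsl:value-of
        select="xs:date('2004-10-01') lt xs:date('2004-09-01')"/>
    <xsl:text>&#10;</xsl:text>
    <!-- General comparison over sequences: true if any pair of items
         is equal, so this is true because 3 appears on both sides. -->
    <xsl:value-of select="(1, 2, 3) = (3, 4)"/>
    <xsl:text>&#10;</xsl:text>
    <!-- No pair of items is equal: false. -->
    <xsl:value-of select="(1, 2) = (3, 4)"/>
    <!-- Mixing types, as in '10' lt 9, is a type error; a processor may
         report it at compile time, which is the error detection that
         types make possible. -->
  </xsl:template>
</xsl:stylesheet>
```

When the operands of = are known in advance to be single integers, as in 1 = 1, a processor can in principle skip the pairwise sequence comparison altogether.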
