XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition (671 page)

Summary

This chapter provided a rather technical definition of the regular expression syntax available in the XPath functions matches(), replace(), and tokenize(), and in the XSLT <xsl:analyze-string> instruction.

Having given this very detailed definition of the regex grammar, it's worth including a warning that some processors may cut corners by exposing whatever regex dialect is supported by their existing libraries. Microsoft's XML Schema processor, for example, uses the .NET regex dialect rather than the one defined by W3C.

Caveat emptor!

Chapter 15

Serialization

Serialization in an XSLT context means the process of taking a result tree (the output of a transformation) and converting it into lexical XML, usually as a file in filestore. XSLT also allows serialization into other formats, including HTML and text files.

As mentioned in Chapter 2, although serialization is not part of the core function of an XSLT processor, the language provides constructs such as <xsl:output> that enable you to control the process from within a stylesheet. Many products also allow you to invoke the serializer as a separate component. With XSLT 2.0, the specification of serialization has been moved into a separate W3C Recommendation, to allow reuse of the facilities from within other XML processing languages such as XQuery and XProc. You can find the W3C specification at http://www.w3.org/TR/xslt-xquery-serialization/.
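As a minimal sketch of what controlling serialization from within a stylesheet can look like, the following stylesheet states its serialization preferences in an xsl:output declaration. The template body and the report element name are illustrative assumptions, not taken from the text.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Ask the serializer for indented XML encoded as UTF-8.
       These attributes are requests to the serializer; they do not
       change the result tree that the templates construct. -->
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

  <!-- Hypothetical template: wraps a copy of the input's outermost
       element in a <report> element. -->
  <xsl:template match="/">
    <report>
      <xsl:copy-of select="*"/>
    </report>
  </xsl:template>

</xsl:stylesheet>
```

If the processor is run without a serializer (for example, when the result tree is passed directly to another application), the xsl:output declaration has no effect.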

Serialization is controlled by a set of parameters, each of which has a name and a value. The most important parameter is method, which takes one of the values xml, html, xhtml, or text. This determines which serialization method is used (user-defined or vendor-defined serialization methods are also allowed, but are outside the scope of this book). When serialization is invoked from XSLT, the serialization parameters are generally controlled using the attributes of the <xsl:output> or <xsl:result-document> elements described in Chapter 6. It is often possible, however, to set further parameters from the invoking application, or as options on the command line.
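The sketch below shows one way the method parameter might be set for different result documents within a single stylesheet: once through a named xsl:output declaration, and once directly on xsl:result-document. The output definition name web-page, the file names, and the element content are all hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- A named output definition: a reusable set of serialization parameters. -->
  <xsl:output name="web-page" method="html" encoding="UTF-8" indent="yes"/>

  <!-- The unnamed output definition applies to the principal result document. -->
  <xsl:output method="xml" indent="no"/>

  <xsl:template match="/">

    <!-- Secondary result, serialized as HTML by referring to the
         named output definition through the format attribute. -->
    <xsl:result-document href="summary.html" format="web-page">
      <html>
        <head><title>Summary</title></head>
        <body>
          <p><xsl:value-of select="count(//*)"/> elements</p>
        </body>
      </html>
    </xsl:result-document>

    <!-- Secondary result, serialized as text; here the method
         parameter is set directly on the instruction. -->
    <xsl:result-document href="summary.txt" method="text">
      <xsl:value-of select="count(//*)"/>
    </xsl:result-document>

    <!-- Principal result, serialized with the unnamed definition. -->
    <done/>

  </xsl:template>

</xsl:stylesheet>
```

How the relative href values are resolved, and whether parameters can also be overridden from the command line, depends on the processor in use.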

In this chapter, we will start by examining each of the four output methods in turn: XML, HTML, XHTML, and TEXT. Then we'll look at other serialization capabilities in the XSLT specification, notably character maps and disable-output-escaping.

Details of the syntax of elements such as <xsl:output>, <xsl:result-document>, and <xsl:character-map> are found in the appropriate alphabetical sections in Chapter 6.

The XML Output Method

When the output method is xml, the output file will usually be a well-formed XML document, but the actual requirement is that it should be either a well-formed XML external general parsed entity or a well-formed XML document entity, or both.

An external general parsed entity is something that could be incorporated into an XML document by using an entity reference such as &doc;. The following example shows a well-formed external general parsed entity that is not a well-formed document:
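As an illustrative sketch (the quote element name and the file name quotes.ent are hypothetical), such an entity might contain two sibling elements with no single enclosing element. That makes it acceptable as an external general parsed entity but not as a document, because a document must have exactly one outermost element:

```xml
<quote>Hello!</quote>
<quote>Goodbye!</quote>
```

Assuming the fragment is stored as quotes.ent, a well-formed document could incorporate it by declaring the entity and referencing it inside its own outermost element:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE quotes [
  <!-- Declare doc as an external general parsed entity. -->
  <!ENTITY doc SYSTEM "quotes.ent">
]>
<quotes>&doc;</quotes>
```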
