XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition (535 page)

Authors: Michael Kay

, but not (and here's the surprise) when it is co.

I've written garc,on to illustrate that the c and the cedilla are two separate Unicode codepoints. But of course the cedilla is actually a nonspacing character, so in real life this string of seven codepoints would appear on the page as garçon.
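
The codepoint counts mentioned above can be checked with a small Java sketch. It assumes the cedilla is U+0327 (COMBINING CEDILLA) and uses the standard java.text.Normalizer class; the class and method names (CedillaCodepoints, codePoints, compose) are illustrative only and do not come from the text.

```java
import java.text.Normalizer;

class CedillaCodepoints {

    // "garçon" spelled with a plain c followed by U+0327 COMBINING CEDILLA
    // (the decomposed form that the text writes as garc,on).
    static final String DECOMPOSED = "garc\u0327on";

    // Number of Unicode codepoints in the string.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // Canonical composition (NFC): c + combining cedilla becomes the single
    // precomposed character U+00E7.
    static String compose(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        System.out.println(codePoints(DECOMPOSED));          // prints 7
        System.out.println(codePoints(compose(DECOMPOSED))); // prints 6
    }
}
```

Both strings look identical on the page; only the codepoint count reveals which form is in use.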

Java could instead have standardized on the composed form of the character, but the accent-blind matching would then not work: contains("garçon", "c") would be false.
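
As a rough illustration of why the choice of form matters, the sketch below applies Java's plain String.contains to both spellings. This is a simplification: String.contains compares codepoint by codepoint and is not the collation-aware contains() discussed here, so it only shows that a bare c is present in the decomposed string and absent from the composed one. The class and method names are hypothetical.

```java
class ComposedVersusDecomposed {

    // Decomposed form: c followed by U+0327 COMBINING CEDILLA.
    static final String DECOMPOSED = "garc\u0327on";

    // Composed form: the single precomposed character U+00E7.
    static final String COMPOSED = "gar\u00E7on";

    // Plain codepoint-level substring test (no collation involved).
    static boolean hasPlainC(String s) {
        return s.contains("c");
    }

    public static void main(String[] args) {
        System.out.println(hasPlainC(DECOMPOSED)); // prints true
        System.out.println(hasPlainC(COMPOSED));   // prints false
    }
}
```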

Now let's look at a case where a pair of characters represents a single collation unit. Here we turn back to Spanish, where in older publications ch collates after c and ll collates after l. We can set this up in Java by defining a RuleBasedCollator using a rule that defines c < ch < d and l < ll < m. (Modern Spanish practice follows the English collating rules, so I had to set up these rules myself.)
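
One way such a collator might be set up is sketched below. The rule string is an assumption made for illustration, not the rule used in the text: it covers lowercase unaccented letters only and is built from scratch rather than by extending a locale's existing rules. The class name TraditionalSpanishSort and the sample words are likewise hypothetical.

```java
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TraditionalSpanishSort {

    // Assumed rule string: lowercase letters only, with "ch" and "ll"
    // declared as single collation units placed after "c" and "l".
    static final String RULES =
          "< a < b < c < ch < d < e < f < g < h < i < j < k < l < ll < m "
        + "< n < o < p < q < r < s < t < u < v < w < x < y < z";

    // Returns a sorted copy of the words using the rules above.
    static List<String> sort(List<String> words) {
        try {
            RuleBasedCollator collator = new RuleBasedCollator(RULES);
            List<String> sorted = new ArrayList<>(words);
            sorted.sort(collator); // a Collator is a Comparator<Object>
            return sorted;
        } catch (ParseException e) {
            // Only reached if RULES is malformed.
            throw new IllegalStateException("Invalid collation rules", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(
            sort(Arrays.asList("dama", "chico", "cuna", "llama", "luna", "mano")));
    }
}
```

Under these rules cuna should sort before chico, and luna before llama, because ch and ll are each treated as one unit that follows every other word beginning with c or l.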
