XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition (719 page)

Most of these fields are optional and repeatable. Something I haven't captured in this schema is that the GEDCOM spec also says the structure is extensible; arbitrary namespaced elements may be inserted at any point in the structure. This is typically used to contain information specific to a particular product vendor, so that GEDCOM can be used to exchange data between users of that product with no loss of information. This can be handled in XML Schema by using wildcards, but only if they appear after other elements (this restriction disappears in XML Schema 1.1).
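
As a rough illustration of the wildcard approach, the following sketch shows a cut-down record type that admits any number of elements from other namespaces after its known children. The names ExampleRec and Note are placeholders invented for this sketch, not part of the GEDCOM schema.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical record: ExampleRec and Note are placeholder names -->
  <xs:element name="ExampleRec">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Note" type="xs:string"
                    minOccurs="0" maxOccurs="unbounded"/>
        <!-- Vendor extensions: any namespaced elements, placed after the known children -->
        <xs:any namespace="##other" processContents="lax"
                minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

With processContents="lax", extension elements are validated only if the processor happens to have a declaration for them, which suits vendor-specific data that the receiving application may not know about.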

The Stylus Studio converter makes IndividualRec and all other elements into top-level element declarations in the schema. This isn't needed for validation, since in a GEDCOM file the IndividualRec will always be a child of the

element. However, it makes this element name available in stylesheets, which is a great convenience; for example, I can write a function whose parameter is declared with as="schema-element(IndividualRec)".
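
A minimal sketch of such a function follows. It assumes a schema-aware XSLT 2.0 processor, a schema file named gedcom.xsd with no target namespace, and a made-up function name and namespace (ged:field-count); none of these names come from the original text.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:ged="http://example.com/gedcom-functions"
    exclude-result-prefixes="xs ged">

  <!-- Assumed schema location; the schema is taken to have no target namespace -->
  <xsl:import-schema schema-location="gedcom.xsd"/>

  <!-- Hypothetical function: the parameter must be a validated IndividualRec element -->
  <xsl:function name="ged:field-count" as="xs:integer">
    <xsl:param name="person" as="schema-element(IndividualRec)"/>
    <!-- Counts the child elements of the record -->
    <xsl:sequence select="count($person/*)"/>
  </xsl:function>

</xsl:stylesheet>
```

Calling the function with anything other than a validated IndividualRec element is reported as a type error, which is the convenience that the top-level declaration buys.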

Having made IndividualRec a top-level element declaration, there seems to be nothing that would be gained by naming its complex type as a top-level type definition. In general, the only types that are worth naming as top-level types are those that are used in more than one place, or at least look likely to be used in more than one place.

For the child elements of IndividualRec, the converter chose to use a global element declaration referring to a local (anonymous) type. There's nothing absolute about this; one could equally use a local element with a global type. As far as validation is concerned, you could also use a local element with an anonymous type, but this is not a good idea if you want to reference the schema from a stylesheet. When it comes to writing an XSLT stylesheet, it's important that where a data element such as Date appears in several places, it should either use a global element declaration or a global type definition, so that you can reference one or the other when you declare variables and parameters, and when you write match patterns.
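
The following sketch shows what such a reference can look like in a match pattern. It assumes a schema-aware processor and a schema file named gedcom.xsd with no target namespace; the output element name and the type name DateType are hypothetical.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed schema location; the schema is taken to have no target namespace -->
  <xsl:import-schema schema-location="gedcom.xsd"/>

  <!-- Possible because Date has a global element declaration -->
  <xsl:template match="schema-element(Date)">
    <date><xsl:value-of select="."/></date>
  </xsl:template>

  <!-- With a global type instead (DateType is a hypothetical name),
       the pattern would be match="element(*, DateType)" -->

</xsl:stylesheet>
```

A local element with an anonymous type offers neither handle, so the stylesheet could only match it by name, without any type checking.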

There are no substitution groups in this model. They aren't needed, because the model has chosen to use generic elements like

rather than specialized types such as

and

. The need for substitution groups generally arises when there are many elements that are structurally interchangeable.

Events

An event record has this structure:


  

    

      

      

      

      

      

    

    

    

    

  


The Religion element, of course, has a special place because so many of the events affecting our forebears were recorded by the religious authorities.

Families

The third object type we will look at is the family. Here is the definition:


  

    

      

      

      

      

      

    

    

  


Again, many of the fields are common with the other two object types. The elements HusbFath, WifeMoth, and Child play a crucial role in linking the data, so we'd better open them up:





  

    

  



  

    

      

        

        

        

      

    

       



  

    

      

        

      

    

     


A

element represents the participation of an individual in a family in the role of child. The

identifies the individual concerned. The

represents the position of that child in the family (1 for the eldest child, and so on); this allows for the fact that some of the children may be unknown.

and

elements allow for detail about the relationship of the child to the father and mother; for example, the child may be the natural child of one parent and the adopted child of the other.

The

element represents the participation of an individual in a family in the role of parent. The

element provides a sequence number; for example, it allows you to say that this family is the man's second marriage, which is useful if the dates of the marriages are not known.

Now let's look quickly at the three most common (and difficult) datatypes used for properties of these objects: dates, places, and personal names.

Dates

As we've seen, GEDCOM allows any character string to be used as a date. However, much of the presentation of data depends on analyzing dates wherever possible. How is this dilemma resolved?

The Date element referenced from the Event record has a complex type, defined like this:


  

    

      

    

  


That is to say, it is a complex type with simple content: the content is a GeneralDate, and the optional attribute indicates which calendar is used. The GeneralDate can be any character string, but certain formats such as DD MMM YYYY are recommended.
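
To make the phrase "complex type with simple content" concrete, here is a sketch of how such a declaration is typically written. The attribute name Calendar and its type are assumptions, and GeneralDate is given a trivial stand-in definition only so that the fragment is self-contained.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Stand-in definition so the sketch is complete on its own -->
  <xs:simpleType name="GeneralDate">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>

  <xs:element name="Date">
    <xs:complexType>
      <xs:simpleContent>
        <!-- Text content is a GeneralDate; the extension adds an attribute -->
        <xs:extension base="GeneralDate">
          <!-- Assumed attribute name and type -->
          <xs:attribute name="Calendar" type="xs:string" use="optional"/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

The element cannot have child elements, only text and the attribute, which is what distinguishes simple content from complex content.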

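As far as validation is concerned, there isn't much point in defining a schema type for the pattern DD MMM YYYY. However, it turns out that it can be useful to define this type even if it isn't used for validation. We can define the GEDCOM date format as a union type like this:

<xs:simpleType name="GeneralDate">
  <xs:union memberTypes="StandardDate xs:string"/>
</xs:simpleType>

<xs:simpleType name="StandardDate">
  <xs:restriction base="xs:string">
    <xs:pattern value=
"[0-9]?[0-9]\s(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)\s[0-9]{4}"/>
  </xs:restriction>
</xs:simpleType>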

This type is meaningless from the point of view of validation: all strings will be considered valid. But the effect is that a date that conforms to the DD MMM YYYY pattern will be labeled as a StandardDate, while one that doesn't will be labeled only as an xs:string. This will prove useful when we write our stylesheets, because it becomes very easy to separate standard dates from nonstandard dates when we want to perform operations like date formatting and sorting. In fact, I could have usefully split dates into three categories: simple exact dates like 4 MAR 1920; inexact dates that conform to the GEDCOM syntax, such as BEF JAN 1866 (meaning some time before January 1866); and arbitrary character strings whose interpretation is left purely to the reader.
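
As a sketch of how a stylesheet might exploit this labeling, the templates below treat standard and nonstandard dates differently. This assumes a schema-aware processor, a source document that has been validated against the schema, and a schema file named gedcom.xsd with no target namespace; the output element and attribute names are invented for illustration.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed schema location; the schema is taken to have no target namespace -->
  <xsl:import-schema schema-location="gedcom.xsd"/>

  <!-- Dates whose typed value was labeled StandardDate by validation -->
  <xsl:template match="Date[data(.) instance of StandardDate]">
    <!-- Split "DD MMM YYYY" on the single whitespace separators -->
    <xsl:variable name="parts" select="tokenize(., '\s')"/>
    <date day="{$parts[1]}" month="{$parts[2]}" year="{$parts[3]}"/>
  </xsl:template>

  <!-- All other dates are passed through as free text -->
  <xsl:template match="Date">
    <date text="{.}"/>
  </xsl:template>

</xsl:stylesheet>
```

The first template wins for standard dates because a pattern with a predicate has a higher default priority than a bare element name. Without validation the typed value is untyped, so every Date would fall through to the second template.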
