
XML and Related Technologies certification prep,

Part 4: XML transformations


Convert, search, traverse, and format your XML data

Skill Level: Intermediate

Brian L Brinker ([email protected])


Senior IT Specialist
IBM

10 Oct 2006

When an application needs to share data with another system, it is often necessary
to transform an XML document into another XML format, governed by a differing XML
Schema or Document Type Definition (DTD). When the app is required to share or
display XML data to a user, the XML document might be transformed into HTML,
Scalable Vector Graphics (SVG), VoiceXML, plain text, or any of a large number of
human-readable formats. This tutorial deals with XML transformations, and is the
fourth in a series of five tutorials that you can use to help prepare for the IBM
certification Test 142, XML and Related Technologies.

Section 1. Before you start


In this section, you'll find out what to expect from this tutorial and how to get the
most out of it.

About this series


This series of five tutorials helps you prepare to take the IBM certification Test 142,
XML and Related Technologies, to attain the IBM Certified Solution Developer - XML
and Related Technologies certification. This certification identifies an
intermediate-level developer who designs and implements applications that make
use of XML and related technologies such as XML Schema, Extensible Stylesheet
Language Transformation (XSLT), and XPath. This developer has a strong
understanding of XML fundamentals; has knowledge of XML concepts and related

XML transformations
© Copyright IBM Corporation 1994, 2006. All rights reserved. Page 1 of 43
developerWorks® ibm.com/developerWorks

technologies; understands how data relates to XML, in particular with issues


associated with information modeling, XML processing, XML rendering, and Web
services; has a thorough knowledge of core XML-related World Wide Web
Consortium (W3C) recommendations; and is familiar with well-known, best
practices.

About this tutorial


This tutorial is the fourth in the "XML and Related Technologies certification prep"
series and focuses on XML transformations. It details how to surface data within
XML documents in any number of ways. It assumes and builds upon the groundwork
laid in Part 3, which focused on XML processing, and also in Part 2, which discussed
validation of XML through DTD and XML Schema (see Resources). Without this
foundation, XML transformations would not be possible.

This tutorial is written for programmers and scripters who have a basic
understanding of XML and whose skills and experience are at a beginning to
intermediate level. As such, you should have a general familiarity with data types,
including arrays, graphs, and trees in particular. You should also be familiar with
general programming techniques such as iteration and recursion. Although this
tutorial begins with the basics of the technologies discussed, it is not intended to be
a comprehensive reference. However, if studied well, this tutorial, combined with the
references in Resources, will provide sufficient breadth and depth to master the
transformation aspects of the XML certification exam.

Objectives
After completing this tutorial, you will:
• Understand how to use XSLT to transform XML
• Be able to do string and math operations and to search and traverse XML
with XPath
• Know how to visually format XML with CSS

Prerequisites
This tutorial is written for developers who have a background in programming or
scripting and who have an understanding of basic computer-science models and
data structures. You should be familiar with the following XML-related,
computer-science concepts: tree traversal, recursion, and reuse of data. You should
be familiar with Internet standards and concepts, such as Web browser,
client-server, documenting, formatting, e-commerce, and Web applications.

System requirements


As with Part 3 of this series, you need a Linux® or Microsoft® Windows® box with at
least 50MB of free disk space and administrative access to install software. This
tutorial uses, but does not require:
• Altova XMLSpy (The free Home Edition will suffice.)
• Microsoft™ Internet Explorer, Version 6.0 or greater
• Mozilla Firefox, Version 1.0.7 or greater

Please note that XSLT documents are XML and are therefore capable of being
edited with any text editor, such as Microsoft Notepad or Vim. It is useful, however, if
your editor has the ability to assist you in making your documents well-formed;
XMLSpy can do this and much more. CSS documents are not XML, so well-formedness
does not apply to them; use whatever text editor you prefer for these.

Section 2. XML Transformations


Transforming XML commonly involves the following technologies:
• XSLT 1.0, which converts an XML document into something else
• XPath 1.0, which searches and traverses an XML document, and also
facilitates math and string operations
• CSS, which formats XML documents

XML instance documents


This tutorial uses the same DVD catalog example XML document as in Part 3 (see
Resources). For convenience, Listing 1 shows this document again. The majority of
the transforms that you'll write for this tutorial will operate upon this document.

Listing 1. XML instance document for the DVD catalog

<?xml version="1.0"?>
<catalog>
  <dvd code="_1234567">
    <title>Terminator 2</title>
    <description>A shape-shifting cyborg is sent back
    from the future to kill the leader of the
    resistance.</description>
    <price>19.95</price>
    <year>1991</year>
  </dvd>
  <dvd code="_7654321">
    <title>The Matrix</title>
    <price>12.95</price>
    <year>1999</year>
  </dvd>
  <dvd code="_2255577" genre="Drama">
    <title>Life as a House</title>
    <description>When a man is diagnosed with terminal
    cancer, he takes custody of his misanthropic
    teenage son.</description>
    <price>15.95</price>
    <year>2001</year>
  </dvd>
  <dvd code="_7755522" genre="Action">
    <title>Raiders of the Lost Ark</title>
    <price>14.95</price>
    <year>1981</year>
  </dvd>
</catalog>

Another XML instance document for a Web site map will be shown later.

Section 3. XSLT
The wonder is that we can see these trees and not wonder more.
-- Ralph Waldo Emerson

It is sometimes said that XML is not a programming language. This would be
completely true if not for XSLT -- itself XML -- which has been shown to be Turing
complete. This means that you can use XSLT to perform any set of calculations that
you might do with any modern computer.

XSLT is basically a system for declaring what should happen when certain element
types are encountered within an XML document. XSLT is not compiled; instead, it --
along with an XML input document -- is interpreted by a stylesheet processor, such
as Xalan or Microsoft XML Core Services (MSXML). You may imagine its usage as
a mathematical function: XSLT( XML ) = output.

Because the word stylesheet appears within the name XSLT, some within the
programming community assume that XSLT has no serious capability as a
programming language -- that it is merely something akin to Cascading Style Sheets
(CSS). Nothing against CSS, but tsk-tsk. Although not as terse as most other
languages -- in large part because it must be well-formed -- XSLT (combined with
XPath, which is its means of searching and traversing XML tree structures and of
performing string and math operations) is capable of rich functionality. Surprisingly
elegant code is possible, as you'll see later when I discuss recursion.

In the sections that follow, you'll see how to employ XSLT and XPath to retrieve data
from XML documents. In most examples, this data will eventually be formatted as
HTML.

Transformations within Web browsers


Although the following code samples assume that your transforms will occur either
on a server or within an environment such as XMLSpy, the transforms that follow will
also work with minor or no modifications within the latest versions of browsers that
support XSLT, such as Mozilla Firefox and Microsoft Internet Explorer. XML
documents viewed on these browsers require a processing instruction similar to the
following, which you should place in the document prolog, just below the <?xml
version="1.0"?> declaration at the top of the XML input document. Adjust the href
attribute to the appropriate value, which can be absolute or relative:

<?xml-stylesheet type="application/xml" href="http://www.ibm.com/xslt/foo.xslt"?>
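As a minimal sketch of where the directive goes, here is a shortened version of the
DVD catalog with the processing instruction in its prolog. The file name catalog.xslt
is a hypothetical stylesheet assumed to sit in the same directory as the XML document.
The sketch also uses type="text/xsl", which to my knowledge is the value that Internet
Explorer expects and that Firefox accepts as well; treat that as an assumption to
verify against the browsers you target.

```xml
<?xml version="1.0"?>
<!-- Assumption: catalog.xslt is a stylesheet stored next to this document -->
<?xml-stylesheet type="text/xsl" href="catalog.xslt"?>
<catalog>
  <dvd code="_1234567">
    <title>Terminator 2</title>
    <price>19.95</price>
    <year>1991</year>
  </dvd>
</catalog>
```

Opening this file in a browser that supports XSLT should display the result of the
transform rather than the raw XML tree.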

The root element


The root, or top-level, element of any XSL transform, under which all other child
nodes nest, is the xsl:stylesheet or xsl:transform element. You can use
either of these, as they mean the same thing. You can write them as either:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

or:

<xsl:transform version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

You would then close these tags at the bottom of the document with
</xsl:stylesheet> or </xsl:transform>, respectively. The terms stylesheet
and transform will therefore be used interchangeably throughout this tutorial.

Template elements
The xsl:template element contains a set of rules that you apply to specified
elements within an XML input document. Although a stylesheet with no templates is
technically legal, any useful xsl:stylesheet or xsl:transform has at least one
xsl:template element. Much of the richness of XSLT programming comes from the use
of multiple templates as logical modules, each with its own purpose. You can trigger
the template elements through a match attribute or by direct invocation through a
name attribute.

XSLT within XMLSpy


Within XMLSpy, you can transform an XML document with XSLT:

1. Load the XML document within XMLSpy.

2. Load the XSLT document within XMLSpy.

3. With the XML document in focus, press the F10 key.

4. To select the XSLT document, click the Window button and
then click OK.

5. Click OK again so that the transformation results display.

Use of the match attribute is fairly straightforward. When a node that fits the
pattern given by the value of the match attribute is found, the rule is executed. For
instance, you could use the following to replace each dvd element found in the
document with a paragraph containing the string "Another DVD":

<xsl:template match="dvd"><p>Another DVD</p></xsl:template>

The output of this transform looks like this:

<?xml version="1.0" encoding="UTF-8"?>


<p>Another DVD</p>
<p>Another DVD</p>
<p>Another DVD</p>
<p>Another DVD</p>

Notice that there is one occurrence of the template rule for each dvd element in the
input document. Every time the dvd element is found in the input document, this rule
is invoked.
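The template rule above is only a fragment. As a sketch of how it might sit inside a
complete stylesheet that you can run against Listing 1, it can be wrapped in the root
element described earlier. The <p> wrapper is included here so that the result
matches the output shown.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Fires once for every dvd element in the input document -->
  <xsl:template match="dvd">
    <p>Another DVD</p>
  </xsl:template>
</xsl:stylesheet>
```

Because nothing inside this template asks the processor to look at the children of
dvd, the titles, prices, and years never reach the output.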

xsl:apply-templates and xsl:value-of


It would be useful to have template rules that respond to elements other than dvd.
In fact, you can add templates that match other elements of interest:

<xsl:template match="price"><xsl:value-of select="."/></xsl:template>


<xsl:template match="title"><xsl:value-of select="."/></xsl:template>

Context node
The context node is the element that is currently being examined.
All other elements are referenced (through XPath expressions)
relative to the context node.

The value of the select attribute of the <xsl:value-of/> tag in both cases is an
expression that gives the text value of the current element being examined, or context
node. The XPath expression "." used in the select attribute values above is the
abbreviated form of self::node(), a step along the self axis. The value of the match
attribute of the template tag is a pattern, which is written in a subset of XPath
syntax. Here, the context node is set by the template match. Other ways of
setting the context node are to use xsl:apply-templates and xsl:for-each.

The previous two template tags won't give you only the title and price values;
everything else in the XML input document will be displayed as well. This is because
XSLT's built-in template rules copy the text of every element that has no matching
template of its own to the output. To display only the title and price element
values, you need one more template rule:


<xsl:template match="dvd">
  <p>
    <xsl:apply-templates select="title"/> - $<xsl:apply-templates select="price"/>
  </p>
</xsl:template>

Some minor HTML formatting (a <p/> tag) has been added as well. The output looks
like this:

<?xml version="1.0" encoding="UTF-8"?>


<p>Terminator 2 - $19.95</p>
<p>The Matrix - $12.95</p>
<p>Life as a House - $15.95</p>
<p>Raiders of the Lost Ark - $14.95</p>
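Another way to keep unwanted text out of the result, shown here only as a sketch and
not as the approach this tutorial takes, is to override the built-in rule for text
nodes with an empty template. The example below prints just the titles; every other
piece of text in the catalog is silently dropped.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Empty template: replaces the built-in rule that copies text nodes -->
  <xsl:template match="text()"/>
  <!-- value-of reads the string value of title directly, so it is unaffected
       by the empty text() template above -->
  <xsl:template match="title">
    <p><xsl:value-of select="."/></p>
  </xsl:template>
</xsl:stylesheet>
```

The trade-off is that this stylesheet relies on the built-in element rule to walk the
tree, whereas the dvd template above states explicitly which children to visit and in
what order.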

Calling templates
A variation on the previous approach that opens up other possibilities is to explicitly
call templates that display the child title and price elements of the dvd element.
In this case, the entire transform looks like this:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="dvd">
    <p>
      <xsl:call-template name="Title"/>
      <xsl:call-template name="Price"/>
    </p>
  </xsl:template>
  <xsl:template name="Title"><xsl:value-of select="title"/> - </xsl:template>
  <xsl:template name="Price">$<xsl:value-of select="price"/></xsl:template>
</xsl:stylesheet>

For readability, the hyphen for the title and the dollar sign for the price have been
moved to the named templates, Title and Price. This transform produces the same
output as the preceding one:

<?xml version="1.0" encoding="UTF-8"?>


<p>Terminator 2 - $19.95</p>
<p>The Matrix - $12.95</p>
<p>Life as a House - $15.95</p>
<p>Raiders of the Lost Ark - $14.95</p>

The name attribute of the xsl:call-template tags references templates whose
name attribute has the same value: either "Title" or "Price". A template that
you invoke through its name attribute instead of through a match attribute is called
a named template. Named templates allow for modular coding. As you'll see later,
you can use named templates in combination with <xsl:import/> or
<xsl:include/>, which pull in external files -- allowing templates to be swapped
in and out as needed. Another important use for named templates within XSLT is
that they support recursive programming.
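To give a rough idea of the modular style mentioned above, the following sketch moves
the Title and Price templates out of the main stylesheet. The file name
dvd-templates.xslt is hypothetical; it is assumed to be a complete stylesheet of its
own that contains the two named templates from the previous example.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Assumption: dvd-templates.xslt defines the named templates Title and Price.
       xsl:include must appear as a direct child of the root element. -->
  <xsl:include href="dvd-templates.xslt"/>
  <xsl:template match="dvd">
    <p>
      <xsl:call-template name="Title"/>
      <xsl:call-template name="Price"/>
    </p>
  </xsl:template>
</xsl:stylesheet>
```

Pointing the href at a different file that defines templates with the same names
changes how titles and prices are rendered without touching the dvd template.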

Iteration
Continuing with the example input document catalog.xml, let's look at ways to repeat
a template rule across a set of elements. In this case, the context node will be the
root, or catalog, element. Instead of the method shown previously of matching
elements to trigger the enforcement of template rules, you can use iteration with the
<xsl:for-each/> tag.

First, look at how you might use xsl:for-each to loop through each of the dvd
elements in the catalog:

<xsl:template match="catalog">
  <html>
    <body>
      <xsl:for-each select="dvd">
        <xsl:call-template name="DVD"/>
      </xsl:for-each>
    </body>
  </html>
</xsl:template>

Notice that some HTML has been added for formatting within a browser. Looking at
the contents of the xsl:for-each tag, you can see that a named template, DVD, is
called. The DVD template might then have the following display logic:

<xsl:template name="DVD">
  <xsl:variable name="label" select="@code"/>
  <p>
    <img src="images/{$label}.gif" alt=""/>
    <xsl:value-of select="title"/>
  </p>
</xsl:template>

Variables within XSLT


It is worth mentioning that xsl:variable is immutable -- that is,
you cannot change its value within the same scope. (There is no
xsl:constant element, although that might have been a more
appropriate name.)

Most new XSLT programmers find this behavior annoying, but there
are benefits to be gained from it. Consensus within the
programming community has it that strictly speaking, XSLT is not a
functional programming language -- however, it does bear some
resemblance to one. One of the characteristics that XSLT shares
with functional languages is that it prohibits side effects in variables.
This allows XSLT programmers to know that once a template
returns values correctly, it can be trusted to do so from then on
since no other templates can change the values of any "variables"
that template is using.

However, even though variables don't change in XSLT, you're able
to declare a variable each time the DVD template is called in the
example, because it constitutes a different scope each time.
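A small sketch may make the scoping rule more concrete. Here the variable is declared
inside the loop body, so each pass through the xsl:for-each creates a new binding.
The text output method and the line breaks are choices made only for this
illustration.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="catalog">
    <xsl:for-each select="dvd">
      <!-- A fresh binding of "label" is created on every pass through the loop -->
      <xsl:variable name="label" select="@code"/>
      <!-- Declaring a second variable named "label" at this point, in the same
           scope, is an error in XSLT 1.0 -->
      <xsl:value-of select="$label"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Run against Listing 1, this prints the four code values, one per line, in document
order.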


The previous example reveals a few new concepts. First, there is an
xsl:variable named label. Notice that this variable gets the dvd element's
code attribute value, denoted by the "@" prefix. Also, notice that the
xsl:for-each shifts the context node to each dvd element in turn. You can see
this in the previous DVD template, since you're able to directly reference both the
@code attribute and the value of the title element -- an attribute and a child,
respectively, of the dvd element. These relative XPath expressions work only
because the context node is the current dvd element.

Second, the new label variable is used in the creation of an <img/> tag for the
slick, new HTML formatting. (This assumes the presence of an image within
the images/ directory for each DVD, with the DVD's code attribute value as its
file name.) Notice that to get the value of the variable inside an attribute, you put
a dollar sign ("$") in front of its name and then wrap it with curly braces ("{" and
"}"). This construct is called an attribute value template; it has the same effect as
using the xsl:value-of element. (Of course, you do not need to create
the variable in this example; you might instead just directly reference the @code
attribute's value within the img tag like this: <img src="images/{@code}.gif"
alt=""/>.)
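For comparison, the curly-brace shorthand can also be written out in long form with
the xsl:attribute element. The sketch below builds the same src attribute that way;
it uses a matched dvd template instead of the named DVD template purely to keep the
example self-contained.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <xsl:template match="dvd">
    <p>
      <img alt="">
        <!-- Long form of src="images/{@code}.gif" -->
        <xsl:attribute name="src">images/<xsl:value-of select="@code"/>.gif</xsl:attribute>
      </img>
      <xsl:value-of select="title"/>
    </p>
  </xsl:template>
</xsl:stylesheet>
```

The long form is more verbose, but it becomes useful when the attribute value depends
on logic that does not fit inside a single XPath expression.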

Putting the pieces together looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <xsl:template match="catalog">
    <html>
      <body>
        <xsl:for-each select="dvd">
          <xsl:call-template name="DVD"/>
        </xsl:for-each>
      </body>
    </html>
  </xsl:template>
  <xsl:template name="DVD">
    <xsl:variable name="label" select="@code"/>
    <p>
      <img src="images/{$label}.gif" alt=""/>
      <xsl:value-of select="title"/>
    </p>
  </xsl:template>
</xsl:stylesheet>

The preceding transform then produces the following output:

<html>
<body>
<p>
<img alt="" src="images/_1234567.gif">Terminator 2</p>
<p>
<img alt="" src="images/_7654321.gif">The Matrix</p>
<p>
<img alt="" src="images/_2255577.gif">Life as a House</p>
<p>
<img alt="" src="images/_7755522.gif">Raiders of the Lost Ark</p>
</body>
</html>


Formatting output
You might have noticed that the previous output no longer has the <?xml?>
declaration at the top. This is because of two things -- either one of which would
cause the output to be treated as HTML. First, the tag <xsl:output method="html"/>
was added just after the opening <xsl:stylesheet> tag. The method attribute of
xsl:output typically has either this value (html) or xml; text is a third option.
Another useful attribute of the xsl:output tag is encoding, which defines the
character set of the output. Second, an XSLT 1.0 processor that finds no output
method specified defaults to HTML when the root element of the result is <html>,
as it is in the matched template here.
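As a sketch of the xml output method with a few of its companion attributes, the
following stylesheet writes a small XML document that lists only the titles. The
result element names (titles and title) and the choice of ISO-8859-1 are assumptions
made for the illustration; encoding and indent are both attributes that XSLT 1.0
defines for xsl:output.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- XML output with an explicit character set and indentation requested -->
  <xsl:output method="xml" encoding="ISO-8859-1" indent="yes"/>
  <xsl:template match="catalog">
    <titles>
      <xsl:for-each select="dvd">
        <title><xsl:value-of select="title"/></title>
      </xsl:for-each>
    </titles>
  </xsl:template>
</xsl:stylesheet>
```

With method="xml", the declaration at the top of the result reappears and carries the
requested encoding. How much whitespace indent="yes" adds is left to the processor.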

Sorting results
Document order
Document order is simply the order in which elements appear
within an XML document. Under this scheme, parent elements
come before their children; this makes the root element the first
element in any document.

A useful accessory to xsl:for-each is xsl:sort. (It also works with
xsl:apply-templates.) You may use this tag as the first child of an
xsl:for-each to alter the order of the results from document order to something
else, based on some key. For instance, the xsl:for-each in the previous example
could use xsl:sort to order the list of DVDs by the value of their title element:

<xsl:for-each select="dvd">
<xsl:sort select="title"/>
<xsl:call-template name="DVD"/>
</xsl:for-each>

Incorporating the preceding xsl:sort into the previous transform gives
the following output:

<html>
<body>
<p>
<img alt="" src="images/_2255577.gif">Life as a House</p>
<p>
<img alt="" src="images/_7755522.gif">Raiders of the Lost Ark</p>
<p>
<img alt="" src="images/_1234567.gif">Terminator 2</p>
<p>
<img alt="" src="images/_7654321.gif">The Matrix</p>
</body>
</html>

Notice that the DVDs are now arranged alphabetically by title. You may put
additional xsl:sort elements after the first one to provide finer-grained sorts of the
result set.
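The following sketch shows what a pair of sort keys might look like. It orders the
DVDs by year, newest first, and falls back to the title when two years are equal. The
data-type and order attributes are standard xsl:sort attributes; without
data-type="number" the years would be compared as text.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <xsl:template match="catalog">
    <html>
      <body>
        <xsl:for-each select="dvd">
          <!-- Primary key: release year, compared as a number, newest first -->
          <xsl:sort select="year" data-type="number" order="descending"/>
          <!-- Secondary key: title, consulted only when two years are equal -->
          <xsl:sort select="title"/>
          <p><xsl:value-of select="year"/>: <xsl:value-of select="title"/></p>
        </xsl:for-each>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

With the four DVDs in Listing 1, the years are all different, so the result is
ordered 2001, 1999, 1991, 1981, and the secondary key never comes into play.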


Recursion
To implement the previous DVD list transform through recursion might seem a little
strange because:

• The iterative solution shown using xsl:for-each is more intuitive to
programmers who are accustomed to imperative languages such as the
Java™ programming language
• This particular situation is more suited to a simple for-loop because it
simply requires traversing a set of sibling nodes.

However, in cases where elements can nest to an unknown number of levels,
recursion is perfect for the task. For instance, a file system -- or, as will be seen
later, topic pages nested within a Web site -- is a perfect candidate. For
comparison's sake, the following code shows a recursive solution to the previous
problem, sans sort:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <xsl:template match="catalog">
    <html>
      <body>
        <xsl:call-template name="DVD">
          <xsl:with-param name="dvdCount" select="count(dvd)"/>
        </xsl:call-template>
      </body>
    </html>
  </xsl:template>
  <xsl:template name="DVD">
    <xsl:param name="position" select="1"/>
    <xsl:param name="dvdCount"/>
    <xsl:variable name="label" select="dvd[$position]/@code"/>
    <p>
      <img src="images/{$label}.gif" alt=""/>
      <xsl:value-of select="dvd[$position]/title"/>
    </p>
    <xsl:if test="$position &lt; $dvdCount">
      <xsl:call-template name="DVD">
        <xsl:with-param name="position" select="$position+1"/>
        <xsl:with-param name="dvdCount" select="$dvdCount"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>

This transform yields the same output as the unsorted iterative solution:

<html>
<body>
<p>
<img alt="" src="images/_1234567.gif">Terminator 2</p>
<p>
<img alt="" src="images/_7654321.gif">The Matrix</p>
<p>
<img alt="" src="images/_2255577.gif">Life as a House</p>
<p>
<img alt="" src="images/_7755522.gif">Raiders of the Lost Ark</p>
</body>
</html>


Tail recursion
Because the recursive call to itself happens at the end of the DVD
template in the example, this kind of recursion is called tail
recursion. Tail recursion can be less computationally expensive
than other kinds of recursion. This is because some stylesheet
processors are able to optimize tail recursion by converting it to
iteration.

Here, the recursion starts within the matched template at the top of the transform
through a call to the DVD template. The DVD template then calls itself until the dvd
element that it is working with is the last one.

A few additional things are worth looking at in this example. For one thing, the
context node is never shifted to any of the dvd elements. This is because
xsl:call-template doesn't change the context node like xsl:for-each does.
Because of this, the variable defined within the DVD template must reference the
current dvd element with the help of an xsl:param element, position, which
holds the position of the dvd element that is currently being examined. Parameters
can pass from a template to the named template that it calls through the
<xsl:with-param/> tag, which appears as a child element of the
<xsl:call-template/> tag (or of <xsl:apply-templates/>). These parameters are
picked up in the called template through the <xsl:param/> tag, which must always
come immediately after the opening <xsl:template> tag.

You might notice that the xsl:param, position, inside of the DVD template, has
a default value indicated by the select attribute. For this reason, it is not necessary
to use a corresponding xsl:with-param in the first call to the DVD template from
the matched template. Use of default values for xsl:param tags is a good practice;
it can help to prevent unexpected behaviors within called templates. (Were it not for
the need to contrast the two parameters, the dvdCount parameter would also have
a default value.)

Notice too within the xsl:call-template at the end of the DVD template that the
position parameter is incremented as it is passed to the next instance of the DVD
template. As seen within the test attribute of <xsl:if/>, the values of the two
parameters are used as the basis for the stop condition of the recursion.

The position parameter is also used within the DVD template to indicate which
dvd element is being examined. This is done through the predicate used to qualify
the dvd element within the select attribute of both the xsl:variable and
xsl:value-of tags. Both of these are XPath expressions, which you will explore
more fully later in this tutorial. XPath predicates are noted between square brackets
("[" and "]") and immediately follow the element that they qualify. Notice that in this
case, the position parameter is used within the predicate to indicate the position
of the currently examined dvd element: dvd[$position].
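To show a few more shapes that a predicate can take, here is a short sketch that runs
against Listing 1. The text output method and the line breaks are choices made for
the illustration only.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="catalog">
    <!-- Positional predicate: the first dvd child -->
    <xsl:value-of select="dvd[1]/title"/>
    <xsl:text>&#10;</xsl:text>
    <!-- last() gives the position of the final node in the set -->
    <xsl:value-of select="dvd[last()]/title"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Attribute test: the dvd whose genre attribute is "Drama" -->
    <xsl:value-of select="dvd[@genre='Drama']/title"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Child-value test; note the escaped less-than sign -->
    <xsl:value-of select="count(dvd[price &lt; 15])"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

With the catalog from Listing 1, the four lines of output are Terminator 2, Raiders
of the Lost Ark, Life as a House, and 2 (The Matrix and Raiders of the Lost Ark both
cost less than 15).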

The <xsl:if/> element is pretty straightforward in its usage. It has a condition that
is evaluated within its mandatory test attribute for a Boolean (true or false) result.
In case you wondered, there is no <xsl:else/> or <xsl:elseif/> tag. Instead of those,
you use <xsl:choose/>, which is described later. Lastly, notice that the less-than
sign within the test attribute of the <xsl:if/> is XML-escaped (&lt;). The use of
a non-escaped less-than sign (<) would make the stylesheet itself malformed and
cause the stylesheet processor to throw an exception, since this character marks the
beginning of an XML tag. (The greater-than sign (>), which marks the end of a tag,
may appear unescaped in an attribute value.)
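As a brief preview of the if/else pattern, the sketch below uses xsl:choose with two
xsl:when branches and an xsl:otherwise fallback. The price thresholds and the labels
bargain, standard, and premium are invented for this example.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="catalog">
    <xsl:for-each select="dvd">
      <xsl:value-of select="title"/>
      <xsl:text>: </xsl:text>
      <!-- The first xsl:when whose test is true wins; the rest are skipped -->
      <xsl:choose>
        <xsl:when test="price &lt; 15">bargain</xsl:when>
        <xsl:when test="price &lt; 19">standard</xsl:when>
        <xsl:otherwise>premium</xsl:otherwise>
      </xsl:choose>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

For Listing 1, this labels Terminator 2 as premium, The Matrix and Raiders of the
Lost Ark as bargain, and Life as a House as standard.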

Recursion for a Web site map


Now look at an example to which recursion is better suited. The sample XML
document in Listing 2 shows a site map for a fictitious Web site. You need to
construct a transform to render a site map that will start from the upper-most level of
pages and then recursively list all child pages -- and their child pages -- to any
depth. The result of this transform will be an HTML document.

Listing 2. XML instance document for a Web site map

<?xml version="1.0"?>
<site>
  <page label="A" href="0.html">
    <page label="AA" href="0_0.html">
      <page label="AAA" href="0_0_0.html">
        <page label="AAAA" href="0_0_0_0.html"/>
        <page label="AAAB" href="0_0_0_1.html"/>
        <page label="AAAC" href="0_0_0_2.html"/>
      </page>
      <page label="AAB" href="0_0_1.html"/>
      <page label="AAC" href="0_0_2.html"/>
    </page>
    <page label="AB" href="0_1.html"/>
    <page label="AC" href="0_2.html"/>
  </page>
  <page label="B" href="1.html"/>
  <page label="C" href="2.html"/>
</site>

The sample site map in Listing 2 is four levels deep and suggests HTML pages
nested by topic. The three top-level pages have label attribute values A, B, and C.
Page A has three child pages (AA, AB, and AC), and its first child has three child
pages (AAA, AAB, and AAC). Finally, page AAA has child pages (AAAA through
AAAC). This tree needs to be rendered as HTML.

To show the site map, let's build a set of nested HTML unordered lists -- one for
each navigation level. To construct these lists, iterate through one navigation level at
a time, starting from the top level. If you find that a page has child pages, add that
list of child pages after the current page and then continue iterating through the
current navigation level. In pseudocode, the algorithm looks like this:

Build List;
Build List {
  For each page {
    Write page;
    If (page has child pages)
      Build List;
  }
}


This algorithm processes a navigation tree to any number of levels using a small
amount of code (small for XSLT, anyway). The brevity of the XSLT code required
demonstrates how well suited it is to this kind of task. Add a dash of HTML syntax,
and a transform using the previous logic follows. Comments are added to tie it to the
algorithm:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <xsl:template match="site">
    <html>
      <body>
        <!-- Make the initial template call -->
        <xsl:call-template name="BuildList"/>
      </body>
    </html>
  </xsl:template>
  <xsl:template name="BuildList">
    <ul>
      <xsl:for-each select="page">
        <!-- Write the current page -->
        <li>
          <a href="{@href}">
            <xsl:value-of select="@label"/>
          </a>
        </li>
        <!-- If the page has child pages, -->
        <xsl:if test="count(page) > 0">
          <!-- Build that list of pages, starting the process again... -->
          <xsl:call-template name="BuildList"/>
        </xsl:if>
      </xsl:for-each>
    </ul>
  </xsl:template>
</xsl:stylesheet>

Notice that the recursive call is the last instruction within the body of the
xsl:for-each. Also, if the current page has no child pages, a stop condition for the
recursion is met, and the recursive call to the BuildList template is not made. The
HTML output of this transform then looks like this:

<html>
<body>
<ul>
  <li><a href="0.html">A</a></li>
  <ul>
    <li><a href="0_0.html">AA</a></li>
    <ul>
      <li><a href="0_0_0.html">AAA</a></li>
      <ul>
        <li><a href="0_0_0_0.html">AAAA</a></li>
        <li><a href="0_0_0_1.html">AAAB</a></li>
        <li><a href="0_0_0_2.html">AAAC</a></li>
      </ul>
      <li><a href="0_0_1.html">AAB</a></li>
      <li><a href="0_0_2.html">AAC</a></li>
    </ul>
    <li><a href="0_1.html">AB</a></li>
    <li><a href="0_2.html">AC</a></li>
  </ul>
  <li><a href="1.html">B</a></li>
  <li><a href="2.html">C</a></li>
</ul>
</body>
</html>

With the links deactivated, this HTML renders in a browser like this:

• A
    • AA
        • AAA
            • AAAA
            • AAAB
            • AAAC
        • AAB
        • AAC
    • AB
    • AC
• B
• C

Recursion for a site navigation module


You can make the previous transform into a tree navigation module with the addition
of logic to compare the current page (the context node) to all page elements in the
site map. This is because each page in the Web site requires its own version of the
navigation module, based on its location within the tree. You also need some
JavaScript to control expanding and collapsing sibling page sets. When finished, the
rendered module looks something like the tree widgets in Microsoft Windows
Explorer and in the Eclipse Navigator view.

When first rendered, the HTML elements that represent all but the top-tier pages
have their style's display property set to "none" (more about this in the CSS section
later). You then use JavaScript logic on the browser to expand the HTML tag
representing each ancestor page of the context node by setting the style display
property value to "block". To make this behavior page-specific, you can assume
that the @href attribute of each page element is unique. You can then use an
xsl:variable to hold the value of the context node's @href attribute. Then, as
each page element receives the context through iteration within each navigation tier,
you can compare its @href attribute to the xsl:variable that holds the @href for
the page being rendered. In this way, links for the current page and all pages above
it are displayed.

It is also appropriate to show the child pages of the current page, if there are any.
You can use either the HTML unordered list elements shown previously or HTML
<div/> tags to hold each navigation tier. These elements require unique id
attribute values so that you can programmatically toggle their style properties
between display:none and display:block. If you use <div/> tags, they will
need a non-zero padding-left or margin-left style value to provide indentation.
It is left as an exercise to implement these hints. However, with a few images added
for glitz, the transform output can look like the tree navigation widget shown in
Figure 1. The image is the rendered HTML output of a recursive XSLT transform,
which is the basis of most of the hints given. However, instead of nested elements, I
used sibling elements with pointer-to-parent relationships implemented through a
pair of attributes. The HTML output shown in Figure 1 was rendered in Internet
Explorer 6.0.
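
One possible starting point for these hints is sketched here. It rests on assumptions that the tutorial does not state: the href of the page being rendered arrives as a top-level parameter named currentHref (a hypothetical name with an arbitrary default), each navigation tier is a ul element, and the tier ids come from generate-id(). It also swaps in a different technique for the first step: rather than expanding tiers with JavaScript, the transform itself writes the initial display value for each tier.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Assumption: the caller passes the @href of the page being rendered.
       The parameter name and its default value are hypothetical. -->
  <xsl:param name="currentHref" select="'0_0.html'"/>

  <xsl:template match="site">
    <html>
      <body>
        <xsl:call-template name="BuildList"/>
      </body>
    </html>
  </xsl:template>

  <!-- The context node is the parent (site or page) whose children are listed -->
  <xsl:template name="BuildList">
    <ul id="{generate-id()}">
      <xsl:attribute name="style">
        <xsl:choose>
          <!-- Show the top tier, plus the child tier of the current page
               and of each of its ancestors -->
          <xsl:when test="self::site or descendant-or-self::page[@href = $currentHref]">display:block</xsl:when>
          <xsl:otherwise>display:none</xsl:otherwise>
        </xsl:choose>
      </xsl:attribute>
      <xsl:for-each select="page">
        <li>
          <a href="{@href}">
            <xsl:value-of select="@label"/>
          </a>
        </li>
        <!-- Stop condition: recurse only when child pages exist -->
        <xsl:if test="page">
          <xsl:call-template name="BuildList"/>
        </xsl:if>
      </xsl:for-each>
    </ul>
  </xsl:template>
</xsl:stylesheet>
```

The unique id on each tier is what a script would need in order to toggle that tier between display:none and display:block after the page loads.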

Figure 1. Rendered navigation module

In the rendered HTML (not the image shown in Figure 1), clicking on the minus and
plus icons depicted in Figure 1 collapses and expands the subtree beneath them,
while clicking on the node text navigates you to that page. (All nodes in Figure 1 are
shown expanded to reveal the entire tree.) After the destination page is loaded
within the browser, all nodes are initially collapsed so that only the top-tier pages are
shown. Then, the tree is expanded from the current page upward (leftward, really) to
the top tier so that all ancestor pages and the current page are shown. Expanding
the tree to the current page also reveals any child pages it might have.

Stop condition
One last recursion hint: Much like the brakes on a car, the stop
condition is vital -- so understand and code it first.

Because of the nested nature of XML documents, combined with the fact that
xsl:variable elements are immutable, you'll find it beneficial to think and code in
recursion to solve most problems in XSLT. In general, the trick to doing this is just to
design your templates so that, like the BuildList template, they are general enough to
apply to all cases of the problem at hand. If they are, they can then just call
themselves as needed.
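
As a small illustration of this way of thinking, the following sketch writes a string a given number of times, a job that a loop counter would handle in other languages. Because a variable cannot be reassigned, the remaining count is passed down as a parameter instead. The template name Repeat and its parameters text and count are hypothetical, and the stop condition is written first, as the sidebar recommends.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Prefix each page label with one marker per navigation tier -->
    <xsl:for-each select="//page">
      <xsl:call-template name="Repeat">
        <xsl:with-param name="text" select="'--'"/>
        <xsl:with-param name="count" select="count(ancestor::page)"/>
      </xsl:call-template>
      <xsl:value-of select="@label"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

  <!-- Hypothetical helper: writes $text $count times -->
  <xsl:template name="Repeat">
    <xsl:param name="text"/>
    <xsl:param name="count" select="0"/>
    <!-- Stop condition: once the count reaches zero, nothing is written
         and no further call is made -->
    <xsl:if test="$count &gt; 0">
      <xsl:value-of select="$text"/>
      <xsl:call-template name="Repeat">
        <xsl:with-param name="text" select="$text"/>
        <xsl:with-param name="count" select="$count - 1"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

Run against the site map, a page two tiers down, such as AAA, should come out as ----AAA. Note the spaces in $count - 1; without them, the hyphen would be read as part of the variable name.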

Conditional logic
Returning to the DVD catalog example, suppose that you want to base some aspect
of a report of the DVDs within the catalog on the price of each DVD. Instead of
displaying the price, you might instead want to categorize the DVD by cost. To do
this, you'll use xsl:choose.

Let's modify the iterative solution used previously to show each dvd element by
calling the Price template that you wrote earlier:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <xsl:template match="catalog">
    <html>
      <body>
        <xsl:for-each select="dvd">
          <xsl:call-template name="DVD"/>
        </xsl:for-each>
      </body>
    </html>
  </xsl:template>
  <xsl:template name="DVD">
    <p>
      <xsl:value-of select="title"/>
      <xsl:call-template name="Price"/>
    </p>
  </xsl:template>
  <xsl:template name="Price">$<xsl:value-of select="price"/></xsl:template>
</xsl:stylesheet>

Now, modify the Price template to display one of three adjectives ("Pricey!,"
"Cheap!," and "So so") instead of the dollar amount:

<xsl:template name="Price">
  <xsl:choose>
    <!-- Within an attribute value, < must be escaped as &lt; -->
    <xsl:when test="price &lt; 15.00"> - Cheap!</xsl:when>
    <xsl:when test="price > 19.00"> - Pricey!</xsl:when>
    <xsl:otherwise> - So so</xsl:otherwise>
  </xsl:choose>
</xsl:template>

Notice that xsl:choose has two kinds of child elements: xsl:when, which behaves
much like xsl:if and must appear at least once, and xsl:otherwise, which is optional.
Only the first xsl:when whose test is true is used. The output of this transform using
xsl:choose looks like the following:

<html>
  <body>
    <p>Terminator 2 - Pricey!</p>
    <p>The Matrix - Cheap!</p>
    <p>Life as a House - So so</p>
    <p>Raiders of the Lost Ark - Cheap!</p>
  </body>
</html>

Like most XSLT coders, you'll be less than thrilled because there are no xsl:else and
xsl:else-if elements -- and xsl:choose is so verbose. XSLT 2.0 has a solution for
this, but I won't discuss it in this tutorial. XML in a Nutshell, 3rd Edition (see
Resources) has sections within most chapters that discuss the differences between
versions 1.0 and 2.0 of XSLT and XPath.

Importing and including templates from other files


The two elements xsl:import and xsl:include allow the addition of templates
from other files to your transform. These elements are both allowed only as children
of the <xsl:stylesheet/> or <xsl:transform/> element. In addition, xsl:import
elements must come before any other top-level elements. Syntax for these is as follows:

<xsl:import href="URI"/>

and

<xsl:include href="URI"/>

The URI value for the href attributes refers to the path and name of the file
containing the transform that is being added. The URI value can be relative or
absolute. The difference between these two elements is that templates incorporated
through xsl:import are allowed to have name conflicts with templates in the importing
transform, in which case the imported templates are ignored. With xsl:include, named
templates belonging to those transforms that are brought in cannot clash with any in
the including transform. These files are simply copied into the current transform at the
point of the xsl:include. For both of these elements, circular references between
imported and importing files are not allowed. As with most other things XML, the
nesting of files through xsl:import and xsl:include is allowed.
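
The override behavior of xsl:import can be sketched as follows. The file names main.xsl and common.xsl are hypothetical, and the sketch assumes that common.xsl holds the DVD and Price named templates from the conditional logic example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- main.xsl (hypothetical). Assumes a sibling file, common.xsl, that
     defines the named templates DVD and Price. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- xsl:import must precede all other top-level elements -->
  <xsl:import href="common.xsl"/>
  <xsl:output method="html"/>

  <xsl:template match="catalog">
    <html>
      <body>
        <xsl:for-each select="dvd">
          <!-- DVD is found in the imported file -->
          <xsl:call-template name="DVD"/>
        </xsl:for-each>
      </body>
    </html>
  </xsl:template>

  <!-- Same name as the imported Price template. This definition has the
       higher import precedence, so the imported one is ignored. -->
  <xsl:template name="Price"> - $<xsl:value-of select="price"/></xsl:template>
</xsl:stylesheet>
```

If the xsl:import were changed to xsl:include, the two Price templates would have the same precedence, and the duplicate name would be an error.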

XSLT summary
XSLT is a capable programming language that also happens to bear the novelty of
being well-formed. As you'll see in the following section, XSLT combined with XPath
can solve pretty much any computational or formatting problem. Tricks to mastering
XSLT include gaining an appreciation for nesting named template calls and
becoming fond of recursion.

Section 4. XPath
A seed hidden in the heart of an apple is an orchard invisible.
-- Welsh proverb

In the previous section, you saw how to use XPath to reference the elements within
the sample XML documents. For example, you saw that in order to see the elements
within the trees, XPath expressions such as / (the document root), page (the
expected child element type), and .. (the parent of the context node) are used. As
you'll see in the following sections, it is possible to access any part of an XML
document through the appropriate combination of XPath expressions.

XPath also features set, string, and math functions. While these are not necessarily
on par with most other scripting languages such as Perl or JavaScript in terms of
brevity or power, in most cases you can construct XSLT templates such that these
functions -- combined with the appropriate logic -- can accomplish anything you can
do in other languages. This section serves to clarify the XPath expressions seen
within the previous section on XSLT and to elaborate upon a few concepts.

Axes
A key concept within XPath is the notion of axes. An axis within XPath is the set of
nodes that lie above (as in parent and ancestor elements), below (child and
descendant elements), to the left (preceding and preceding-sibling
elements), or to the right (following and following-sibling elements) of the
context node. The preceding and following axes refer to those elements that
occur before and after the context node, respectively, in document order (excluding
its ancestors and descendants), whereas preceding-sibling and following-sibling
are limited to elements that have the same parent element as the context node. The
self axis is the context node itself
and can be combined with two other axes to form ancestor-or-self and
descendant-or-self. Access to any part of an XML document is then ensured
by the addition to these of the attribute and namespace axes.
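
To see how the axes differ, the following sketch counts the page elements found along several of them. It assumes the site map from the previous section and arbitrarily picks the page labeled AAB as the context node.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Make the page labeled AAB the context node -->
    <xsl:for-each select="//page[@label = 'AAB']">
      <xsl:text>ancestors: </xsl:text>
      <xsl:value-of select="count(ancestor::page)"/>
      <xsl:text>&#10;descendants: </xsl:text>
      <xsl:value-of select="count(descendant::page)"/>
      <xsl:text>&#10;preceding siblings: </xsl:text>
      <xsl:value-of select="count(preceding-sibling::page)"/>
      <xsl:text>&#10;following siblings: </xsl:text>
      <xsl:value-of select="count(following-sibling::page)"/>
      <xsl:text>&#10;preceding: </xsl:text>
      <xsl:value-of select="count(preceding::page)"/>
      <xsl:text>&#10;following: </xsl:text>
      <xsl:value-of select="count(following::page)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Comparing the sibling counts with the preceding and following counts shows that the latter two axes reach beyond the children of the context node's parent.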

Abbreviated and unabbreviated syntaxes


Some XPath axes can, but need not, be expressed in an abbreviated syntax that
somewhat resembles UNIX® file system shell syntax. These are the child axis (the
default axis, indicated by the desired child element type, or by an asterisk (*) to
match any child element), the parent (..), the self (.), the attribute (@), and the
descendant-or-self (//) axes. The other axes must use an unabbreviated syntax of the
form axis::node-test. You'll see this syntax in some of the examples that follow.
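
The equivalence of the two syntaxes can be checked with a sketch like the following, which evaluates each abbreviated expression next to its unabbreviated form. It assumes the site map from the previous section, with the page labeled AA chosen arbitrarily as the context node.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:for-each select="//page[@label = 'AA']">
      <!-- child axis (the default): page is short for child::page -->
      <xsl:value-of select="count(page)"/>
      <xsl:text> </xsl:text>
      <xsl:value-of select="count(child::page)"/>
      <xsl:text>&#10;</xsl:text>

      <!-- attribute axis: @label is short for attribute::label -->
      <xsl:value-of select="@label"/>
      <xsl:text> </xsl:text>
      <xsl:value-of select="attribute::label"/>
      <xsl:text>&#10;</xsl:text>

      <!-- parent axis: .. is short for parent::node() -->
      <xsl:value-of select="../@label"/>
      <xsl:text> </xsl:text>
      <xsl:value-of select="parent::node()/attribute::label"/>
      <xsl:text>&#10;</xsl:text>

      <!-- // is short for /descendant-or-self::node()/ -->
      <xsl:value-of select="count(.//page)"/>
      <xsl:text> </xsl:text>
      <xsl:value-of select="count(self::node()/descendant-or-self::node()/child::page)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Each line of output should show the same value twice, once for each syntax.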

Location steps and location path

Location steps form the basis of a location path. Each location step names an axis
and a node test, optionally followed by one or more predicates in square brackets.
The delimiter between location steps is the forward slash (/), and a location path is
a series of location steps joined in this way. For example, beginning at the context
node, a location path might go up two levels, find an element specified by a
predicate, look down within that subtree for a particular element, and return one of
its attributes:

../../page[starts-with(@label, 'A')]//page[count(ancestor::page) = 3]/@href
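
The following sketch walks through this location path step by step. It assumes the site map from the previous section and, as an arbitrary choice, starts from the page labeled AA so that going up two levels reaches the site element.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Assumption: the context node is the page labeled AA -->
    <xsl:for-each select="//page[@label = 'AA']">
      <!-- ../..                              climbs to the grandparent
           page[starts-with(@label, 'A')]     keeps its child pages whose
                                              label begins with A
           //page[count(ancestor::page) = 3]  searches those subtrees for
                                              pages with three page ancestors
           @href                              returns the href attribute of
                                              each page that was found -->
      <xsl:for-each select="../../page[starts-with(@label, 'A')]//page[count(ancestor::page) = 3]/@href">
        <xsl:value-of select="."/>
        <xsl:text>&#10;</xsl:text>
      </xsl:for-each>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Against the site map, this should write the href values of the pages AAAA, AAAB, and AAAC, which are the only pages that sit three tiers below a top-tier page.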

XPath functions
Quoting values within XPath
Notice that single quotes (') are used within the previous XPath
expression. This is because this expression would most likely
appear within some XSLT tag as the value of some attribute. The
attribute value would of course be enclosed within double quotes
("), so the use of those within something embedded, such as an
XPath expression, would break the XSLT tag.

You can accomplish string and math operations within XSLT through the functions
provided by XPath. For instance, if you need to add the values of two
xsl:variable elements x and y, you will use:

<xsl:value-of select="$x + $y"/>

This expression returns a number, as do many XPath functions. XPath provides
operators for the following math operations: addition (+), subtraction (-),
multiplication (*), division (div), and modulo, or remainder (mod).
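
A short sketch of these operators at work on the DVD catalog follows. The 10% discount is a made-up figure for illustration, and the sketch assumes that each price element holds a plain number. It also uses sum(), an XPath function, and format-number(), an XSLT function.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="catalog">
    <xsl:for-each select="dvd">
      <!-- Hypothetical 10% discount, computed with the * operator -->
      <xsl:variable name="sale" select="price * 0.9"/>
      <xsl:value-of select="title"/>
      <xsl:text>: </xsl:text>
      <xsl:value-of select="format-number($sale, '0.00')"/>
      <!-- mod yields the remainder, used here to flag every second dvd -->
      <xsl:if test="position() mod 2 = 0"> (even row)</xsl:if>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
    <!-- Division is written div because / separates location steps -->
    <xsl:text>Average price: </xsl:text>
    <xsl:value-of select="format-number(sum(dvd/price) div count(dvd), '0.00')"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```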

XPath functions can also return a Boolean value (true or false). This is important for
conditional logic:

<xsl:if test="$x > $y"> ... </xsl:if>

XPath functions can also return strings:

<xsl:value-of select="concat($x, ' - ', $y)"/>

In this case, the concat() function accepts two or more arguments. Another useful
string function is normalize-space(string), which strips leading and trailing
whitespace and reduces each internal run of whitespace characters to a single
space. The following list comprises other XPath string-related functions:
• starts-with(string, string): Returns true if the first argument
starts with the second; otherwise, it returns false. The previous example
demonstrated this.
• contains(string, string): Returns true if the first argument
contains the second one; otherwise, it returns false.
• substring-before(string, string): Returns the part of the first
argument that occurs before the first occurrence of the second.
• substring-after(string, string): Returns the part of the first
argument that occurs after the first occurrence of the second.
• substring(string, number, number?): Returns a substring of the
first argument that begins at the position indicated by the second
argument, and is of a length indicated by the optional third argument. If the
third argument is omitted, the substring runs to the end of the string. Note
that the positions within this function begin at 1, not 0.
• string-length(string?): Returns a number indicating the length of
the optional argument. If no argument is passed, the string value of the
context node is used.
• translate(string, string, string): Returns the first argument
with each character that appears in the second argument replaced by the
character at the same position in the third argument.
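
The functions in this list can be tried out with a sketch like the following. The literal strings are illustrative stand-ins for node values, and the comments state what each call writes.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- An illustrative value; any string would do -->
    <xsl:variable name="file" select="'0_0_1.html'"/>
    <!-- writes 0_0_1 -->
    <xsl:value-of select="substring-before($file, '.')"/>
    <xsl:text>&#10;</xsl:text>
    <!-- writes html -->
    <xsl:value-of select="substring-after($file, '.')"/>
    <xsl:text>&#10;</xsl:text>
    <!-- positions start at 1, so this writes 0_0 -->
    <xsl:value-of select="substring($file, 1, 3)"/>
    <xsl:text>&#10;</xsl:text>
    <!-- writes 10 -->
    <xsl:value-of select="string-length($file)"/>
    <xsl:text>&#10;</xsl:text>
    <!-- writes true -->
    <xsl:value-of select="contains($file, '_1')"/>
    <xsl:text>&#10;</xsl:text>
    <!-- replaces each _ with / and each . with a hyphen: 0/0/1-html -->
    <xsl:value-of select="translate($file, '_.', '/-')"/>
    <xsl:text>&#10;</xsl:text>
    <!-- writes The Matrix -->
    <xsl:value-of select="normalize-space('  The   Matrix ')"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Because the template matches the document root and uses only literals, it gives the same result with any XML input.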

XPath functions can also return node sets. For instance, within the example
sitemap.xml, you can find the number of page elements that have child page
elements:

<xsl:value-of select="count(//page[page])"/>

Here, a node set result is wrapped by a function that returns a number. The number
returned by the XPath count() function is 3; it counted the page elements with
label values A, AA, and AAA, each of which has child page elements. Notice that
the previous code uses the descendant-or-self axis in its abbreviated syntax (//).
When // is used with nothing before it, a recursive search is done from the document
root; such searches can be expensive within a large document, so use them with care.
Within the square brackets ([ and ]), a predicate has been placed upon these page
elements, requiring that they have child page elements. The content of this predicate
is itself an XPath expression, "page" within the child axis. Notice also that you can
extend this logic to the following expression to get the number of pages that have
child pages -- that also have child pages -- by nesting another predicate:

<xsl:value-of select="count(//page[page[page]])"/>

The result, as you might expect, is 2 -- a count of the page elements A and AA.

Suppose that you want to find how deep a given context node is within its XML tree
(its navigation tier, to use the language of the site map example). You can again use
the count() function, but this time along a different axis:

<xsl:value-of select="count(ancestor::page)"/>

The ancestor axis is used to determine the depth of the context node. Its use is
shown here within a template:

<xsl:template match="/">
  <xsl:for-each select="//page">
    Page: <xsl:value-of select="@label"/>, level: <xsl:value-of select="count(ancestor::page)"/>
  </xsl:for-each>
</xsl:template>

Notice that xsl:for-each is driven by the node set that its select attribute
returns. The following output results:

<?xml version="1.0" encoding="UTF-8"?>


Page: A, level: 0
Page: AA, level: 1
Page: AAA, level: 2
Page: AAAA, level: 3
Page: AAAB, level: 3
Page: AAAC, level: 3
Page: AAB, level: 2
Page: AAC, level: 2
Page: AB, level: 1
Page: AC, level: 1
Page: B, level: 0
Page: C, level: 0

The select="//page" attribute returns a node set containing all page elements in
document order, starting from the root element. The xsl:for-each then sets the
context node to each of these page elements in turn. Without the ability of
xsl:for-each to set the context node, the expressions within the xsl:value-of
elements would not work.

XPath summary
XPath is crucial to XSLT. It is the means by which math and string operations are
accomplished and by which XML input is searched and traversed. XSLT would be blind
and toothless without XPath.

Section 5. CSS
A fool sees not the same tree that a wise man sees.
-- William Blake

Rules, rules, and more rules

XML is inherently contextual. That is, its tags, their order, and the ways in which they
are nested tell you much about the meaning of the document. Other than their
arrangement, however, there is no information that indicates how the XML content
should be formatted visually (or within other media). The default view of an XML
document within a text editor or browser is to see all of its tags, with no special
visual treatment given to any one element, even the root. But what if you want to
show a big, fat font for certain elements, or to show other elements in a bulleted or
numbered list? What if it makes sense to always position a grouping of information in
the same place on a page, or to make that grouping's background color different
from that of its parent? To do this, you can map presentation rules to elements
through CSS.

CSS presentation rules allow you to separate information (the XML elements) from
its presentation -- such that if you want to, you can apply any one of a variety of
looks to a set of data under different conditions. Please note that CSS cannot
compare to the power of XSLT for styling or transforming XML; neither is it uniformly
interpreted across platforms or Web browsers -- especially in its support for XML
styling, as you will see. CSS is discussed here, however, because this lighter weight
approach might sometimes be all that is required to do the job. CSS is based upon a
fairly simple, non-XML syntax, and is quite easy to use. Within this tutorial, CSS
documents will be referred to simply as "stylesheets."

XML within browsers


Open an XML document with Internet Explorer, and you'll see something similar to
Figure 2.

Figure 2. XML document within Internet Explorer 6.0

When you click on the plus and minus icons, you collapse and expand elements that
are parents of other elements -- very functional -- but with an appearance that only a
computer-science major could love. Firefox is no better, but at least it offers an
excuse, as Figure 3 shows.

Figure 3. XML document within Firefox

The message above the tree in Firefox implies that if there were style information
associated with this XML document, you might like it better. Okay, fine. How do you
do this?

Within HTML documents, you can include style information in the following forms:
• Within the HTML tags that use it -- or inline in CSS-speak
• Within a <style/> tag for styles internal to the HTML document
• External in another file and referenced through an HTML <link/> tag

This is news you can use if you write XSL transforms that produce HTML from XML.
Here, however, you're just concerned with infusing the XML with some visual
formatting. The problem is that the XML probably won't be valid if you add inline
style attributes to its elements wherever you like. In addition, the DTD or XML
schema for the XML document probably doesn't include a style element, so the
internal method of applying styles won't work either. Besides, both of these
approaches modify the XML, which you don't want to do. The only approach left is to
use external styles -- via one or more files with a .css suffix -- to apply visual
treatments to the XML documents. However, you can't use the HTML <link/> tag (not
unless your schema or DTD allows it). Instead, you can associate one or more
CSS files with an XML document through a processing instruction similar to the
following:

<?xml-stylesheet type="text/css" href="catalog.css"?>

Place this at the beginning (prolog) of your XML document, just after the XML
declaration, <?xml version="1.0"?>. The stylesheet processing instruction references
a CSS file, catalog.css, which sits in the same directory as the XML file. The value
of the href attribute can also be absolute. You might use other attributes within this
processing instruction, one of which is media. Possible values for this include all,
braille, embossed, handheld, print, projection, screen, speech, tty,
and tv.
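
For instance, a prolog that selects a stylesheet by medium might look like the following sketch. The file name catalog-print.css is hypothetical, and the catalog content is abbreviated.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- catalog-print.css is a hypothetical second stylesheet -->
<?xml-stylesheet type="text/css" href="catalog.css" media="screen"?>
<?xml-stylesheet type="text/css" href="catalog-print.css" media="print"?>
<catalog>
  <!-- dvd elements go here, unchanged -->
</catalog>
```

Because the processing instructions sit in the prolog, the elements of the document itself are left untouched.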

Thus, you can reference different cascading style sheet rules from the same XML
document. This allows the XML to be formatted for the media within which it is being
viewed (or heard!). You can also test for the current media within a stylesheet (and
thus service several media with the same stylesheet) through the @media rule. The
following code provides an example:

@media print {
body { font-size: 10pt; }
}
@media screen {
body { font-size: x-small; }
}

Syntax
The basic grammar for defining a style within a cascading style sheet is: selector
{property: value;}. The selector is the XML element for which you define an
appearance based upon the value of one or more properties. You can group more
than one selector together, each separated by a comma.

You can also use the universal selector, represented by an asterisk (*); this can
apply to everything in the XML document for which you don't define an explicit rule.
The property value is expressed in one of a set of allowable units. These units
express color, length, and so on, as appropriate. Normally, whitespace does not
matter within CSS, allowing you to format these files in whatever way makes sense.
However, a numeric value and its unit cannot be separated by any whitespace (write
5px, not 5 px).

Within CSS, you have some flexibility for how you define selectors. For instance, if
you want to set styling rules for a title element that is nested within a dvd element,
you can use the following:

dvd title {font-weight: bold;}

Here, it is declared that DVD title element text values will render bold. You don't
have to use this syntax (you might have more simply declared title as the
selector), but it is a useful thing to remember in case you need to restyle an element
that appears in differing contexts (that is, an element appears as the child of one
element here, but maybe also appears as the child of a different element
elsewhere). Figure 4 shows the effect of this in Internet Explorer 6.0.

Figure 4. The first pass at CSS for XML

For readability, let's now say that each dvd element gets displayed within its own
paragraph, with a little padding around it:

dvd {
display: block;
padding: 5px;
}

I'll discuss these properties and their values shortly. Now, if you want to highlight
those dvd elements that include a genre attribute, you declare the following rule:

dvd[genre] {color: blue;}

Astute readers will notice the happy resemblance of this syntax to XPath predicates.
While Internet Explorer 6.0 does not recognize this syntax, Firefox 1.0.7 gives the
rendered output shown in Figure 5.

Figure 5. Attribute predicates within Firefox

Now, our favorite fellow movie collector wants to highlight drama movies with her
preferred color:

dvd[genre="Drama"] {color: purple;}

This rule and the more general one regarding the display of dvd elements
that have a genre attribute are both obeyed, as shown in Figure 6. Again, this
works in Firefox but not Internet Explorer 6.0.

Figure 6. More attribute predicates within Firefox

This looks better, but the price of each DVD is a little hard to read, isn't it? Let's put a
dollar sign in front of each price element value:

price:before {content: "$";}

Let's also enclose the year that the movie was made in parentheses. For this, you
can use the same pseudo element technique used for the price element:

year:before {content: " (";}


year:after {content: ")";}

Figure 7 shows the preceding rules rendered in Firefox.

Figure 7. Pseudo element rendering

Internet Explorer 6.0 doesn't recognize the :before and :after pseudo elements, so
its rendered result is not shown.

Let's add some more formatting to what we've already done:

Listing 3. Stylesheet to visually format the XML document for the DVD
collection

/*
** catalog.css
** This stylesheet visually formats our DVD collection XML doc.
*/
catalog {
    font-family: Verdana, Arial, Helvetica, Sans-serif;
    font-size: x-small;
    padding: 25px;
    background-color: #DDDDCC;
}
dvd {
    display: block;
    padding-bottom: 8px;
    border-top: 1px solid gray;
    border-left: 2px solid #666666;
    border-right: 2px solid white;
    border-bottom: 1px solid white;
    background-color: #E8E8FF;
}
dvd:first-child {
    border-top: 2px solid #666666;
}
dvd:last-child {
    border-bottom: 2px solid white;
}
dvd title {
    display: block;
    font-weight: bold;
}
dvd[genre] {
    color: blue;
}
dvd[genre="Drama"] {
    color: purple;
}
dvd[genre="Action"] {
    display: none;
}
price:before {content: "$";}
year:before {content: " (";}
year:after {content: ")";}

Here, you added a few properties to the catalog element (which is the root
element) to define the entire document. This sets overall rules for the background
color, padding, font-family, and font-size. As you'll see in the rendered
output, these font properties are inherited by all child elements of the catalog
element. Since catalog is the root element, these rules apply to the entire
document. The nested elements can have their own font property rules, which
override those of the parent.

Next, you added some rules to the dvd elements to make them stand out from the
background (the catalog element). You used two different ways to describe the
colors: name and hexadecimal -- I'll briefly discuss these and other ways to describe
colors in CSS later. You used the display property with a value of "block" for the
catalog and dvd elements to get them to show on their own line and to implement
padding. You also experimented with first-child and last-child pseudo
class selectors to give the first and last dvd elements an extra thick border.
(Pseudo class selectors match conditions of elements instead of element names.)
Finally, our favorite fellow movie collector requested that you hide all movies of the
action genre; this was attempted with the display: none property and value.
Figure 8 and Figure 9 show how Firefox and Internet Explorer 6.0, respectively,
handle this stylesheet.

Figure 8. The stylesheet rendered in Firefox

Figure 9. The stylesheet rendered in Internet Explorer 6.0

Notice that within Firefox, you've succeeded in hiding the DVD with a genre
attribute value of Action (Raiders of the Lost Ark) by setting its display property
to "none". In fact, Firefox nicely executes all the rules in the stylesheet. With
Internet Explorer 6.0, however, you don't succeed with much at all. This is especially
true after you applied a background-color property to the catalog element; this
is the gray blob that covers most of the content in Figure 9. As shown here, the
not-very-standards-adherent Internet Explorer 6.0 does not come close to Firefox for
CSS support for XML.

Lastly, as seen at the top of the previous stylesheet (Listing 3), you can use
comments within CSS files. These must begin with /* and end with */, just like
those used in C and Java programming.

Color
Color in CSS is typically a red, green, and blue triplet expressed in that order and in
one of three different ways:


• Hexadecimal: This method uses hexadecimal digits from 0 to F, which is
the equivalent of decimal 15. The notation for a hexadecimal
color begins with a hash mark (#) and is followed by two-digit red, green,
and blue color components concatenated, each ranging from 00 to FF
(decimal 255). For example, #FF0000 indicates bright red, #000000
indicates black, and #336699 is a dark turquoise.
• Decimal: This method uses decimal integers from 0 to 255 in a red,
green, and blue triplet notation that is used as follows: rgb(rr,gg,bb).
For example, rgb(255,0,0) indicates bright red.
• Percent: This method is similar to the decimal method, but it uses
percentages from 0 to 100 for each color component instead. For
example, rgb(100%,0%,0%) indicates bright red.
You can also express colors using the color name. Examples include red, white,
blue, and so on.

Display
As seen in Listing 3, the display property allows you to show an element in
different ways, given the following values:

• block: This causes the element to appear within a rectangular area. A
line break occurs before and after this content.
• inline: This causes the element text value to appear on the same line
as neighboring elements, with no line break at the beginning or end.
However, there might be internal line breaks if the contents are long
enough. This is the default display behavior.
• none: Not only does this force the element contents to disappear, but the
space allocated to it is removed as well; this is as opposed to the
visibility: hidden property and value, in which the allocated space
remains.
• list-item: Elements displayed in this manner are used to form an ordered or
unordered list.

Table display values (such as table, table-row, and table-cell) offer some support
for CSS alternatives to HTML tables; they can force elements to behave like HTML
table, tr, and td elements if nested appropriately. Forget it, however, if you need the
exact equivalent of HTML table rowspan or colspan functionality; non-table-styled
element nesting will have to suffice in these cases.

Length
Lengths are used for many CSS properties. Width, height, font size, border, padding,
margin, and positioning all use some length unit to describe their values. Lengths
are either absolute or relative. Absolute units, which are appropriate for print media
(but not much else), include points, picas, millimeters, centimeters, and inches.
Relative units are more commonly used for screen media and include pixels,
percentages, ems, and exes.

Pixels might at first seem like absolute units, but they are relative to the resolution of
the display device. They are typically used to size block-type elements relative to
bitmapped images, but avoid using them for font sizes. When the pixel unit is used to
describe a font size, it is treated as an absolute unit, and most browsers don't allow
users to scale the font size it describes for readability.

Percentages describe a length of something relative to some other object. Often, this
object is the maximum space available. A width property with a value of 60% will
then take up 60% of the space that it could occupy. For instance, you can replace
the rules for the previous catalog element with these rules:

catalog {
  font-family: Verdana, Arial, Helvetica, Sans-serif;
  font-size: x-small;
  background-color: #DDDDCC;
  width: 60%;
}

Keeping all other rules in the stylesheet example as they are yields the output within
Firefox, as shown in Figure 10.

Figure 10. An element occupies a percentage width relative to its parent

Here you can see that the catalog element indeed occupies 60% of the space
available, which, in this case, is the browser window -- just as the newly added
width property dictates.

Ems and exes are useful units for specifying font sizes relative to the parent font
size. The trick with using these is to be aware of the size of the comparable attribute
of the parent element.

Table 1 summarizes absolute and relative length units.

Table 1. Units of length


Absolute units

Unit  Description
in    Inch
pt    Point: equals 1/72 of an inch
pc    Pica: equals 12 points
cm    Centimeter: equals 0.394 inches
mm    Millimeter: equals 0.1 centimeters

Relative units

Unit  Description
%     Percent: relative to the length of the corresponding parent attribute
em    Em: multiplier of the font size (nominally the height of the capital letter "M") of the parent element
ex    Ex: multiplier of the height of the lowercase letter ("x") of the parent element
px    Pixel: the unit of resolution of a screen display. Considered a relative unit because different devices have differing resolutions and sizes
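
To make the units concrete, here is a minimal sketch that mixes absolute and relative lengths. The catalog and dvd selectors follow the earlier examples; title is a hypothetical child element, and the specific values are arbitrary.

```css
catalog {
  font-size: 12pt;   /* absolute: 12 points, or 1/6 of an inch */
  width: 60%;        /* relative: 60% of the available width */
}
dvd {
  margin-bottom: 1em;                /* relative: one times the font size of dvd itself */
  border-bottom: 1px solid #999999;  /* relative to the display resolution */
}
title {
  font-size: 1.5em;  /* relative: 1.5 times the font size of the parent element */
}
```

Note that an em used in the font-size property refers to the parent's font size, whereas an em used in other properties, such as margin, refers to the element's own font size.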

Font
The font properties govern how a typeface is displayed. Available font properties
include:
• font-family: A list of comma-delimited names of font families, in order
of preference. Example values include Arial, Verdana, Sans-serif,
"Times New Roman", and Serif. Font families composed of more than
one word should be wrapped in quotes as shown. The last family within
the list should always be a generic family, such as Sans-serif or Serif,
depending on what is needed.
• font-size: Can be either a relative or absolute length unit. In addition to
the absolute and relative units listed in Table 1, the values xx-small,
x-small, small, medium, large, x-large, and xx-large are
used. Although these values are sometimes referred to as absolute sizes,
browsers scale them to user preferences.
• font-style: Typically normal (the default) or italic.
• font-weight: Controls whether a font is displayed as normal
(the default) or bold. Some browsers offer scales of boldness with
bolder and lighter.
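
Taken together, the font properties might be applied as in this sketch. The title selector is a hypothetical element name, and the values are chosen only to show each property in use.

```css
/* Hypothetical title element styled with all four font properties. */
title {
  font-family: "Times New Roman", Serif;  /* multi-word family in quotes, generic family last */
  font-size: small;                       /* keyword size, scaled to user preferences */
  font-style: italic;
  font-weight: bold;
}
```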

Text
To control attributes of typefaces not handled by the font properties, you can use
the text properties. These include:
• color: Describe the value as noted earlier in Color.
• background-color: Sets the color of the area immediately surrounding
the text. You can enlarge this area by the use of the padding:
properties, discussed in Padding.
• text-align: Accepts the values left (the default), right, center, or
justify.
• text-decoration: Typically takes the values none (the default) or
underline. Other possible values include overline or
line-through. (There is also rumor of a blink value, though common
decency dictates that it not be discussed here.)
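
A short sketch of the text properties follows. The title selector is again a hypothetical element name; display: block is added on the assumption that the element would otherwise be inline, because text-align affects the content of block elements.

```css
/* Hypothetical title element: a centered, underlined heading. */
title {
  display: block;             /* text-align applies to block content */
  color: #333366;
  background-color: #EEEEEE;
  padding: 2px;               /* enlarges the background area around the text */
  text-align: center;
  text-decoration: underline;
}
```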

Background
You can apply background treatments to various inline, block, and table elements
through the background-color property or by use of background images. The
background-color property values conform to the previous color description. The
use of images, however, can be a little more involved. Properties used to describe a
background image include the following:
• background-image: Requires the syntax url(' image URI ') to
locate the image.
• background-position: Indicates the orientation of the image relative
to the element. The syntax for this includes two of the following: left or
right and top or bottom.
• background-repeat: Controls whether the background image is tiled,
and if so, whether it's tiled vertically or horizontally. The syntax for this is
repeat (the default) for tiling in both directions, no-repeat for no tiling,
repeat-x for a horizontal tiling, or repeat-y for a vertically tiled image.


The following rules provide an example of this syntax:

background-image: url("../images/tubeTile.gif");
background-position: left top;
background-repeat: repeat-x;

Padding
Padding properties work for inline and block elements. You can specify either a
single padding: property, or you can specify one or more of the four directions:
padding-top:, padding-left:, padding-right:, and padding-bottom:.
You can use the units described earlier in Length. For a single padding: property,
you can still uniquely specify each of the four directions by putting the values in a
particular order, given the number of values. For instance:

• If one length is given, the value applies to all four sides.
• If two lengths are given, the first value describes the top and bottom, and
the second value describes the left and right.
• If three lengths are given, the values describe the top, the left and right,
and the bottom, in that order.
• If four lengths are given, the values describe the top, the right, the bottom,
and the left, in that order.
You can use any combination of length units, but not negative numbers. This
property is not inherited from the parent element.
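
The ordering rules above can be sketched as follows, with each shorthand annotated with the sides it sets. The selectors are hypothetical element names and the lengths are arbitrary.

```css
/* Hypothetical selectors; each comment spells out what the shorthand expands to. */
title { padding: 4px; }              /* top, right, bottom, and left: 4px */
actor { padding: 4px 1em; }          /* top and bottom: 4px; left and right: 1em */
genre { padding: 4px 1em 8px; }      /* top: 4px; left and right: 1em; bottom: 8px */
price { padding: 4px 1em 8px 2em; }  /* top: 4px; right: 1em; bottom: 8px; left: 2em */
```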

Borders
Similar to the padding: properties, you can describe border: either for all sides of
an element at once or for one or more of the four directions: border-top:,
border-left:, border-right:, and border-bottom:. The value syntax is:
length border-style color. Allowable length units were discussed earlier in Length.
Border-style can be one of these values: dashed, double, dotted, groove, inset,
outset, or solid. The following rules provide an example of this syntax:

border-left: 1px solid #EEEEEE;
border-right: 1px solid #999999;

Position
With the CSS position properties, you can tell elements exactly where to display
within a document. For instance, if you want to specify that the first dvd element
within the catalog appear at a fixed point relative to all other elements, you might use
the following code:

dvd:first-child {
  border-top: 2px solid #666666;
  position: absolute;
  left: 80px;
  top: 150px;
}

Notice the position property and its value of absolute. Once this is specified,
you can then describe where you want to absolutely position the element from the
top left of the page. In the previous rule, we declared 80 pixels from the left edge of
the layout and 150 pixels down from the top. Figure 11 shows what this looks like
when rendered in Firefox.

Figure 11. Absolute positioning in Firefox

In Figure 11, Terminator 2 is the first dvd element in document order, so it received
the absolute positioning rule that was declared for the first dvd element. In addition
to the left and top properties, you can declare right and bottom; these place
the element relative to the right edge or the bottom edge of the layout, respectively.
You can use any combination of these properties, so long as it makes sense for the
layout. Another value of the position property is relative, which offsets the normal
rendering position of an element by the top, left, right, and bottom properties.
For instance, you might use this code to place the first dvd element above and to
the left of its normal position by 10 pixels each:

dvd:first-child {
  border-top: 2px solid #666666;
  position: relative;
  left: -10px;
  top: -10px;
}

Figure 12 shows what this looks like rendered in Firefox.

Figure 12. Relative positioning in Firefox

It is possible to overlay the display of one element upon another. You can control
which element is rendered on top with the z-index: property. Integer values are
used for this, with the element having the greater value placed above the other
elements.
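
As a minimal sketch of stacking, the following rules place the first dvd element above its siblings where they overlap. The z-index values and the white background are arbitrary choices for illustration; note that z-index only takes effect on elements that are positioned.

```css
/* All dvd elements are positioned so that z-index applies to them. */
dvd {
  position: relative;
  z-index: 1;
}

/* The first dvd has the greater z-index, so it is drawn on top. */
dvd:first-child {
  position: absolute;
  left: 80px;
  top: 150px;
  z-index: 2;
  background-color: #FFFFFF;  /* opaque, so the overlap is visible */
}
```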

Another absolute way to position elements is through fixed positioning. Whereas
position: absolute sets an element relative to the layout of the page, so that it
scrolls with the rest of the content,
position: fixed sets an element's position relative to the browser
window. This is hard to show without a scrollable example, but you can try it for
yourself; the syntax is simply position: fixed. The top, left, right, and
bottom properties allow you to set the position where the element will remain, even
as the page is scrolled.
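
For example, a rule along these lines would pin the first dvd element to the top right corner of the browser window. This is only a sketch; the offsets and width are arbitrary.

```css
/* Pins the first dvd to the top right corner of the window. */
dvd:first-child {
  position: fixed;
  top: 0;
  right: 0;
  width: 200px;
  background-color: #DDDDCC;
}
```

To see the effect, load a document long enough to scroll; the element should stay in place while the rest of the catalog moves.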

CSS summary
Although it is not the generally preferred way to style XML documents, and is not
capable of the computation or transformation that XSLT can accomplish, CSS at
least offers a lightweight way to format XML elements visually. Because support for
CSS (especially for XML) is inconsistent across browsers, use CSS with caution, at
least for screen media. As shown here, however, CSS support for XML is strong
within Mozilla Firefox. Articles written about CSS for print media have
indicated great success; this is especially useful if you can avoid the greater
complexity of XSLT.

Section 6. Conclusion

Summary
This concludes the discussion of the XML transformation topics XSLT, XPath, and
CSS. It is hoped that this information provides a solid introduction for those who are
new to the subject. An honest effort to understand the material presented here and
examine the references that follow should be more than adequate to prepare you for
the XML transformations portion of Exam 142, XML and Related Technologies.


Resources
Learn
• XML in a Nutshell, 3rd Edition (Elliotte Rusty Harold and W. Scott Means,
O'Reilly Media, September 2004, ISBN: 0596007647): In this comprehensive
XML reference, find excellent chapters on XSLT, XPath, and CSS for XML. Also
find great comparisons between versions 1.0 and 2.0 of XSLT and XPath.
• XSLT Cookbook, 2nd Edition (Sal Mangano, O'Reilly, December 2005, ISBN:
0596009747): Dig into detailed examples of XSLT and XPath usage, plus some
more involved applications of these technologies.
• Universal Turing machine: Find out how XSLT is Turing complete in this
Wikipedia article.
• Investigating XSLT: The XML transformation language (LindaMay Patterson,
developerWorks, August 2001): Read about basic syntax and XSL
programming techniques in this early article on XSLT and XPath.
• Practical data binding: XPath as a data binding tool, Part 1 (Brett McLaughlin,
developerWorks, November 2005): Explore a good primer on XPath, and add to
what you saw in this tutorial.
• XSLT Transformation: Visit W3Schools for more on basic XSLT syntax and
grammar.
• CSS tutorial: Learn to apply style and layout to multiple Web pages at once
from W3Schools. Most of what you do for HTML also applies to XML.
• CSS Length Units Reference (MSDN): Review supported length units for
Cascading Style Sheets (CSS), text, layout, and positioning properties.
• XML Transformations with CSS and DOM: Visit Apple's Developer Connection
for Mozilla support for XML and CSS.
• XPath string functions: Explore string functions for XPath in this W3C
recommendation.
• The CSS @media rule: Learn how to specify target media types in this W3C
recommendation.
• Document order: Review the definition of document order in the W3C XPath
recommendation.
• IBM XML 1.1 certification: Become an IBM Certified Developer in XML 1.1 and
related technologies.
• XML: See developerWorks XML Zone for a wide range of technical articles and
tips, tutorials, standards, and IBM Redbooks.
• developerWorks technical events and webcasts: Stay current with technology in
these sessions.
Get products and technologies

• Altova XMLSpy 2006 Home Edition: Download a free entry-level XML editor and
development tool for designing and editing XML-based applications.
• Microsoft Internet Explorer 7: Download Internet Explorer 7, and recommended
updates.
• Mozilla Firefox 1.5: Download Firefox 1.5 with its support of open Web
standards.
• IBM product evaluation versions: Download and try application development
tools and middleware products from DB2®, Lotus®, Rational®, Tivoli®, and
WebSphere®.
Discuss
• XML zone discussion forums: Participate in any of several XML-centered
forums.
• developerWorks blogs: Get involved in the developerWorks community.

About the author


Brian L Brinker
Brian L. Brinker learned about XML transformations by helping to design and build an
XML-based prototyping application, the IBM® WebSphere® Portal Experience
Modeler. Brian works at the IBM Centers for Solution Innovation (Atlanta, Georgia).
He has degrees in physics, computer science, and information systems
management, and he lives with his wife and son in the beautiful Appalachian foothills
of Jasper, Georgia.

Trademarks
IBM, DB2, Lotus, Rational, Tivoli, and WebSphere are trademarks of IBM
Corporation in the United States, other countries, or both.
Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the
United States, other countries, or both.
Linux is a trademark of Linus Torvalds in the United States, other countries, or both.
Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft
Corporation in the United States, other countries, or both.
