Chapter 3-The Client Tier

The document provides an overview of XML (Extensible Markup Language) including its structure, elements, attributes, and usage. It discusses XML's flexibility in describing and exchanging structured data compared to HTML which focuses on display. It also covers XML syntax rules, common components like tags and elements, and additional features such as namespaces, DTDs, and external vs internal DTDs.

Uploaded by

dijon weiland

XML
Extensible Markup Language (XML) is used to describe data. The XML standard is a flexible way
to create information formats and electronically share structured data via the public Internet,
as well as via corporate networks. Structured information contains both content (words,
pictures, etc.) and some indication of what role that content plays (for example, content in a
section heading has a different meaning from content in a footnote, which means something
different than content in a figure caption or content in a database table, etc.). Almost all
documents have some structure.
A markup language is a mechanism to identify structures in a document. The XML specification
defines a standard way to add markup to documents. The basic building block of an XML
document is an element, defined by tags. An element has a beginning and an ending tag. All
elements in an XML document are contained in an outermost element known as the root
element. XML can also support nested elements, or elements within elements. This ability
allows XML to support hierarchical structures. Element names describe the content of the
element, and the structure describes the relationship between the elements.
For example
<?xml version="1.0" standalone="yes"?>
<conversation>
<greeting>Hello, world!</greeting>
<response>Thank you world</response>
</conversation>
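
As a minimal sketch of how a program sees this document, the following Python uses the standard library's xml.etree.ElementTree parser to read the example and list the root element and its children. The helper name element_texts is hypothetical and introduced only for illustration.

```python
import xml.etree.ElementTree as ET

# The conversation document from the example above.
DOC = """<?xml version="1.0" standalone="yes"?>
<conversation>
<greeting>Hello, world!</greeting>
<response>Thank you world</response>
</conversation>"""


def element_texts(xml_text):
    """Return the root tag and a list of (child tag, child text) pairs."""
    root = ET.fromstring(xml_text)
    return root.tag, [(child.tag, child.text) for child in root]


if __name__ == "__main__":
    tag, children = element_texts(DOC)
    print(tag)
    for name, text in children:
        print(" ", name, "=", text)
```

The root element is conversation, and greeting and response are its nested child elements.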

XML Usage
A short list of XML uses:
• XML can work behind the scenes to simplify the creation of HTML documents for large web
sites.
• XML can be used to exchange information between organizations and systems.
• XML can be used for offloading and reloading of databases.
• XML can be used to store and arrange data in a way that suits your data handling
needs.
• XML can easily be merged with style sheets to create almost any desired output.
• Virtually any type of data can be expressed as an XML document.

The Difference between XML and HTML


XML is not a replacement for HTML. XML and HTML were designed with different goals:
• XML was designed to describe data, with a focus on what the data is.
• HTML was designed to display data, with a focus on how the data looks.
HTML is about displaying information, while XML is about carrying information.
XML Declaration
An XML document can optionally begin with an XML declaration. It is written as follows:
<?xml version="version_number" encoding="encoding_declaration"?>
Where version is the XML version and encoding specifies the character encoding used in the document.
Syntax Rules for XML Declaration
• The XML declaration is case sensitive and must begin with "<?xml", where "xml" is written in lower case.
• If the document contains an XML declaration, it must be the first statement of the XML
document.
• When a document is delivered over HTTP, the encoding given by the protocol can override the value of
encoding that you put in the XML declaration.

Tags and Elements


An XML file is structured by several XML elements, also loosely called XML nodes or XML tags. The names of
XML elements are enclosed in angle brackets < > as shown below:
<element>
Syntax Rules for Tags and Elements
Element Syntax: Each XML element must be closed, either with a matching end tag as shown below or,
for an element with no content, with a self-closing tag such as <element/>:
<element>....</element>
Nesting of Elements: An XML element can contain multiple XML elements as its children, but the
child elements must not overlap; that is, an end tag must have the same name as
the most recent unmatched start tag.
Root Element: An XML document can have only one root element. For example, the following is not a
correct XML document, because both the x and y elements occur at the top level without a root
element:
<x>...</x>
<y>...</y>
Case Sensitivity: The names of XML elements are case-sensitive. That means the names in the start and
the end tags must be in exactly the same case. For example, <contact-info> and <Contact-Info> are
different names, so <contact-info>...</Contact-Info> is not well-formed.

XML Attribute Values Must be quoted: XML elements can have attributes in name/value pairs just like
in HTML. In XML, the attribute values must always be quoted.
<note date="12/11/2007">
<to>Tove</to>
<from>Jani</from>
</note>
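
The syntax rules above (proper nesting, a single root element, case-sensitive names, and quoted attribute values) are what make a document well-formed. As a rough sketch, the following Python asks the standard library's xml.etree.ElementTree parser whether a few small documents are well-formed. The helper is_well_formed and the sample strings are hypothetical, written only to illustrate the rules.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical samples, one per rule discussed above.
SAMPLES = {
    "nested correctly": "<a><b>text</b></a>",
    "overlapping tags": "<a><b>text</a></b>",
    "two top-level elements": "<x>1</x><y>2</y>",
    "case mismatch": "<Note>text</note>",
    "unquoted attribute": "<note date=12>text</note>",
}

if __name__ == "__main__":
    for label, text in SAMPLES.items():
        print(label, "->", is_well_formed(text))
```

Only the first sample is accepted; each of the others breaks exactly one of the rules.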

XML Attributes
An attribute specifies a single property for the element, using a name/value pair. An
XML element can have one or more attributes. For example:
<item dept="WMN" num="557" quantity="1" color="navy"/>
<a href="http://www.ntc.net/">NTC</a>
Syntax Rules for XML Attributes
Attribute names in XML (unlike HTML) are case sensitive. That is, HREF and href are considered
two different XML attributes.
The same attribute cannot appear more than once in a single start tag.
XML Comments
XML comments are similar to HTML comments. They are added as notes that explain the
purpose of a piece of XML.
Comments can be used to include related links, information, and terms. They are visible in the
source of the document but are not part of its data. Comments may appear almost anywhere in an XML
document, but not before the XML declaration and not inside a tag.
<?xml version = "1.0" encoding = "UTF-8" ?>
<!--Students grades are uploaded by months-->
<class_list>
<student>
<name>Tanmay</name>
<grade>A</grade>
</student>
</class_list>

Tree structure
The tree structure is often referred to as XML Tree and plays an important role to describe any
XML document easily.
The tree structure contains root (parent) elements, child elements and so on. By using tree
structure, you can get to know all succeeding branches and sub-branches starting from the
root. Parsing starts at the root, then moves down the first branch to an element, takes the
first branch from there, and so on to the leaf nodes.
<?xml version = "1.0"?>
<Company>
<Employee>
<FirstName>Ram</FirstName>
<LastName>Nepal</LastName>
<ContactNo>1234567890</ContactNo>
<Email>[email protected]</Email>
<Address>
<City>Kathmandu</City>
<State>2</State>
<Zip>44600</Zip>
</Address>
</Employee>
</Company>
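
The traversal described above (start at the root, follow each branch down to the leaves) can be sketched as a short recursive function. This is an illustration only: it uses Python's standard xml.etree.ElementTree, a shortened copy of the Company document, and a hypothetical helper named outline.

```python
import xml.etree.ElementTree as ET

# A shortened version of the Company document above.
DOC = """<Company>
  <Employee>
    <FirstName>Ram</FirstName>
    <Address>
      <City>Kathmandu</City>
    </Address>
  </Employee>
</Company>"""


def outline(element, depth=0):
    """Return one indented line per element, visiting the tree depth-first."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(outline(ET.fromstring(DOC))))
```

Each level of indentation in the output corresponds to one level of nesting in the XML tree.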

XML Namespaces
A namespace is a set of unique names. It is a mechanism by which element and
attribute names can be assigned to a group. A namespace is identified by a URI (Uniform
Resource Identifier).
Namespace Declaration
A namespace is declared using reserved attributes. Such an attribute name must either
be xmlns or begin with xmlns:, as shown below:
<element xmlns:name = "URL">

Syntax

• The namespace declaration starts with the keyword xmlns.
• The word name is the namespace prefix.
• The URL is the namespace identifier.

<?xml version = "1.0" encoding = "UTF-8"?>


<cont:contact xmlns:cont = "www.example.com/profile">
<cont:name>Ram Nepal</cont:name>
<cont:company>Incode</cont:company>
<cont:phone>01-123456</cont:phone>
</cont:contact>
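
To a namespace-aware parser, the prefix is only shorthand; the element is identified by the namespace URI plus the local name. The sketch below shows this with Python's standard xml.etree.ElementTree, which reports names in the form {URI}localname. The prefix c in the lookup map is an arbitrary choice made for this illustration and does not have to match the prefix used in the document.

```python
import xml.etree.ElementTree as ET

# The contact document from the example above.
DOC = """<cont:contact xmlns:cont="www.example.com/profile">
<cont:name>Ram Nepal</cont:name>
<cont:company>Incode</cont:company>
<cont:phone>01-123456</cont:phone>
</cont:contact>"""

# Lookup map for searches; "c" is an arbitrary prefix chosen here.
NS = {"c": "www.example.com/profile"}

root = ET.fromstring(DOC)

if __name__ == "__main__":
    print(root.tag)                     # the URI-qualified element name
    print(root.find("c:name", NS).text)
```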

DTD
The XML Document Type Definition, commonly known as DTD, is a way to describe an XML
language precisely. A DTD allows us to create rules for the elements within our XML
documents. An XML DTD can either be specified inside the document, or it can be kept in a
separate document and then linked to it.
If you create your own XML elements, attributes, and/or entities, you should also create a DTD that describes them.

Syntax

Basic syntax of a DTD is as follows –


<!DOCTYPE element DTD identifier
[
declaration1
declaration2
........
]>
In the above syntax,
• The DTD starts with the <!DOCTYPE delimiter.
• The element names the root element of the document; the parser checks the document starting from this root element.
• The DTD identifier is an identifier for the document type definition, which may be the path
to a file on the system or the URL of a file on the internet. If the DTD points to an external
path, it is called the External Subset.
• The square brackets [ ] enclose an optional list of markup declarations called the Internal
Subset.

Why Use a DTD?


With a DTD, independent groups of people can agree on a standard DTD for interchanging data.
An application can use a DTD to verify that XML data is valid.

Internal DTD

A DTD is referred to as an internal DTD if the elements are declared within the XML file itself. When
all declarations are internal, the standalone attribute in the XML declaration can be set to yes. This
means the document does not depend on any external source.
Syntax
Following is the syntax of internal DTD −

<!DOCTYPE root-element [element-declarations]>


where root-element is the name of root element and element-declarations is where you
declare the elements.

<?xml version = "1.0" encoding = "UTF-8" standalone = "yes" ?>


<!DOCTYPE address [
<!ELEMENT address (name,company,phone)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT company (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
]>

<address>
<name>Ram Nepal</name>
<company>Incode</company>
<phone>0123456</phone>
</address>

External DTD
In an external DTD, elements are declared outside the XML file. They are accessed by specifying a
system identifier, which may be either the path of a legal .dtd file or a valid URL. When the document
relies on an external DTD, the standalone attribute in the XML declaration should be set to no. This means
the declaration includes information from the external source.
Syntax
Following is the syntax for external DTD −

<!DOCTYPE root-element SYSTEM "file-name">


where file-name is the file with .dtd extension.

The content of the DTD file address.dtd is as shown −

<!ELEMENT address (name,company,phone)>


<!ELEMENT name (#PCDATA)>
<!ELEMENT company (#PCDATA)>
<!ELEMENT phone (#PCDATA)>

The XML file that refers to it is as follows:


<?xml version = "1.0" encoding = "UTF-8" standalone = "no" ?>
<!DOCTYPE address SYSTEM "address.dtd">
<address>
<name>Tanmay Patil</name>
<company>TutorialsPoint</company>
<phone>(011) 123-4567</phone>
</address>
Types
You can refer to an external DTD by using either system identifiers or public identifiers.

System Identifiers
A system identifier enables you to specify the location of an external file containing DTD
declarations. Syntax is as follows −

<!DOCTYPE name SYSTEM "address.dtd" [...]>


As you can see, it contains keyword SYSTEM and a URI reference pointing to the location of the
document.

Public Identifiers
Public identifiers provide a mechanism to locate DTD resources and are written as follows −

<!DOCTYPE name PUBLIC "-//Beginning XML//DTD Address Example//EN" "address.dtd">


As you can see, it begins with the keyword PUBLIC, followed by a specialized identifier. In XML, a
system identifier (here address.dtd) must follow the public identifier, so that a parser without a
catalog can still locate the DTD. Public identifiers are used to identify an entry in a catalog. They
can follow any format; however, a commonly used format is called Formal Public Identifiers, or FPIs.

Combined DTD
We can use both an internal DTD and an external one at the same time. This could be useful if
we need to adhere to a common DTD, but also need to add our own definitions locally.

The following example uses both an external DTD and an internal one for the same XML document. The external
DTD resides in tutorials.dtd and is referenced first in the DOCTYPE declaration. The internal DTD
follows the external one but still resides within the DOCTYPE declaration:

<?xml version="1.0" standalone="no"?>


<!DOCTYPE tutorials SYSTEM "tutorials.dtd" [
<!ELEMENT tutorial (summary)>
<!ELEMENT summary (#PCDATA)>
]>
<tutorials>
<tutorial>
<name>XML Tutorial</name>
<url>https://www.xyz.com/xml/tutorial</url>
<summary>Best XML!</summary>
</tutorial>
<tutorial>
<name>HTML Tutorial</name>
<url>https://www.xyz.com/html/tutorial</url>
<summary>Best HTML!</summary>
</tutorial>
</tutorials>
This example introduces a new element called summary, which must be present under the tutorial element.
Because this element hasn't been defined in the external DTD, we need to define it internally.
Once again, we set the standalone attribute to no because we rely on an external
resource.

DTD Formal Public Identifier (FPI)


When declaring a DTD available for public use, you need to use the PUBLIC keyword within your
DOCTYPE declaration.

When you use the PUBLIC keyword, you also need to use an FPI (which stands for Formal Public
Identifier).

FPI Syntax
An FPI is made up of 4 fields, each separated by double forward slashes (//):

field 1//field 2//field 3//field 4


FPI Example
Here's a real life example of an FPI. In this case, the DTD was created by the W3C for XHTML:

-//W3C//DTD XHTML 1.0 Transitional//EN


FPI Fields
An FPI must contain the following fields:

Field          Example                       Description

Separator      //                            Separates the different fields of the FPI.

First field    -                             Indicates whether the DTD is connected to a formal
                                             standard or not. If the DTD hasn't been approved (for
                                             example, you've defined the DTD yourself), use a
                                             hyphen (-). If the DTD has been approved by a
                                             non-standards body, use a plus sign (+). If the DTD
                                             has been approved by a formal standards body, this
                                             field should be a reference to the standard itself.

Second field   W3C                           Holds the name of the group (or person) responsible
                                             for the DTD. The above example is maintained by the
                                             W3C, so "W3C" appears in the second field.

Third field    DTD XHTML 1.0 Transitional    Indicates the type of document that is being
                                             described. This usually contains some form of unique
                                             identifier (such as a version number).

Fourth field   EN                            Specifies the language that the DTD uses, given as the
                                             two-letter identifier for the language (e.g. for
                                             English, use "EN").
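
Because the four fields are separated by //, an FPI can be taken apart with a simple string split. The following Python is a minimal sketch of that idea; the names FPI and parse_fpi, and the field names registration, owner, description and language, are labels chosen for this illustration rather than official terms.

```python
from typing import NamedTuple


class FPI(NamedTuple):
    """The four fields of a Formal Public Identifier (field names are illustrative)."""
    registration: str   # first field: "-", "+", or a reference to a standard
    owner: str          # second field: group or person responsible
    description: str    # third field: type of document described
    language: str       # fourth field: two-letter language identifier


def parse_fpi(fpi: str) -> FPI:
    """Split an FPI on the // separator and check that four fields result."""
    parts = fpi.split("//")
    if len(parts) != 4:
        raise ValueError("expected 4 fields separated by //, got %d" % len(parts))
    return FPI(*parts)


if __name__ == "__main__":
    print(parse_fpi("-//W3C//DTD XHTML 1.0 Transitional//EN"))
```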
FPI DOCTYPE Syntax
When using a public DTD, place the FPI between the PUBLIC keyword and the URI/URL.

<!DOCTYPE rootname PUBLIC FPI URL>


FPI DOCTYPE Example
You can see an example of an FPI in the following DOCTYPE declaration (the FPI is the first quoted string):

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

DTD Elements

Creating a DTD is quite straightforward. It's really just a matter of defining your elements,
attributes, and/or entities.

To define an element in DTD, you use the <!ELEMENT> declaration. The actual contents of
your <!ELEMENT> declaration will depend on the syntax rules you need to apply to your
element.

Basic Syntax
The <!ELEMENT> declaration has the following syntax:
<!ELEMENT element_name content_model>

Here, element_name is the name of the element you're defining. The content model could
indicate a specific rule, data, or another element.

• If it specifies a rule, it will be set to either ANY or EMPTY.
• If it specifies data or another element, the data type/element name needs to be
surrounded by parentheses (i.e. (tutorial) or (#PCDATA)).

The following examples show you how to use this syntax for defining your elements.
Plain Text
If an element should contain plain text, you define the element using #PCDATA. PCDATA stands
for Parsed Character Data and is the way you specify non-markup text in your DTDs.

Using this example - <name>XML Tutorial</name> — the XML Tutorial part is the PCDATA. The
other part consists of markup.

Syntax:
<!ELEMENT element_name (#PCDATA)>

Example:
<!ELEMENT name (#PCDATA)>
The above line in your DTD allows the name element to contain non-markup data in your XML
document:
<name>XML Tutorial</name>

Unrestricted Elements
If it doesn't matter what your element contains, you can create an element using the
content_model of ANY. Note that doing this removes all checking of that element's content, so you
should avoid using it if possible. You're better off defining a specific content model.
Syntax:
<!ELEMENT element_name ANY>

Example:
<!ELEMENT tutorials ANY>

Empty Elements
You might remember that an empty element is one that has no content, and is usually written as a
single self-closing tag. For example, in XHTML, the <br /> and <img /> tags are empty elements.
Here's how you define an empty element:
Syntax:
<!ELEMENT element_name EMPTY>

Example:
<!ELEMENT header EMPTY>

The above line in your DTD defines the following empty element for your XML document:
<header />
Child Elements

You can specify that an element must contain another element, by providing the name of the
element it must contain. Here's how you do that:

Syntax:

<!ELEMENT element_name (child_element_name)>

Example:

<!ELEMENT tutorials (tutorial)>

The above line in your DTD requires the tutorials element to contain exactly one instance of the tutorial
element in your XML document:

<tutorials>

<tutorial></tutorial>

</tutorials>

Multiple Child Elements (Sequences)

You can also provide a comma separated list of elements if it needs to contain more than one
element. This is referred to as a sequence. The XML document must contain the tags in the
same order that they're specified in the sequence.

Syntax:

<!ELEMENT element_name (child_element_name, child_element_name,...)>

Example:

<!ELEMENT tutorial (name, url)>

The above line in your DTD requires the tutorial element to contain one instance of the name
element followed by one instance of the url element in your XML document:

<tutorials>
<tutorial>
<name></name>
<url></url>
</tutorial>
</tutorials>

DTD Element Operators

This section provides an overview of DTD element operators.

A content model such as (tutorial) is fine if there only needs to be one instance of tutorial, but
what if we don't want a limit? What if the tutorials element should be able to contain any number of
tutorial instances? Fortunately, we can do that using DTD operators.

Here's a list of operators/syntax rules we can use when defining child elements:

Operator   Syntax         Description

+          a+             One or more occurrences of a

*          a*             Zero or more occurrences of a

?          a?             Either a or nothing

,          a, b           a followed by b

|          a | b          a or b

()         (expression)   An expression surrounded by parentheses is treated as a
                          unit and could have any one of the following suffixes:
                          ?, *, or +.

Zero or More

To allow zero or more of the same child element, use an asterisk (*):

Syntax:

<!ELEMENT element_name (child_element_name*)>

Example:

<!ELEMENT tutorials (tutorial*)>


One or More

To allow one or more of the same child element, use a plus sign (+):

Syntax:

<!ELEMENT element_name (child_element_name+)>

Example:

<!ELEMENT tutorials (tutorial+)>

Zero or One

To allow either zero or one of the same child element, use a question mark (?):

Syntax:

<!ELEMENT element_name (child_element_name?)>

Example:

<!ELEMENT tutorials (tutorial?)>

Choices

You can define a choice between one or another element by using the pipe (|) operator. For
example, if the tutorial element requires a child called either name, title, or subject (but only
one of these), you can do the following:

Syntax:

<!ELEMENT element_name (choice_1 | choice_2 | choice_3)>

Example:

<!ELEMENT tutorial (name | title | subject)>

Mixed Content

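You can use the pipe (|) operator to specify that an element can contain both PCDATA and
other elements. In a mixed content model, #PCDATA must come first and the closing parenthesis
must be followed by an asterisk (*):
Syntax:

<!ELEMENT element_name (#PCDATA | child_element_name)*>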

Example:

<!ELEMENT tutorial (#PCDATA | name | title | subject)*>

DTD Operators with Sequences

You can apply any of the DTD operators to a sequence:

Syntax:

<!ELEMENT element_name (child_element_name dtd_operator, child_element_name dtd_operator,...)>

Example:

<!ELEMENT tutorial (name+, url?)>

The above example allows the tutorial element to contain one or more instances of the name
element, followed by zero or one instance of the url element.

Subsequences

You can use parentheses to create a subsequence (i.e. a sequence within a sequence). This
enables you to apply DTD operators to a subsequence:

Syntax:

<!ELEMENT element_name ((sequence) dtd_operator, sequence)>

Example:

<!ELEMENT tutorial ((author, rating?)+, name, url*)>

The above example specifies that the tutorial element can contain one or more author
elements, with each occurrence having an optional rating element, followed by one name element
and zero or more url elements.
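
A content model behaves much like a regular expression over the names of an element's children. The following Python is a sketch of that analogy, not a DTD validator: it translates a content model into a regular expression and tests a list of child names against it. The helpers model_to_regex and matches are hypothetical, only element-name models are handled (no #PCDATA, ANY or EMPTY), and names are assumed to be plain ASCII.

```python
import re

# Tokens of a content model: element names and the operators ( ) | , ? * +
TOKEN = re.compile(r"[A-Za-z_][\w.\-]*|[()|,?*+]")


def model_to_regex(model):
    """Translate a DTD content model into a compiled regular expression."""
    parts = []
    for token in TOKEN.findall(model):
        if token == "(":
            parts.append("(?:")           # group without capturing
        elif token == ",":
            continue                      # a sequence is plain concatenation
        elif token in ")|?*+":
            parts.append(token)           # same meaning in both notations
        else:
            # Each child name is matched together with a trailing space.
            parts.append("(?:" + re.escape(token) + " )")
    return re.compile("".join(parts))


def matches(model, child_names):
    """Return True if the sequence of child names satisfies the model."""
    text = "".join(name + " " for name in child_names)
    return model_to_regex(model).fullmatch(text) is not None


if __name__ == "__main__":
    model = "((author, rating?)+, name, url*)"
    print(matches(model, ["author", "rating", "author", "name"]))
    print(matches(model, ["name"]))
```

The first call satisfies the model (two author groups, one with a rating, then name); the second does not, because at least one author is required.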
DTD Attributes
Just as we need to define all elements in our DTD, we also need to define any attributes they
use. We use the <!ATTLIST> declaration to define attributes in a DTD.

Syntax

You use a single <!ATTLIST> declaration to declare all attributes for a given element. In other
words, for each element (that contains attributes), you only need one <!ATTLIST> declaration.

The <!ATTLIST> declaration has the following syntax:

<!ATTLIST element_name
attribute_name TYPE DEFAULT_VALUE
attribute_name TYPE DEFAULT_VALUE
attribute_name TYPE DEFAULT_VALUE
...>
Here, element_name refers to the element that you're defining attributes for, attribute_name
is the name of the attribute that you're declaring, TYPE is the attribute type, and
DEFAULT_VALUE is its default value.

Example
<!ATTLIST tutorial
published CDATA "No">
Here, we are defining an attribute called published for the tutorial element. The attribute's type
is CDATA and its default value is No.

DTD Attribute Default Values

We defined an attribute above using a default value of No. This section looks at the various
options for defining default values for your attributes.

Default Values
The DEFAULT_VALUE field of an attribute declaration can be set to one of the following values:

Value            Description

"value"          A simple text value, enclosed in quotes.

#IMPLIED         Specifies that there is no default value for this
                 attribute, and that the attribute is optional.

#REQUIRED        There is no default value for this attribute, but
                 a value must be assigned.

#FIXED "value"   The #FIXED part specifies that the value must
                 be the value provided. The value part
                 represents the actual value.
Examples of these default values follow.

value

You can provide an actual value to be the default value by placing it in quotes.

Syntax:

<!ATTLIST element_name attribute_name CDATA "default_value">

Example:

<!ATTLIST tutorial published CDATA "No">

#REQUIRED

The #REQUIRED keyword specifies that you won't be providing a default value, but that you
require that anyone using this DTD does provide one.

Syntax:

<!ATTLIST element_name attribute_name CDATA #REQUIRED>

Example:

<!ATTLIST tutorial published CDATA #REQUIRED>

#IMPLIED

The #IMPLIED keyword specifies that you won't be providing a default value, and that the
attribute is optional for users of this DTD.

Syntax:

<!ATTLIST element_name attribute_name CDATA #IMPLIED>

Example:

<!ATTLIST tutorial rating CDATA #IMPLIED>

#FIXED

The #FIXED keyword specifies that you will provide a value, and that it is the only value that can be
used by users of this DTD.
Syntax:

<!ATTLIST element_name attribute_name CDATA #FIXED "value">

Example:

<!ATTLIST tutorial language CDATA #FIXED "EN">
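
The four kinds of default declaration differ in what happens when the attribute is present or absent in the document. The following Python sketches those rules as a plain function; it does not read a DTD. The name effective_value is hypothetical, and the default declaration is passed in as the text that would appear in the DEFAULT_VALUE position of an <!ATTLIST> declaration.

```python
def effective_value(default_decl, given=None):
    """Return the value an attribute ends up with under a default declaration.

    default_decl is the DEFAULT_VALUE text, e.g. '"No"', '#REQUIRED',
    '#IMPLIED' or '#FIXED "EN"'. given is the value written in the
    document, or None if the attribute was left out.
    """
    if default_decl == "#REQUIRED":
        if given is None:
            raise ValueError("attribute is required but missing")
        return given
    if default_decl == "#IMPLIED":
        return given                      # optional, and no default exists
    if default_decl.startswith("#FIXED"):
        fixed = default_decl[len("#FIXED"):].strip().strip('"')
        if given is not None and given != fixed:
            raise ValueError("attribute must have the fixed value " + fixed)
        return fixed
    default = default_decl.strip('"')     # a plain quoted default value
    return default if given is None else given


if __name__ == "__main__":
    print(effective_value('"No"'))            # attribute omitted
    print(effective_value('"No"', "Yes"))     # attribute supplied
    print(effective_value('#FIXED "EN"'))     # fixed value applies
```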

DTD Attribute Types

This section gives an overview of the DTD attribute types.

So far, all our examples for declaring attributes have used the CDATA attribute type. CDATA is probably
the most common attribute type, as it allows plain text to be used for the attribute's value. There
may, however, be cases where you need to use a different attribute type.

When setting attributes for your elements, the attribute TYPE field can be set to one of the following
values:

Type         Description

CDATA        Character data (text that doesn't contain markup).

ENTITY       The name of an entity (which must be declared in the DTD).

ENTITIES     A list of entity names, separated by whitespace.
             (All entities must be declared in the DTD.)

Enumerated   A list of values. The value of the attribute must be
             one from this list.

ID           A unique ID or name. Must be a valid XML name.

IDREF        Represents the value of an ID attribute of another element.

IDREFS       Represents multiple IDs of elements, separated by whitespace.

NMTOKEN      A valid XML name token.

NMTOKENS     A list of valid XML name tokens, separated by whitespace.

NOTATION     A notation name (which must be declared in the DTD).

CDATA

As with all attribute types, the attribute type of CDATA is placed after the attribute name and
before the default value.

Syntax:

<!ATTLIST element_name
attribute_name CDATA default_value>

Example:

<!ATTLIST mountain

country CDATA "New Zealand">

Valid XML - The following XML document would be valid, as it conforms to the above DTD:

<mountains>

<mountain country="New Zealand">

<name>Mount Cook</name>

</mountain>

<mountain country="Australia">

<name>Cradle Mountain</name>

</mountain>

</mountains>

ENTITY

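The attribute type of ENTITY is used for referring to the name of an entity you've declared in your DTD.
The entity must be an unparsed entity, that is, an external entity declared with NDATA and a notation.

Syntax:

<!ATTLIST element_name
attribute_name ENTITY default_value>

Example:

<!NOTATION JPG SYSTEM "image/jpeg">

<!ATTLIST mountain
photo ENTITY #IMPLIED>

<!ENTITY mt_cook_1 SYSTEM "mt_cook1.jpg" NDATA JPG>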


Valid XML - The following XML document would be valid, as it conforms to the above DTD:

<mountains>

<mountain photo="mt_cook_1">

<name>Mount Cook</name>

</mountain>

<mountain>

<name>Cradle Mountain</name>

</mountain>

</mountains>

Invalid XML - The following XML document would be invalid. This is because the photo attribute of the
second element contains a value that hasn't been declared as an entity:

<mountains>

<mountain photo="mt_cook_1">

<name>Mount Cook</name>

</mountain>

<mountain photo="None">

<name>Cradle Mountain</name>

</mountain>

</mountains>

ENTITIES

The attribute type of ENTITIES allows you to refer to multiple entity names, separated by a
space.

Syntax:

<!ATTLIST element_name
attribute_name ENTITIES default_value>

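Example:

<!-- ENTITIES values must name unparsed entities, hence NDATA and a notation -->
<!NOTATION JPG SYSTEM "image/jpeg">

<!ATTLIST mountain
photo ENTITIES #IMPLIED>

<!ENTITY mt_cook_1 SYSTEM "mt_cook1.jpg" NDATA JPG>
<!ENTITY mt_cook_2 SYSTEM "mt_cook2.jpg" NDATA JPG>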

Valid XML - The following XML document would be valid, as it conforms to the above DTD:

<mountains>

<mountain photo="mt_cook_1 mt_cook_2">

<name>Mount Cook</name>

</mountain>

<mountain>

<name>Cradle Mountain</name>

</mountain>

</mountains>

Invalid XML - The following XML document would be invalid. This is because in the first
element, a comma is being used to separate the two values of the photo attribute (a space
should be separating the two values):

<mountains>

<mountain photo="mt_cook_1,mt_cook_2">

<name>Mount Cook</name>

</mountain>

<mountain>
<name>Cradle Mountain</name>

</mountain>

</mountains>

Enumerated

The enumerated attribute type provides for a list of possible values. This enables the DTD user
to provide one value from the list of possible values.

The values must be surrounded by parentheses, and each value must be separated by a pipe
(|).

Syntax:

<!ATTLIST element_name

attribute_name (value1 | value2 | value3) default_value>

Example:

<!ATTLIST tutorial

published (yes | no) "no">

Valid XML - The following XML document would be valid, as it conforms to the above DTD:

<tutorials>

<tutorial published="yes">

<name>XML Tutorial</name>

</tutorial>

<tutorial published="no">

<name>HTML Tutorial</name>

</tutorial>

<tutorial>

<name>CSS Tutorial</name>
</tutorial>

</tutorials>

Invalid XML - The following XML document would be invalid because the value of the first
attribute does not match one of the options of the ATTLIST declaration:

<tutorials>

<tutorial published="true">

<name>XML Tutorial</name>

</tutorial>

<tutorial published="no">

<name>HTML Tutorial</name>

</tutorial>

<tutorial>

<name>CSS Tutorial</name>

</tutorial>

</tutorials>

ID

The attribute type of ID is used specifically to identify elements.

Because of this, no two elements can contain the same value for attributes of type ID. Also, you
can only give an element one attribute of type ID. The value that is assigned to an attribute of
type ID must be a valid XML name.

Syntax:

<!ATTLIST element_name

attribute_name ID default_value>

Example:
<!ATTLIST mountain

mountain_id ID #REQUIRED>

Valid XML - The following XML document would be valid, as it conforms to the above DTD:

<mountains>

<mountain mountain_id="m10001">

<name>Mount Cook</name>

</mountain>

<mountain mountain_id="m10002">

<name>Cradle Mountain</name>

</mountain>

</mountains>

Invalid XML - The following XML document would be invalid because the value of the
mountain_id attribute is the same for both elements:

<mountains>

<mountain mountain_id="m10001">

<name>Mount Cook</name>

</mountain>

<mountain mountain_id="m10001">

<name>Cradle Mountain</name>

</mountain>

</mountains>

IDREF

The attribute type of IDREF is used for referring to an ID value of another element in the
document.
Syntax:

<!ATTLIST element_name

attribute_name IDREF default_value>

Example:

<!ATTLIST employee

employee_id ID #REQUIRED

manager_id IDREF #IMPLIED>

Valid XML - The following XML document would be valid, as it conforms to the above DTD:

<employees>

<employee employee_id="e10001" manager_id="e10002">

<first_name>Homer</first_name>

<last_name>Flinstone</last_name>

</employee>

<employee employee_id="e10002">

<first_name>Fred</first_name>

<last_name>Burns</last_name>

</employee>

</employees>

Invalid XML - The following XML document would be invalid. This is because the manager_id
attribute of the second element contains a value that isn't the same as a value of another
element that contains an attribute with a type of ID:

<employees>

<employee employee_id="e10001" manager_id="e10002">

<first_name>Homer</first_name>
<last_name>Flinstone</last_name>

</employee>

<employee employee_id="e10002" manager_id="e10003">

<first_name>Fred</first_name>

<last_name>Burns</last_name>

</employee>

</employees>
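
The two constraints described for ID and IDREF (every ID value is unique, and every reference points at an existing ID) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a validating parser: the standard library's xml.etree.ElementTree does not read attribute types from a DTD, so the hypothetical helper id_errors is told which attribute plays the ID role and which plays the IDREF role. The sample documents are shortened versions of the employee examples.

```python
import xml.etree.ElementTree as ET


def id_errors(xml_text, id_attr, idref_attr):
    """Return a list of ID/IDREF problems found in the document."""
    root = ET.fromstring(xml_text)
    errors = []
    seen = set()
    # First pass: collect ID values and report duplicates.
    for element in root.iter():
        value = element.get(id_attr)
        if value is None:
            continue
        if value in seen:
            errors.append("duplicate ID " + value)
        seen.add(value)
    # Second pass: every reference must match a collected ID.
    # Splitting on whitespace also covers space-separated IDREFS values.
    for element in root.iter():
        for ref in (element.get(idref_attr) or "").split():
            if ref not in seen:
                errors.append("unresolved reference " + ref)
    return errors


VALID = ('<employees><employee employee_id="e10001" manager_id="e10002"/>'
         '<employee employee_id="e10002"/></employees>')
DANGLING = ('<employees><employee employee_id="e10001" manager_id="e10002"/>'
            '<employee employee_id="e10002" manager_id="e10003"/></employees>')

if __name__ == "__main__":
    print(id_errors(VALID, "employee_id", "manager_id"))
    print(id_errors(DANGLING, "employee_id", "manager_id"))
```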

IDREFS

The attribute type of IDREFS is used for referring to the ID values of more than one other
element in the document. Each value is separated by a space.

Syntax:

<!ATTLIST element_name

attribute_name IDREFS default_value>

Example:

<!ATTLIST individual

individual_id ID #REQUIRED

parent_id IDREFS #IMPLIED>

Valid XML - The following XML document would be valid, as it conforms to the above DTD:

<individuals>

<individual individual_id="e10001" parent_id="e10002 e10003">

<first_name>Bart</first_name>

<last_name>Simpson</last_name>

</individual>
<individual individual_id="e10002">

<first_name>Homer</first_name>

<last_name>Simpson</last_name>

</individual>

<individual individual_id="e10003">

<first_name>Marge</first_name>

<last_name>Simpson</last_name>

</individual>

</individuals>

Invalid XML - The following XML document would be invalid. This is because the parent_id
attribute of the first element refers to e10003, and no element in the document has an
attribute of type ID with that value:

<individuals>

<individual individual_id="e10001" parent_id="e10002 e10003">

<first_name>Bart</first_name>

<last_name>Simpson</last_name>

</individual>

<individual individual_id="e10002">

<first_name>Homer</first_name>

<last_name>Simpson</last_name>

</individual>

</individuals>

NMTOKEN
An NMTOKEN (name token) is any mixture of Name characters. It cannot contain whitespace
(although leading or trailing whitespace will be trimmed/ignored).

While Names have restrictions on the initial character (the first character of a Name cannot
include digits, diacritics, the full stop and the hyphen), the NMTOKEN doesn't have these
restrictions.

Syntax:

<!ATTLIST element_name

attribute_name NMTOKEN default_value>

Example:

<!ATTLIST mountain

country NMTOKEN #REQUIRED>

Valid XML - The following XML document would be valid, as it conforms to the above DTD:

<mountains>

<mountain country="NZ">

<name>Mount Cook</name>

</mountain>

<mountain country="AU">

<name>Cradle Mountain</name>

</mountain>

</mountains>

Invalid XML - The following XML document would be invalid because the value of the first
attribute contains internal whitespace:

<mountains>

<mountain country="New Zealand">

<name>Mount Cook</name>
</mountain>

<mountain country="Australia">

<name>Cradle Mountain</name>

</mountain>

</mountains>

NMTOKENS

The attribute type of NMTOKENS allows the attribute value to be made up of multiple
NMTOKENs, separated by whitespace.

Syntax:

<!ATTLIST element_name

attribute_name NMTOKENS default_value>

Example:

<!ATTLIST mountains

country NMTOKENS #REQUIRED>

Valid XML - The following XML document would be valid, as it conforms to the above DTD:

<mountains country="NZ AU">

<mountain>

<name>Mount Cook</name>

</mountain>

<mountain>

<name>Cradle Mountain</name>

</mountain>

</mountains>
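
The NMTOKEN and NMTOKENS rules can be approximated with a regular expression. This is a rough sketch in Python, assuming an ASCII-only subset of the XML name characters (the real set also includes many non-ASCII characters); the function names are hypothetical.

```python
import re

# Simplified, ASCII-only approximation of the XML "name character" set.
NMTOKEN_RE = re.compile(r"[A-Za-z0-9._:\-]+")

def is_nmtoken(value):
    # Leading and trailing whitespace is trimmed before the check.
    return NMTOKEN_RE.fullmatch(value.strip()) is not None

def is_nmtokens(value):
    # NMTOKENS: one or more NMTOKENs separated by whitespace.
    tokens = value.split()
    return len(tokens) > 0 and all(is_nmtoken(t) for t in tokens)

checks = {
    "NZ": is_nmtoken("NZ"),                    # valid
    "New Zealand": is_nmtoken("New Zealand"),  # invalid: internal whitespace
    "2nd-peak": is_nmtoken("2nd-peak"),        # valid: may start with a digit
    "NZ AU": is_nmtokens("NZ AU"),             # valid list of two tokens
}
```

Note that "New Zealand" fails as an NMTOKEN but would pass as NMTOKENS, where it is read as the two tokens "New" and "Zealand".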
NOTATION

The attribute type of NOTATION allows you to use a value that has been declared as a notation
in the DTD.

A notation is used to specify the format of non-XML data. A common use of notations is to
describe MIME types such as image/gif, image/jpeg etc.

Syntax:

To declare a notation:

<!NOTATION name SYSTEM "external_id">

To declare the attribute:

<!ATTLIST element_name

attribute_name NOTATION default_value>

Example:

<!NOTATION GIF SYSTEM "image/gif">

<!NOTATION JPG SYSTEM "image/jpeg">

<!NOTATION PNG SYSTEM "image/png">

<!ATTLIST mountain

photo ENTITY #IMPLIED

photo_type NOTATION (GIF | JPG | PNG) #IMPLIED>

<!ENTITY mt_cook_1 SYSTEM "mt_cook1.jpg" NDATA JPG>

In the DTD, we have specified that the value of the photo_type attribute can be one of the
three values supplied. The following XML document would be valid, as it conforms to the above
DTD:

<mountains>

<mountain photo="mt_cook_1" photo_type="JPG">


<name>Mount Cook</name>

</mountain>

<mountain>

<name>Cradle Mountain</name>

</mountain>

</mountains>
XML Schema
XML Schema is an XML-based language used to create XML-based languages and data models.
An XML schema defines element and attribute names for a class of XML documents. The
schema also specifies the structure that those documents must adhere to and the type of
content that each element can hold.

XML documents that attempt to adhere to an XML schema are said to be instances of that
schema. If they correctly adhere to the schema, then they are valid instances. This is not the
same as being well formed. A well-formed XML document follows all the syntax rules of XML,
but it does not necessarily adhere to any particular schema. So, an XML document can be well
formed without being valid, but it cannot be valid unless it is well formed.
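
The difference can be seen with a non-validating parser. The sketch below uses Python's standard xml.etree.ElementTree, which checks well-formedness only and knows nothing about schemas; the sample documents and the helper name is_well_formed are made up for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """True if the text parses as XML; says nothing about schema validity."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

good = "<contact><name>Ann</name></contact>"
bad = "<contact><name>Ann</contact>"   # <name> is never closed

results = (is_well_formed(good), is_well_formed(bad))
print(results)  # (True, False)
```

Checking validity requires a schema-aware processor in addition to this step.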

As a means of understanding the power of XML Schema, let's look at the limitations of DTD.

- DTDs do not have built-in datatypes.

- DTDs do not support user-derived datatypes.

- DTDs allow only limited control over cardinality (the number of occurrences of an element
within its parent).

- DTDs do not support Namespaces or any simple way of reusing or importing other schemas.

Syntax

A schema is itself an XML document. Its root element is xs:schema, bound to the XML Schema
namespace.

Example

The following example shows a schema that declares a contact element −

<?xml version = "1.0" encoding = "UTF-8"?>

<xs:schema xmlns:xs = "http://www.w3.org/2001/XMLSchema">

<xs:element name = "contact">

<xs:complexType>

<xs:sequence>
<xs:element name = "name" type = "xs:string" />

<xs:element name = "company" type = "xs:string" />

<xs:element name = "phone" type = "xs:int" />

</xs:sequence>

</xs:complexType>

</xs:element>

</xs:schema>
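
Because a schema is ordinary XML, it can be read like any other document. The sketch below, in Python with the standard library, pulls the child element names out of the xs:sequence above and compares them with a made-up instance document. It is only a rough order check under those assumptions, not real schema validation: types and occurrence counts are ignored.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="contact">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="company" type="xs:string"/>
        <xs:element name="phone" type="xs:int"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

# Hypothetical instance document for the contact schema.
INSTANCE = ("<contact><name>Ann</name><company>Acme</company>"
            "<phone>5551234</phone></contact>")

schema_root = ET.fromstring(SCHEMA)
sequence = schema_root.find(
    f"{{{XS}}}element/{{{XS}}}complexType/{{{XS}}}sequence")

expected = [el.get("name") for el in sequence]          # names declared in order
actual = [child.tag for child in ET.fromstring(INSTANCE)]
conforms = expected == actual
```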

What is XML Schema Used For?

A Schema can be used:

• to provide a list of elements and attributes in a vocabulary;
• to associate types, such as integer, string, etc., or more specifically such as hatsize,
sock_colour, etc., with values found in documents;
• to constrain where elements and attributes can appear, and what can appear inside
those elements, such as saying that a chapter title occurs inside a chapter, and that a
chapter must consist of a chapter title followed by one or more paragraphs of text;
• to provide documentation that is both human-readable and machine-processable;
• to give a formal description of one or more documents.

The following is a high-level overview of Schema types.

• Elements can be of simple type or complex type.
• Simple-type elements can only contain text. They cannot have child elements or
attributes.
• All the built-in types are simple types (e.g., xs:string).
• Schema authors can derive simple types by restricting another simple type. For
example, an email type could be derived by limiting a string to a specific pattern.
• Simple types can be atomic (e.g., strings and integers) or non-atomic (e.g., lists).
• Complex-type elements can contain child elements and attributes as well as text.
• By default, complex-type elements have complex content, meaning that they have child
elements.
• Complex-type elements can be limited to having simple content, meaning they only
contain text. They are different from simple-type elements in that they have attributes.
• Complex types can be limited to having no content, meaning they are empty, but they
may have attributes.

Elements

As we saw earlier in this chapter, elements are the building blocks of an XML document.
An element can be defined within an XSD as follows −

<xs:element name = "x" type = "y"/>

Definition Types

You can define XML schema elements in the following ways −

Simple Type

A simple-type element can contain only text. Some of the predefined simple
types are: xs:integer, xs:boolean, xs:string, xs:date. For example −

<xs:element name = "phone_number" type = "xs:int" />

Built-in Simple Types

XML Schema specifies 44 built-in types, 19 of which are primitive.

19 Primitive Data Types

The 19 built-in primitive types are listed below.

string boolean decimal float double duration dateTime time
date gYearMonth gYear gMonthDay gDay gMonth hexBinary
base64Binary anyURI QName NOTATION

Built-in Derived Data Types

The other 25 built-in data types are derived from one of the primitive types listed above.

normalizedString token language NMTOKEN NMTOKENS Name NCName
ID IDREF IDREFS ENTITY ENTITIES integer
nonPositiveInteger negativeInteger long int short byte
nonNegativeInteger unsignedLong unsignedInt unsignedShort
unsignedByte positiveInteger

Code Sample:

<?xml version="1.0" ?>

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

<xs:element name="Author">

<xs:complexType>

<xs:sequence>

<xs:element name="FirstName" type="xs:string"/>

<xs:element name="LastName" type="xs:string"/>

</xs:sequence>

</xs:complexType>

</xs:element>

</xs:schema>
Notice the FirstName and LastName elements in the code sample above. They are not explicitly
defined as simple-type elements. Instead, the type is named with the type attribute. Because
the named type (xs:string in both cases) is a simple type, the elements themselves are
simple-type elements.

Complex Type

A complex type is a container for other element definitions. This allows you to specify which
child elements an element can contain and to provide some structure within your XML
documents. For example −

<xs:element name = "Address">

<xs:complexType>

<xs:sequence>

<xs:element name = "name" type = "xs:string" />

<xs:element name = "company" type = "xs:string" />

<xs:element name = "phone" type = "xs:int" />

</xs:sequence>

</xs:complexType>

</xs:element>

In the above example, the Address element consists of child elements. It is a container for other
<xs:element> definitions, which makes it possible to build a simple hierarchy of elements in the
XML document.

XML Attributes:
An attribute provides extra information within an element. Attributes have name and type
properties and are defined within an XSD as follows:

<xs:attribute name="x" type="y" />


Empty Elements
An empty element is an element that contains no content, but it may have attributes. The
HomePage element in the instance document below is an empty element. Below the instance is
the snippet from the Author.xsd schema that declares the HomePage element.
<?xml version="1.0"?>
<Author xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="Author.xsd">
<Name>
<FirstName>Mark</FirstName>
<LastName>Twain</LastName>
</Name>
<HomePage URL="http://www.marktwain.com"/>
</Author>

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
---- C O D E O M I T T E D ----

<xs:element name="HomePage">
<xs:complexType>
<xs:attribute name="URL" type="xs:anyURI"/>
</xs:complexType>
</xs:element>
---- C O D E O M I T T E D ----

</xs:schema>

Default and Fixed Values


Default Values
Attributes can have default values. To specify a default value, use the default attribute of the
xs:attribute element. Default values for attributes work slightly differently than they do for
elements. If the attribute is not included in the instance document, the schema processor
inserts it with the default value.
<xs:attribute name="lang" type="xs:string" default="EN"/>
Fixed Values
Attribute values can be fixed, meaning that, if they appear in the instance document, they must
contain a specified value. As with simple-type elements, this is done with the fixed attribute.
<xs:attribute name="lang" type="xs:string" fixed="EN"/>

Attributes are optional by default. To specify that the attribute is required, use the "use"
attribute:
<xs:attribute name="lang" type="xs:string" use="required"/>
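
The three settings above can be mimicked in a few lines. The following is a sketch in Python of the behaviour just described, not a schema processor; the function check_lang and the note element are hypothetical.

```python
import xml.etree.ElementTree as ET

def check_lang(element, default=None, fixed=None, required=False):
    """Return the effective value of the lang attribute, or raise ValueError."""
    value = element.get("lang")
    if value is None:
        if required:
            raise ValueError("lang is required but missing")
        # An absent attribute takes the fixed or default value, if one is declared.
        return fixed if fixed is not None else default
    if fixed is not None and value != fixed:
        raise ValueError(f"lang must be {fixed!r}, got {value!r}")
    return value

doc_without = ET.fromstring("<note/>")
doc_with = ET.fromstring('<note lang="FR"/>')

effective_default = check_lang(doc_without, default="EN")  # attribute absent
effective_given = check_lang(doc_with, default="EN")       # attribute present

try:
    check_lang(doc_with, fixed="EN")   # present, but not the fixed value
    fixed_ok = True
except ValueError:
    fixed_ok = False
```

Here effective_default is "EN", effective_given is "FR", and fixed_ok is False because "FR" differs from the fixed value.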
Indicators
Content models indicate the structure and order in which child elements can appear within
their parent element. Content models are made up of model groups. The three types of model
group are listed below.

Order indicators:

• All
• Choice
• Sequence

All Indicator

The <all> indicator specifies that the child elements can appear in any order, and that each child
element can occur at most once (exactly once by default):

<xs:element name="person">
<xs:complexType>
<xs:all>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:all>
</xs:complexType>
</xs:element>

Choice Indicator

The <choice> indicator specifies that either one child element or another can occur:

<xs:element name="person">
<xs:complexType>
<xs:choice>
<xs:element name="employee" type="employee"/>
<xs:element name="member" type="member"/>
</xs:choice>
</xs:complexType>
</xs:element>

Sequence Indicator

The <sequence> indicator specifies that the child elements must appear in a specific order:

<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>

Occurrence Indicators

Occurrence indicators are used to define how often an element can occur.

maxOccurs Indicator

The maxOccurs indicator (an attribute of xs:element) specifies the maximum number of times an
element can occur. If it is omitted, the maximum is 1:

<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="full_name" type="xs:string"/>
<xs:element name="child_name" type="xs:string" maxOccurs="10"/>
</xs:sequence>
</xs:complexType>
</xs:element>

minOccurs Indicator

The minOccurs indicator (an attribute of xs:element) specifies the minimum number of times an
element must occur. If it is omitted, the minimum is 1:

<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="full_name" type="xs:string"/>
<xs:element name="child_name" type="xs:string"
maxOccurs="10" minOccurs="0"/>
</xs:sequence>
</xs:complexType>
</xs:element>

The example above indicates that the "child_name" element can occur a minimum of zero
times and a maximum of ten times in the "person" element.
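
The occurrence rule amounts to counting children and comparing the count with the two bounds. The Python sketch below assumes a made-up person document; the helper occurs_ok is hypothetical and does not handle maxOccurs="unbounded".

```python
import xml.etree.ElementTree as ET

def occurs_ok(parent, child_name, min_occurs=1, max_occurs=1):
    """True if the number of matching children lies within the bounds."""
    count = len(parent.findall(child_name))
    return min_occurs <= count <= max_occurs

# Hypothetical instance of the person element declared above.
person = ET.fromstring(
    "<person><full_name>Homer Simpson</full_name>"
    "<child_name>Bart</child_name><child_name>Lisa</child_name></person>"
)

# full_name declares neither attribute, so both bounds default to 1.
full_name_ok = occurs_ok(person, "full_name")
# child_name is declared with minOccurs="0" and maxOccurs="10".
children_ok = occurs_ok(person, "child_name", min_occurs=0, max_occurs=10)
```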

Restrictions on content (facets)

Restrictions are used to define acceptable values for XML elements or attributes. Restrictions
on XML elements are called facets.

The following example defines an element called "age" with a restriction. The value of age
cannot be lower than 0 or greater than 120:

<xs:element name="age">
<xs:simpleType>
<xs:restriction base="xs:integer">
<xs:minInclusive value="0"/>
<xs:maxInclusive value="120"/>
</xs:restriction>
</xs:simpleType>
</xs:element>

Restrictions on a Set of Values

To limit the content of an XML element to a set of acceptable values, we would use the
enumeration constraint.

The example below defines an element called "car" with a restriction. The only acceptable
values are: Audi, Golf, BMW:

<xs:element name="car">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:enumeration value="Audi"/>
<xs:enumeration value="Golf"/>
<xs:enumeration value="BMW"/>
</xs:restriction>
</xs:simpleType>
</xs:element>

Restrictions on a Series of Values

To limit the content of an XML element to a particular series of letters or digits, we would
use the pattern constraint.

The example below defines an element called "letter" with a restriction. The only acceptable
value is ONE of the LOWERCASE letters from a to z:

<xs:element name="letter">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="[a-z]"/>
</xs:restriction>
</xs:simpleType>
</xs:element>

The next example defines an element called "initials" with a restriction. The only acceptable
value is THREE of the UPPERCASE letters from A to Z:

<xs:element name="initials">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="[A-Z][A-Z][A-Z]"/>
</xs:restriction>
</xs:simpleType>
</xs:element>

The next example also defines an element called "initials" with a restriction. The only
acceptable value is THREE of the LOWERCASE OR UPPERCASE letters from a to z:

<xs:element name="initials">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="[a-zA-Z][a-zA-Z][a-zA-Z]"/>
</xs:restriction>
</xs:simpleType>
</xs:element>

The next example defines an element called "choice" with a restriction. The only acceptable
value is ONE of the following letters: x, y, OR z:
<xs:element name="choice">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="[xyz]"/>
</xs:restriction>
</xs:simpleType>
</xs:element>

The next example defines an element called "prodid" with a restriction. The only acceptable
value is FIVE digits in a sequence, and each digit must be in a range from 0 to 9:

<xs:element name="prodid">
<xs:simpleType>
<xs:restriction base="xs:integer">
<xs:pattern value="[0-9][0-9][0-9][0-9][0-9]"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
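
A pattern facet must match the whole value, not just part of it. The sketch below imitates that with Python's re.fullmatch. Python's regular-expression syntax is not identical to the XML Schema one, but the simple character classes used in the examples above behave the same way in both; the helper name matches_pattern is hypothetical.

```python
import re

def matches_pattern(pattern, value):
    # XSD patterns apply to the entire value, hence fullmatch rather than search.
    return re.fullmatch(pattern, value) is not None

results = {
    "letter:q": matches_pattern("[a-z]", "q"),
    "letter:qq": matches_pattern("[a-z]", "qq"),          # two letters: rejected
    "initials:ABC": matches_pattern("[A-Z][A-Z][A-Z]", "ABC"),
    "choice:x": matches_pattern("[xyz]", "x"),
    "prodid:12345": matches_pattern("[0-9][0-9][0-9][0-9][0-9]", "12345"),
    "prodid:1234": matches_pattern("[0-9][0-9][0-9][0-9][0-9]", "1234"),
}
```

Only "letter:qq" and "prodid:1234" come out False, since each has the wrong number of characters for its pattern.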

Restrictions on Whitespace Characters

To specify how whitespace characters should be handled, we would use the whiteSpace
constraint.

This example defines an element called "address" with a restriction. The whiteSpace constraint
is set to "preserve", which means that the XML processor WILL NOT remove any white space
characters:

<xs:element name="address">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:whiteSpace value="preserve"/>
</xs:restriction>
</xs:simpleType>
</xs:element>

This example also defines an element called "address" with a restriction. The whiteSpace
constraint is set to "replace", which means that the XML processor WILL REPLACE all white
space characters (line feeds, tabs, spaces, and carriage returns) with spaces:

<xs:element name="address">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:whiteSpace value="replace"/>
</xs:restriction>
</xs:simpleType>
</xs:element>

This example also defines an element called "address" with a restriction. The whiteSpace
constraint is set to "collapse", which means that the XML processor WILL COLLAPSE white
space: line feeds, tabs, and carriage returns are replaced with spaces, leading
and trailing spaces are removed, and multiple spaces are reduced to a single space:

<xs:element name="address">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:whiteSpace value="collapse"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
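
The three whiteSpace settings can be compared side by side. This is a small Python sketch of the rules described above, with a made-up address string; the function name apply_white_space is hypothetical.

```python
import re

def apply_white_space(value, mode):
    if mode == "preserve":
        return value
    # "replace": each tab, line feed and carriage return becomes one space.
    replaced = re.sub(r"[\t\n\r]", " ", value)
    if mode == "replace":
        return replaced
    if mode == "collapse":
        # "collapse": after replacing, squeeze runs of spaces and trim both ends.
        return re.sub(r" +", " ", replaced).strip(" ")
    raise ValueError(f"unknown whiteSpace mode: {mode}")

raw = "  12 Main St\n\tSpringfield  "
preserved = apply_white_space(raw, "preserve")
replaced = apply_white_space(raw, "replace")
collapsed = apply_white_space(raw, "collapse")
print(repr(collapsed))  # '12 Main St Springfield'
```

With "replace" the string keeps its length (the line feed and tab each turn into one space); only "collapse" shortens it.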

Restrictions on Length

To limit the length of a value in an element, we would use the length, maxLength, and
minLength constraints.

This example defines an element called "password" with a restriction. The value must be
exactly eight characters:

<xs:element name="password">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:length value="8"/>
</xs:restriction>
</xs:simpleType>
</xs:element>

This example defines another element called "password" with a restriction. The value must be
at least five characters and at most eight characters long:

<xs:element name="password">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:minLength value="5"/>
<xs:maxLength value="8"/>
</xs:restriction>
</xs:simpleType>
</xs:element>

Restrictions for Datatypes

Constraint       Description
enumeration      Defines a list of acceptable values
fractionDigits   Specifies the maximum number of decimal places allowed. Must be equal to or greater than zero
length           Specifies the exact number of characters or list items allowed. Must be equal to or greater than zero
maxExclusive     Specifies the upper bounds for numeric values (the value must be less than this value)
maxInclusive     Specifies the upper bounds for numeric values (the value must be less than or equal to this value)
maxLength        Specifies the maximum number of characters or list items allowed. Must be equal to or greater than zero
minExclusive     Specifies the lower bounds for numeric values (the value must be greater than this value)
minInclusive     Specifies the lower bounds for numeric values (the value must be greater than or equal to this value)
minLength        Specifies the minimum number of characters or list items allowed. Must be equal to or greater than zero
pattern          Defines the exact sequence of characters that are acceptable
totalDigits      Specifies the maximum number of digits allowed. Must be greater than zero
whiteSpace       Specifies how white space (line feeds, tabs, spaces, and carriage returns) is handled
XSL
XSL stands for Extensible Stylesheet Language. XSL is to XML what CSS is to HTML.

XSL is a language for expressing style sheets. An XSL style sheet is, like with CSS, a file that
describes how to display an XML document of a given type. XSL shares the functionality and is
compatible with CSS2 (although it uses a different syntax). It also adds:

• A transformation language for XML documents: XSLT. Originally intended to perform
complex styling operations, like the generation of tables of contents and indexes, it is
now used as a general purpose XML processing language. XSLT is thus widely used for
purposes other than XSL, like generating HTML web pages from XML data.
• Advanced styling features, expressed by an XML document type which defines a set of
elements called Formatting Objects, and attributes (in part borrowed from CSS2
properties and adding more complex ones).

Need for XSL

In an HTML document, tags such as table, div, and span are predefined, and the browser
knows how to style and display them using CSS. In an XML document, tags are not
predefined. To make it possible to interpret and style XML documents, the World
Wide Web Consortium (W3C) developed XSL, an XML-based stylesheet language.
An XSL document specifies how a browser should render an XML document.

Following are the main parts of XSL −

• XSLT − used to transform an XML document into various other types of document.
• XPath − used to navigate an XML document.
• XSL-FO − used to format an XML document.

What is XSLT?

XSLT, Extensible Stylesheet Language Transformations, provides the ability to transform XML
data from one format to another automatically.
How XSLT Works?

An XSLT stylesheet defines the transformation rules to be applied to the source XML
document. The stylesheet is itself written in XML. An XSLT processor takes the stylesheet,
applies its transformation rules to the source XML document, and generates a result
document in XML, HTML, or text format. This result document can then be passed to a
formatter (or a browser) to produce the actual output that is displayed to the end user.

Advantages

Here are the advantages of using XSLT −

• Independent of programming. Transformations are written in a separate xsl file, which is
itself an XML document.
• Output can be altered by simply modifying the transformations in the xsl file. No need to
change any code. So Web designers can edit the stylesheet and see the change in
the output quickly.

Let’s suppose we have the following sample XML file, students.xml, which is required to be
transformed into a well-formatted HTML document.

students.xml
<?xml version = "1.0"?>
<class>
<student rollno = "393">
<firstname>Dinkar</firstname>
<lastname>Kad</lastname>
<nickname>Dinkar</nickname>
<marks>85</marks>
</student>
<student rollno = "493">
<firstname>Vaneet</firstname>
<lastname>Gupta</lastname>
<nickname>Vinni</nickname>
<marks>95</marks>
</student>
<student rollno = "593">
<firstname>Jasvir</firstname>
<lastname>Singh</lastname>
<nickname>Jazz</nickname>
<marks>90</marks>
</student>
</class>
We need to define an XSLT style sheet document for the above XML document to meet the following
criteria −

• Page should have a title Students.
• Page should have a table of student details.
• Columns should have following headers: Roll No, First Name, Last Name, Nick Name, Marks
• Table must contain details of the students accordingly.

Step 1: Create XSLT document


Create an XSLT document to meet the above requirements, name it as students.xsl and save it in the
same location where students.xml lies.

students.xsl

<?xml version = "1.0" encoding = "UTF-8"?>


<!-- xsl stylesheet declaration with xsl namespace:
The namespace tells the XSLT processor which elements are instructions to be processed and which are
literal output -->
<xsl:stylesheet version = "1.0"
xmlns:xsl = "http://www.w3.org/1999/XSL/Transform">
<!-- xsl template declaration:
The template tells the XSLT processor which part of the XML
document it applies to. It takes an XPath pattern.
In our case, it matches the document root and tells
the processor to process the entire document with this template. -->
<xsl:template match = "/">
<!-- HTML tags
Used for formatting purposes. The processor copies them to the output
unchanged and the browser simply renders them. -->

<html>
<body>
<h2>Students</h2>

<table border = "1">


<tr bgcolor = "#9acd32">
<th>Roll No</th>
<th>First Name</th>
<th>Last Name</th>
<th>Nick Name</th>
<th>Marks</th>
</tr>

<!-- for-each instruction
Iterates over each element matching the XPath expression -->
<xsl:for-each select="class/student">
<tr>
<td>
<!-- value-of instruction
Outputs the value of the node matching the XPath expression -->
<xsl:value-of select = "@rollno"/>
</td>

<td><xsl:value-of select = "firstname"/></td>


<td><xsl:value-of select = "lastname"/></td>
<td><xsl:value-of select = "nickname"/></td>
<td><xsl:value-of select = "marks"/></td>

</tr>
</xsl:for-each>

</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
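
To make clear what the stylesheet does, here is a hand-written Python equivalent of its for-each and value-of steps. This is not XSLT and does not run the stylesheet (the Python standard library has no XSLT processor); it is only a sketch of the same row-per-student logic, with the hypothetical function name render_table.

```python
import xml.etree.ElementTree as ET
from html import escape

STUDENTS_XML = """<class>
  <student rollno="393">
    <firstname>Dinkar</firstname><lastname>Kad</lastname>
    <nickname>Dinkar</nickname><marks>85</marks>
  </student>
  <student rollno="493">
    <firstname>Vaneet</firstname><lastname>Gupta</lastname>
    <nickname>Vinni</nickname><marks>95</marks>
  </student>
  <student rollno="593">
    <firstname>Jasvir</firstname><lastname>Singh</lastname>
    <nickname>Jazz</nickname><marks>90</marks>
  </student>
</class>"""

HEADERS = ("Roll No", "First Name", "Last Name", "Nick Name", "Marks")
FIELDS = ("firstname", "lastname", "nickname", "marks")

def render_table(xml_text):
    root = ET.fromstring(xml_text)            # root is the <class> element
    header = "<tr>" + "".join(f"<th>{h}</th>" for h in HEADERS) + "</tr>"
    rows = []
    # Counterpart of <xsl:for-each select="class/student">
    for student in root.findall("student"):
        # Counterpart of the five <xsl:value-of> instructions
        cells = [student.get("rollno")] + [student.findtext(f) for f in FIELDS]
        rows.append(
            "<tr>" + "".join(f"<td>{escape(c)}</td>" for c in cells) + "</tr>")
    return '<table border="1">' + header + "".join(rows) + "</table>"

table_html = render_table(STUDENTS_XML)
```

The result contains one header row and one row per student, in document order, just as the stylesheet produces.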

Step 2: Link the XSLT Document to the XML Document


Update the students.xml document with the following xml-stylesheet tag. Set the href value to students.xsl

<?xml version = "1.0"?>


<?xml-stylesheet type = "text/xsl" href = "students.xsl"?>
<class>
...
</class>

Step 3: View the XML Document in Internet Explorer


students.xml

<?xml version = "1.0"?>


<?xml-stylesheet type = "text/xsl" href = "students.xsl"?>
<class>
<student rollno = "393">
<firstname>Dinkar</firstname>
<lastname>Kad</lastname>
<nickname>Dinkar</nickname>
<marks>85</marks>
</student>
<student rollno = "493">
<firstname>Vaneet</firstname>
<lastname>Gupta</lastname>
<nickname>Vinni</nickname>
<marks>95</marks>
</student>
<student rollno = "593">
<firstname>Jasvir</firstname>
<lastname>Singh</lastname>
<nickname>Jazz</nickname>
<marks>90</marks>
</student>
</class>

a. <xsl:template> element

<xsl:template> defines a way to reuse templates in order to generate the desired output for
nodes of a particular type/context.

Declaration

Following is the syntax declaration of <xsl:template> element.


<xsl:template
name = QName
match = Pattern
priority = number
mode = QName >
</xsl:template>

Attributes

Sr.No Name & Description

1 name
Name of the template, by which it can be invoked with xsl:call-template.

2 match
Pattern which identifies the node(s) to which the template is to be applied.

3 priority
Priority number of the template. When several templates match the same node, the one with the
higher priority is chosen.

4 mode
Allows an element to be processed multiple times to produce a different result each time.

Elements

Number of occurrences   Unlimited
Parent elements         xsl:stylesheet, xsl:transform
Child elements          xsl:apply-imports, xsl:apply-templates, xsl:attribute, xsl:call-template, xsl:choose,
                        xsl:comment, xsl:copy, xsl:copy-of, xsl:element, xsl:fallback, xsl:for-each, xsl:if,
                        xsl:message, xsl:number, xsl:param, xsl:processing-instruction, xsl:text, xsl:value-of,
                        xsl:variable, output elements

b. <xsl:value-of> tag puts the value of the selected node as per XPath expression, as text.

Declaration
Following is the syntax declaration of <xsl:value-of> element.

<xsl:value-of
select = Expression
disable-output-escaping = "yes" | "no" >
</xsl:value-of>

Attributes
Sr.No Name & Description
1 select: XPath expression to be evaluated in the current context.

2 disable-output-escaping: Default is "no". If "yes", special XML characters in the output text
are not escaped.
c. <xsl:for-each> tag applies a template repeatedly for each node.

Declaration
Following is the syntax declaration of <xsl:for-each> element

<xsl:for-each
select = Expression >
</xsl:for-each>

Attributes

Sr.No Name & Description

1 select: XPath expression to be evaluated in the current context to determine the set of
nodes to be iterated.

This example creates a table with one row per <student> element, showing its rollno attribute and
its child elements <firstname>, <lastname>, <nickname>, and <marks>, by iterating over each student.

students.xml

<?xml version = "1.0"?>


<?xml-stylesheet type = "text/xsl" href = "students.xsl"?>
<class>
<student rollno = "393">
<firstname>Dinkar</firstname>
<lastname>Kad</lastname>
<nickname>Dinkar</nickname>
<marks>85</marks>
</student>
<student rollno = "493">
<firstname>Vaneet</firstname>
<lastname>Gupta</lastname>
<nickname>Vinni</nickname>
<marks>95</marks>
</student>
<student rollno = "593">
<firstname>Jasvir</firstname>
<lastname>Singh</lastname>
<nickname>Jazz</nickname>
<marks>90</marks>
</student>
</class>
students.xsl

<?xml version = "1.0" encoding = "UTF-8"?>


<xsl:stylesheet version = "1.0"
xmlns:xsl = "http://www.w3.org/1999/XSL/Transform">
<xsl:template match = "/">
<html>
<body>
<h2>Students</h2>
<table border = "1">
<tr bgcolor = "#9acd32">
<th>Roll No</th>
<th>First Name</th>
<th>Last Name</th>
<th>Nick Name</th>
<th>Marks</th>
</tr>

<xsl:for-each select = "class/student">

<tr>
<td><xsl:value-of select = "@rollno"/></td>
<td><xsl:value-of select = "firstname"/></td>
<td><xsl:value-of select = "lastname"/></td>
<td><xsl:value-of select = "nickname"/></td>
<td><xsl:value-of select = "marks"/></td>
</tr>
</xsl:for-each>

</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>

d. <xsl:sort> tag specifies a sort criterion on the nodes.

Declaration
Following is the syntax declaration of <xsl:sort> element.

<xsl:sort
select = string-expression
lang = { nmtoken }
data-type = { "text" | "number" | QName }
order = { "ascending" | "descending" }
case-order = { "upper-first" | "lower-first" } >
</xsl:sort>

Attributes

S.No Name & Description


1 select: Sorting key of the node.

2 lang: Language alphabet used to determine sort order.

3 data-type: Data type of the text.

4 order: Sorting order. Default is "ascending".

5 case-order: Sorting order of strings by capitalization. The default is language-dependent.
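
The data-type and order attributes matter most when the sort key looks like a number. The Python sketch below is only an analogy, using made-up marks values: plain string comparison stands in for data-type="text" (a real XSLT processor sorts text in a language-dependent way), and key=float stands in for data-type="number".

```python
# Hypothetical marks values, as strings, the way they appear in XML text.
marks = ["85", "100", "9"]

# data-type="text": strings are compared character by character.
as_text = sorted(marks)
# data-type="number": values are converted to numbers before comparing.
as_number = sorted(marks, key=float)
# data-type="number" together with order="descending".
descending = sorted(marks, key=float, reverse=True)

print(as_text)     # ['100', '85', '9']
print(as_number)   # ['9', '85', '100']
print(descending)  # ['100', '85', '9']
```

Sorted as text, "100" comes before "85" because "1" precedes "8", which is rarely what is wanted for marks.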

e. <xsl:if> tag specifies a conditional test against the content of nodes.

Declaration
Following is the syntax declaration of <xsl:if> element.

<xsl:if
test = boolean-expression >
</xsl:if>
Attributes
Sr.No Name & Description
1 test: The condition in the xml data to test.

f. <xsl:choose> tag specifies multiple conditional tests against the content of nodes, in
conjunction with the <xsl:when> and <xsl:otherwise> elements.

Declaration
Following is the syntax declaration of <xsl:choose> element.

<xsl:choose >
</xsl:choose>
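
The choose/when/otherwise combination behaves like an if/elif/else chain: the first test that is true wins, and otherwise covers everything else. The Python sketch below shows that logic only. The thresholds and the labels High, Medium, and Low are invented for illustration, and the comments show the XSLT element each branch would correspond to.

```python
def grade(marks):
    # The first branch whose test is true is used, like the first matching <xsl:when>.
    if marks > 90:        # <xsl:when test="marks &gt; 90">
        return "High"
    elif marks > 85:      # <xsl:when test="marks &gt; 85">
        return "Medium"
    else:                 # <xsl:otherwise>
        return "Low"

students = [("Dinkar", 85), ("Vaneet", 95), ("Jasvir", 90)]
grades = {name: grade(marks) for name, marks in students}
print(grades)  # {'Dinkar': 'Low', 'Vaneet': 'High', 'Jasvir': 'Medium'}
```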
XQuery
XQuery is a query language for retrieving data stored in the form of XML. XQuery is to XML
what SQL is to a database.

What is XQuery
XQuery is a functional language that is used to retrieve information stored in XML format.
XQuery can be used on XML documents, relational databases containing data in XML formats,
or XML Databases. XQuery 3.0 is a W3C recommendation from April 8, 2014.
“XQuery is a standardized language for combining documents, databases, Web pages and almost anything
else. It is very widely implemented. It is powerful and easy to learn. XQuery is replacing proprietary
middleware languages and Web Application development languages. XQuery is replacing complex Java or C++
programs with a few lines of code. XQuery is simpler to work with and easier to maintain than many other
alternatives.”

Characteristics
• Functional Language − XQuery is a functional language for retrieving and querying XML-based data.
• Analogous to SQL − XQuery is to XML what SQL is to databases.
• XPath based − XQuery uses XPath expressions to navigate through XML documents.
• Widely supported − XQuery is supported by many major database products.
• W3C Standard − XQuery is a W3C standard.

Benefits of XQuery
• Using XQuery, both hierarchical and tabular data can be retrieved.
• XQuery can be used to query tree and graphical structures.
• XQuery can be directly used to query webpages.
• XQuery can be directly used to build webpages.
• XQuery can be used to transform xml documents.
• XQuery is ideal for XML-based databases and object-based databases. Object databases are much
more flexible and powerful than purely tabular databases.

books.xml

<?xml version="1.0" encoding="UTF-8"?>


<books>

<book category="JAVA">

<title lang="en">Learn Java in 24 Hours</title>

<author>Robert</author>

<year>2005</year>

<price>30.00</price>

</book>

<book category="DOTNET">

<title lang="en">Learn .Net in 24 hours</title>

<author>Peter</author>

<year>2011</year>

<price>40.50</price>

</book>

<book category="XML">

<title lang="en">Learn XQuery in 24 hours</title>

<author>Robert</author>

<author>Peter</author>

<year>2013</year>

<price>50.00</price>

</book>

<book category="XML">

<title lang="en">Learn XPath in 24 hours</title>

<author>Jay Ban</author>

<year>2010</year>
<price>16.50</price>

</book>

</books>

books.xqy

for $x in doc("books.xml")/books/book

where $x/price>30

return $x/title

Output

You'll get the following result −

<title lang="en">Learn .Net in 24 hours</title>

<title lang="en">Learn XQuery in 24 hours</title>
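
For readers more at home in a general-purpose language, the same for/where/return logic can be written as a Python list comprehension. This is an analogy, not XQuery: it uses an abridged copy of books.xml (only title and price are kept) and returns the title text rather than the title elements.

```python
import xml.etree.ElementTree as ET

# Abridged copy of books.xml: only the title and price of each book are kept.
BOOKS_XML = """<books>
  <book category="JAVA">
    <title lang="en">Learn Java in 24 Hours</title><price>30.00</price>
  </book>
  <book category="DOTNET">
    <title lang="en">Learn .Net in 24 hours</title><price>40.50</price>
  </book>
  <book category="XML">
    <title lang="en">Learn XQuery in 24 hours</title><price>50.00</price>
  </book>
  <book category="XML">
    <title lang="en">Learn XPath in 24 hours</title><price>16.50</price>
  </book>
</books>"""

root = ET.fromstring(BOOKS_XML)                  # root is the <books> element
titles = [
    book.findtext("title")                       # return $x/title (text only)
    for book in root.findall("book")             # for $x in .../books/book
    if float(book.findtext("price")) > 30        # where $x/price > 30
]
print(titles)  # ['Learn .Net in 24 hours', 'Learn XQuery in 24 hours']
```

The Java book is left out because its price is exactly 30.00, which is not greater than 30.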

XPath
XPath is a query language that is used for traversing through an XML document. It is used
commonly to search particular elements or attributes with matching patterns.

What is XPath?
XPath is an official recommendation of the World Wide Web Consortium (W3C). It defines a
language to find information in an XML file. It is used to traverse elements and attributes of an
XML document. XPath provides various types of expressions which can be used to enquire
relevant information from the XML document.
• Structure Definitions − XPath defines the parts of an XML document like element, attribute, text,
namespace, processing-instruction, comment, and document nodes.
• Path Expressions − XPath provides powerful path expressions to select nodes or lists of nodes in XML
documents.
• Standard Functions − XPath provides a rich library of standard functions for manipulation of string
values, numeric values, date and time comparison, node and QName manipulation, sequence
manipulation, Boolean values etc.
• Major part of XSLT − XPath is one of the major elements of the XSLT standard, and knowledge of it
is a must in order to work with XSLT documents.
• W3C recommendation − XPath is an official recommendation of the World Wide Web Consortium
(W3C).

One should keep the following points in mind while working with XPath −

• XPath is a core component of the XSLT standard.
• XSLT cannot work without XPath.
• XPath is the basis of XQuery and XPointer.

An XPath expression generally defines a pattern in order to select a set of nodes. These patterns are
used by XSLT to perform transformations or by XPointer for addressing purposes.

The XPath specification defines seven types of nodes that can be the result of evaluating an XPath
expression.

Root

Element

Text

Attribute

Comment

Processing Instruction

Namespace

XPath uses a path expression to select a node or a list of nodes from an XML document.

The following is a list of useful paths and expressions for selecting a node or a list of nodes from
an XML document.

S.No.  Expression     Description

1      node-name      Selects all nodes with the given name "node-name"
2      /              Selection starts from the root node
3      //             Selects nodes that match the selection no matter where they are, starting from the current node
4      .              Selects the current node
5      ..             Selects the parent of the current node
6      @              Selects attributes
7      student        Example − Selects all nodes with the name "student"
8      class/student  Example − Selects all student elements that are children of class
9      //student      Example − Selects all student elements no matter where they are in the document

students.xml

<?xml version = "1.0"?>

<?xml-stylesheet type = "text/xsl" href = "students.xsl"?>

<class>

<student rollno = "393">

<firstname>Dinkar</firstname>

<lastname>Kad</lastname>

<nickname>Dinkar</nickname>

<marks>85</marks>

</student>

<student rollno = "493">

<firstname>Vaneet</firstname>

<lastname>Gupta</lastname>

<nickname>Vinni</nickname>

<marks>95</marks>
</student>

<student rollno = "593">

<firstname>Jasvir</firstname>

<lastname>Singh</lastname>

<nickname>Jazz</nickname>

<marks>90</marks>

</student>

</class>
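
The path expressions from the table can be tried against this data with Python's standard-library ElementTree module. This is a minimal sketch under two assumptions: the XML is inlined as a string (a copy of students.xml without the declaration and stylesheet instruction), and only the limited XPath subset that ElementTree supports is used. ElementTree does not accept absolute paths such as /class/student on an element, so every path below is relative to the root element.

```python
import xml.etree.ElementTree as ET

STUDENTS_XML = """<class>
  <student rollno="393">
    <firstname>Dinkar</firstname><lastname>Kad</lastname>
    <nickname>Dinkar</nickname><marks>85</marks>
  </student>
  <student rollno="493">
    <firstname>Vaneet</firstname><lastname>Gupta</lastname>
    <nickname>Vinni</nickname><marks>95</marks>
  </student>
  <student rollno="593">
    <firstname>Jasvir</firstname><lastname>Singh</lastname>
    <nickname>Jazz</nickname><marks>90</marks>
  </student>
</class>"""

root = ET.fromstring(STUDENTS_XML)  # root is the <class> element

# "student": child elements of the current node that are named student
roll_numbers = [s.get("rollno") for s in root.findall("student")]

# ".//student": student elements at any depth below the current node
count_anywhere = len(root.findall(".//student"))

# "@": attribute test, here the student whose rollno attribute is 493
by_rollno = root.find("student[@rollno='493']").findtext("firstname")

# Child-value test: the student whose <marks> child contains 90
by_marks = root.find(".//student[marks='90']").findtext("nickname")

# "..": the parent of the first <firstname> element found
parent_tag = root.find(".//firstname/..").tag

print(roll_numbers, count_anywhere, by_rollno, by_marks, parent_tag)
```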

students.xsl

<?xml version = "1.0" encoding = "UTF-8"?>

<xsl:stylesheet version = "1.0"

xmlns:xsl = "http://www.w3.org/1999/XSL/Transform">

<xsl:template match = "/">

<html>

<body>

<h2>Students</h2>

<table border = "1">

<tr bgcolor = "#9acd32">

<th>Roll No</th>

<th>First Name</th>

<th>Last Name</th>

<th>Nick Name</th>

<th>Marks</th>
</tr>

<xsl:for-each select = "class/student">

<tr>

<td> <xsl:value-of select = "@rollno"/></td>

<td><xsl:value-of select = "firstname"/></td>

<td><xsl:value-of select = "lastname"/></td>

<td><xsl:value-of select = "nickname"/></td>

<td><xsl:value-of select = "marks"/></td>

</tr>

</xsl:for-each>

</table>

</body>

</html>

</xsl:template>

</xsl:stylesheet>
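
To make the roles of the path expressions in the stylesheet concrete, the sketch below mimics its logic in plain Python. It is not an XSLT processor, only an approximation under stated assumptions: the function name `students_to_html` is hypothetical, the student data is inlined, and the surrounding html/body markup and the row color are omitted.

```python
import xml.etree.ElementTree as ET
from html import escape

STUDENTS_XML = """<class>
  <student rollno="393"><firstname>Dinkar</firstname><lastname>Kad</lastname>
    <nickname>Dinkar</nickname><marks>85</marks></student>
  <student rollno="493"><firstname>Vaneet</firstname><lastname>Gupta</lastname>
    <nickname>Vinni</nickname><marks>95</marks></student>
  <student rollno="593"><firstname>Jasvir</firstname><lastname>Singh</lastname>
    <nickname>Jazz</nickname><marks>90</marks></student>
</class>"""

HEADERS = ["Roll No", "First Name", "Last Name", "Nick Name", "Marks"]
FIELDS = ["firstname", "lastname", "nickname", "marks"]


def students_to_html(xml_text):
    """Build the same table rows that the stylesheet describes."""
    root = ET.fromstring(xml_text)                 # the <class> element
    rows = []
    for student in root.findall("student"):        # xsl:for-each select="class/student"
        cells = [student.get("rollno", "")]        # xsl:value-of select="@rollno"
        cells += [student.findtext(field, "") for field in FIELDS]
        rows.append("<tr>" + "".join(f"<td>{escape(c)}</td>" for c in cells) + "</tr>")
    header = "<tr>" + "".join(f"<th>{h}</th>" for h in HEADERS) + "</tr>"
    return '<table border="1">' + header + "".join(rows) + "</table>"


html_table = students_to_html(STUDENTS_XML)
print(html_table)
```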

SAX
SAX, the Simple API for XML, is an API for parsing XML documents. It is based on events
generated while the parser reads through the document. Callback methods receive those events,
and a custom handler contains those callback methods.
The API is efficient because it discards each event right after the callback has received it.
SAX therefore needs far less memory than DOM, for example.
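
The event and callback model can be sketched with Python's standard-library `xml.sax` module. The handler class `MarksHandler` and the small inline document are assumptions made for this illustration; the handler keeps only the values it is interested in and never builds a tree.

```python
import xml.sax

XML = """<class>
  <student rollno="393"><firstname>Dinkar</firstname><marks>85</marks></student>
  <student rollno="493"><firstname>Vaneet</firstname><marks>95</marks></student>
  <student rollno="593"><firstname>Jasvir</firstname><marks>90</marks></student>
</class>"""


class MarksHandler(xml.sax.ContentHandler):
    """Custom handler whose callbacks collect the text of every <marks> element."""

    def __init__(self):
        super().__init__()
        self.marks = []
        self._in_marks = False
        self._buffer = []

    def startElement(self, name, attrs):
        # Event: an opening tag was read.
        if name == "marks":
            self._in_marks = True
            self._buffer = []

    def characters(self, content):
        # Event: character data was read; it may arrive in several chunks.
        if self._in_marks:
            self._buffer.append(content)

    def endElement(self, name):
        # Event: a closing tag was read.
        if name == "marks":
            self.marks.append(int("".join(self._buffer)))
            self._in_marks = False


handler = MarksHandler()
xml.sax.parseString(XML.encode("utf-8"), handler)
print(handler.marks)
```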

SAX vs DOM
DOM stands for Document Object Model. The DOM parser does not rely on events. Moreover,
it loads the whole XML document into memory to parse it. SAX is more memory-efficient than
DOM.
DOM has its benefits, too. For example, DOM supports XPath. It also makes it easy to operate
on the whole document tree at once, since the document is loaded into memory.

DOM
The Document Object Model (DOM) is an application programming interface (API) for HTML
and XML documents. It defines the logical structure of documents and the way a document is
accessed and manipulated.

DOM defines the objects, properties, and methods (the interface) used to access all XML
elements. It is separated into three parts, or levels −
• Core DOM − standard model for any structured document
• XML DOM − standard model for XML documents
• HTML DOM − standard model for HTML documents
XML DOM is a standard object model for XML. XML documents have a hierarchy of
informational units called nodes; DOM is a standard programming interface for describing those
nodes and the relationships between them.
XML DOM also provides an API that allows a developer to add, edit, move, or remove nodes
at any point in the tree in order to create an application.

Advantages of XML DOM

The following are the advantages of XML DOM −

• XML DOM is language and platform independent.
• XML DOM is traversable − Information in XML DOM is organized in a hierarchy, which
allows a developer to navigate around the hierarchy looking for specific information.
• XML DOM is modifiable − It is dynamic in nature, giving the developer scope to
add, edit, move, or remove nodes at any point in the tree.
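
Traversing and modifying a DOM tree can be sketched with Python's standard-library `xml.dom.minidom` module. The one-employee document below is a hypothetical sample, and minidom is only one possible DOM implementation; the method names used (`createElement`, `appendChild`, `removeChild`, and so on) are those of the standard DOM interfaces.

```python
from xml.dom.minidom import parseString

# Hypothetical one-employee document, written without whitespace between
# tags so that the tree holds no whitespace-only text nodes.
doc = parseString(
    '<Company><Employee category="technical">'
    "<FirstName>Tanmay</FirstName></Employee></Company>"
)
company = doc.documentElement

# Add: build a new <Employee> subtree and attach it under <Company>.
new_employee = doc.createElement("Employee")
new_employee.setAttribute("category", "non-technical")
first_name = doc.createElement("FirstName")
first_name.appendChild(doc.createTextNode("Taniya"))
new_employee.appendChild(first_name)
company.appendChild(new_employee)

# Traverse: walk down the hierarchy to read every first name.
names = [
    employee.getElementsByTagName("FirstName")[0].firstChild.data
    for employee in company.getElementsByTagName("Employee")
]

# Edit: change an attribute on the first employee.
first_employee = company.firstChild
first_employee.setAttribute("category", "management")

# Move: re-appending an existing node moves it to the end of its parent.
company.appendChild(first_employee)
order_after_move = [e.getAttribute("category") for e in company.childNodes]

# Remove: detach the moved node from the tree.
company.removeChild(first_employee)
remaining = [e.getAttribute("category") for e in company.childNodes]

print(names, order_after_move, remaining)
```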

Disadvantages of XML DOM

• It consumes more memory (if the XML structure is large), because the document, once loaded,
remains in memory all the time until it is explicitly removed.
• Due to this extensive use of memory, its operational speed is slower than that of SAX.
Now that we know what DOM means, let's see what a DOM structure is. A DOM document is a
collection of nodes, or pieces of information, organized in a hierarchy. Some types
of nodes may have child nodes of various types, and others are leaf nodes that cannot have
anything under them in the document structure. Following is a list of the node types, with a
list of node types that they may have as children −
• Document − Element (maximum of one), ProcessingInstruction, Comment,
DocumentType (maximum of one)
• DocumentFragment − Element, ProcessingInstruction, Comment, Text, CDATASection,
EntityReference
• EntityReference − Element, ProcessingInstruction, Comment, Text, CDATASection,
EntityReference
• Element − Element, Text, Comment, ProcessingInstruction, CDATASection,
EntityReference
• Attr − Text, EntityReference
• ProcessingInstruction − No children
• Comment − No children
• Text − No children
• CDATASection − No children
• Entity − Element, ProcessingInstruction, Comment, Text, CDATASection,
EntityReference
• Notation − No children

Example

Consider the DOM representation of the following XML document node.xml.

<?xml version = "1.0"?>


<Company>
<Employee category = "technical">
<FirstName>Tanmay</FirstName>
<LastName>Patil</LastName>
<ContactNo>1234567890</ContactNo>
</Employee>

<Employee category = "non-technical">


<FirstName>Taniya</FirstName>
<LastName>Mishra</LastName>
<ContactNo>1234667898</ContactNo>
</Employee>
</Company>

The most common types of nodes in XML are −

• Document Node − The complete XML document structure is a document node.
• Element Node − Every XML element is an element node. This is also the only type of
node that can have attributes.
• Attribute Node − Each attribute is considered an attribute node. It contains information
about an element node, but is not actually considered to be a child of the element.
• Text Node − The text in the document is considered a text node. It can consist of
actual content or just white space.
Some less common types of nodes are −
• CDATA Node − This node contains information that should not be analyzed by the
parser. Instead, it should just be passed on as plain text.
• Comment Node − This node includes information about the data, and is usually ignored
by the application.
• Processing Instruction Node − This node contains information specifically aimed at the
application.
• Document Fragment Node
• Entity Node
• Entity Reference Node
• Notation Node
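
To see several of these node types side by side, the sketch below walks a small document with Python's standard-library `xml.dom.minidom` and records the type of every node it meets. The sample document, the processing instruction named `render`, and the helper `node_kinds` are assumptions made for illustration. Attributes are visited separately because, as noted above, they are not children of their element.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

XML = """<?xml version="1.0"?>
<!-- staff list -->
<?render mode="compact"?>
<Company>
  <Employee category="technical">Tanmay<![CDATA[ <raw & unparsed> ]]></Employee>
</Company>"""

NAMES = {
    Node.DOCUMENT_NODE: "Document",
    Node.ELEMENT_NODE: "Element",
    Node.ATTRIBUTE_NODE: "Attribute",
    Node.TEXT_NODE: "Text",
    Node.CDATA_SECTION_NODE: "CDATASection",
    Node.COMMENT_NODE: "Comment",
    Node.PROCESSING_INSTRUCTION_NODE: "ProcessingInstruction",
}


def node_kinds(node, found=None):
    """Return the type name of node and of everything below it, in document order."""
    found = [] if found is None else found
    found.append(NAMES.get(node.nodeType, "Other"))
    # Attributes hang off the element but are not in childNodes.
    if node.nodeType == Node.ELEMENT_NODE:
        for i in range(node.attributes.length):
            found.append(NAMES[node.attributes.item(i).nodeType])
    for child in node.childNodes:
        node_kinds(child, found)
    return found


kinds = node_kinds(parseString(XML))
print(kinds)
```

The white space between the tags of Company shows up as text nodes, which illustrates the remark that a text node may hold nothing but white space.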

DOM as an API contains interfaces that represent the different types of information that can be
found in an XML document, such as elements and text. These interfaces include the methods
and properties necessary to work with these objects. Properties define the characteristics of
a node, whereas methods provide the means to manipulate it.
The following table lists the DOM classes and interfaces −

S.No.  Interface & Description

1      DOMImplementation
       It provides a number of methods for performing operations that are independent of any
       particular instance of the document object model.

2      DocumentFragment
       It is a "lightweight" or "minimal" document object that can hold a portion of a
       document's tree. Derives from Node.

3      Document
       It represents the XML document's top-level node, which provides access to all the nodes in
       the document, including the root element.

4      Node
       It represents an XML node.

5      NodeList
       It represents a read-only list of Node objects.

6      NamedNodeMap
       It represents collections of nodes that can be accessed by name.

7      CharacterData
       It extends Node with a set of attributes and methods for accessing character data in the
       DOM.

8      Attr
       It represents an attribute in an Element object.

9      Element
       It represents the element node. Derives from Node.

10     Text
       It represents the text node. Derives from CharacterData.

11     Comment
       It represents the comment node. Derives from CharacterData.

12     ProcessingInstruction
       It represents a "processing instruction". It is used in XML as a way to keep processor-
       specific information in the text of the document.

13     CDATASection
       It represents the CDATA section. Derives from Text.

14     Entity
       It represents an entity. Derives from Node.

15     EntityReference
       It represents an entity reference in the tree. Derives from Node.
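
Several of these interfaces can be seen working together in a short sketch that builds a document from scratch with Python's standard-library `xml.dom.minidom`. The element names reuse those of node.xml, and the comment text is made up for the example. minidom is one implementation of these interfaces; other DOM libraries expose the same names with their own conventions.

```python
from xml.dom.minidom import getDOMImplementation

impl = getDOMImplementation()                     # DOMImplementation
doc = impl.createDocument(None, "Company", None)  # Document with root <Company>
root = doc.documentElement                        # Element

employee = doc.createElement("Employee")          # Element
employee.setAttribute("category", "technical")    # stored as an Attr
first = doc.createElement("FirstName")
first.appendChild(doc.createTextNode("Tanmay"))   # Text (a kind of CharacterData)
employee.appendChild(first)
employee.appendChild(doc.createComment("contact details omitted"))  # Comment
root.appendChild(employee)

employees = doc.getElementsByTagName("Employee")  # NodeList
attributes = employees.item(0).attributes         # NamedNodeMap
category = attributes.getNamedItem("category").value
first_name = first.firstChild.data                # character data of the Text node

print(employees.length, category, first_name)
```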

Parser

A parser is a software application that is designed to analyze a document, in our case an XML
document, and do something specific with the information. Some of the DOM-based parsers
are listed in the following table −

S.No.  Parser & Description

1      JAXP
       Sun Microsystems' Java API for XML Parsing (JAXP)

2      XML4J
       IBM's XML Parser for Java (XML4J)

3      msxml
       Microsoft's XML parser (msxml) version 2.0 is built into Internet Explorer 5.5

4      4DOM
       4DOM is a parser for the Python programming language

5      XML::DOM
       XML::DOM is a Perl module to manipulate XML documents using Perl

6      Xerces
       Apache's Xerces Java Parser
