Chapter 3-The Client Tier
XML
Extensible Markup Language (XML) is used to describe data. The XML standard is a flexible way
to create information formats and electronically share structured data via the public Internet,
as well as via corporate networks. Structured information contains both content (words,
pictures, etc.) and some indication of what role that content plays (for example, content in a
section heading has a different meaning from content in a footnote, which means something
different than content in a figure caption or content in a database table, etc.). Almost all
documents have some structure.
A markup language is a mechanism to identify structures in a document. The XML specification
defines a standard way to add markup to documents. The basic building block of an XML
document is an element, defined by tags. An element has a beginning and an ending tag. All
elements in an XML document are contained in an outermost element known as the root
element. XML can also support nested elements, or elements within elements. This ability
allows XML to support hierarchical structures. Element names describe the content of the
element, and the structure describes the relationship between the elements.
For example:
<?xml version="1.0" standalone="yes"?>
<conversation>
<greeting>Hello, world!</greeting>
<response>Thank you world</response>
</conversation>
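The element structure described above can be inspected with a parser. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; it is one convenient way to look at the root element and its children, not the only one.

```python
import xml.etree.ElementTree as ET

# The conversation document from the example above.
DOC = """<?xml version="1.0" standalone="yes"?>
<conversation>
  <greeting>Hello, world!</greeting>
  <response>Thank you world</response>
</conversation>"""

root = ET.fromstring(DOC)  # the outermost (root) element

# Iterating over an element yields its direct child elements.
children = [(child.tag, child.text) for child in root]

print(root.tag)
for tag, text in children:
    print(f"  {tag}: {text}")
```

Here conversation is the root element, and greeting and response are nested inside it.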
XML Usage
A short list of common uses shows how widely XML applies:
-XML can work behind the scenes to simplify the creation of HTML documents for large web
sites.
-XML can be used to exchange information between organizations and systems.
-XML can be used for offloading and reloading databases.
-XML can be used to store and arrange data in a way that suits your data-handling needs.
-XML can easily be merged with style sheets to create almost any desired output.
-Virtually any type of data can be expressed as an XML document.
XML attribute values must be quoted: XML elements can have attributes in name/value pairs, just like
in HTML. In XML, attribute values must always be enclosed in single or double quotes.
<note date="12/11/2007">
<to>Tove</to>
<from>Jani</from>
</note>
XML Attributes
An attribute specifies a single property for the element, using a name/value pair. An
XML element can have one or more attributes. For example:
<item dept="WMN" num="557" quantity="1" color="navy"/>
<a href="http://www.ntc.net/">NTC</a>
Syntax Rules for XML Attributes
Attribute names in XML (unlike HTML) are case sensitive. That is, HREF and href are considered
two different XML attributes.
The same attribute cannot appear more than once in a single start tag, so an attribute cannot carry two values.
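These attribute rules can be checked with any XML parser. The sketch below assumes Python's standard xml.etree.ElementTree module; the HREF attribute and the helper name is_well_formed are made up for illustration.

```python
import xml.etree.ElementTree as ET

# href and HREF differ only in case, so XML treats them as two attributes.
# (The HREF attribute is a hypothetical addition for this demonstration.)
link = ET.fromstring('<a href="http://www.ntc.net/" HREF="other">NTC</a>')
print(sorted(link.attrib))


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed('<item num="557"/>'))            # quoted, single value
print(is_well_formed('<item num="557" num="558"/>'))  # same attribute twice
print(is_well_formed('<item num=557/>'))              # value not quoted
```

The parser rejects both the repeated attribute and the unquoted value, because each breaks a well-formedness rule.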
XML Comments
XML comments are similar to HTML comments. They are added as notes that explain the purpose of
the XML code.
Comments can be used to include related links, information, and terms. They are visible only in the
source; a parser does not treat them as part of the document's data. A comment may appear anywhere
after the XML declaration, except inside a tag or inside another comment, and its text must not
contain two consecutive hyphens (--).
<?xml version = "1.0" encoding = "UTF-8" ?>
<!--Students grades are uploaded by months-->
<class_list>
<student>
<name>Tanmay</name>
<grade>A</grade>
</student>
</class_list>
Tree structure
The tree structure is often referred to as the XML tree, and it makes the organization of any
XML document easy to describe.
The tree structure contains a root element, its child elements, their children, and so on. By
following the tree, you can reach every branch and sub-branch starting from the
root. Parsing starts at the root, then moves down the first branch to an element, takes the
first branch from there, and so on to the leaf nodes.
<?xml version = "1.0"?>
<Company>
<Employee>
<FirstName>Ram</FirstName>
<LastName>Nepal</LastName>
<ContactNo>1234567890</ContactNo>
<Email>[email protected]</Email>
<Address>
<City>Kathmandu</City>
<State>2</State>
<Zip>44600</Zip>
</Address>
</Employee>
</Company>
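The root-to-leaf traversal described above can be sketched as a depth-first walk. This is a minimal illustration, assuming Python's standard xml.etree.ElementTree module and a shortened copy of the Company document; the helper name walk is made up.

```python
import xml.etree.ElementTree as ET

# A shortened copy of the Company document shown above.
DOC = """<Company>
  <Employee>
    <FirstName>Ram</FirstName>
    <LastName>Nepal</LastName>
    <Address>
      <City>Kathmandu</City>
      <Zip>44600</Zip>
    </Address>
  </Employee>
</Company>"""


def walk(element, depth=0):
    """Yield (depth, tag) pairs in document order (depth-first, parent before children)."""
    yield depth, element.tag
    for child in element:
        yield from walk(child, depth + 1)


outline = list(walk(ET.fromstring(DOC)))
for depth, tag in outline:
    print("  " * depth + tag)
```

Each level of indentation in the printed outline corresponds to one level of nesting in the tree.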
DTD
The XML Document Type Definition, commonly known as a DTD, is a way to describe an XML
language precisely. A DTD lets you create rules for the elements within your XML
documents. A DTD can either be specified inside the document, or be kept in a
separate file and then linked from the document.
If you create your own XML elements, attributes, and/or entities, you should also create a DTD for them.
Syntax
A DTD is attached to a document with a DOCTYPE declaration, which has the following general form:
<!DOCTYPE root_element [
declarations
]>
Internal DTD
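A DTD is referred to as an internal DTD if the elements are declared within the XML file itself. A
document that relies only on an internal DTD can set the standalone attribute in the XML declaration
to yes, which states that the document does not depend on declarations from an external source.
Syntax
Following is the syntax of internal DTD −
<!DOCTYPE root_element [element_declarations]>
For example, the declarations for an address document that contains a name, a company, and a
phone number sit between the XML declaration and the root element:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE address [
<!ELEMENT address (name, company, phone)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT company (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
]>
<address>
<name>Ram Nepal</name>
<company>Incode</company>
<phone>0123456</phone>
</address>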
External DTD
In an external DTD, elements are declared outside the XML file. The DTD is accessed by specifying
a system identifier, which may be either a local .dtd file or a valid URL. A document that relies on
an external DTD sets the standalone attribute in the XML declaration to no, which states that the
declaration includes information from an external source.
Syntax
Following is the syntax for external DTD −
<!DOCTYPE root_element SYSTEM "file_name">
Here, file_name is the file with the .dtd extension.
System Identifiers
A system identifier enables you to specify the location of an external file containing DTD
declarations. Syntax is as follows −
<!DOCTYPE root_element SYSTEM "uri_of_dtd_file">
Public Identifiers
Public identifiers provide a mechanism to locate DTD resources and are written as follows −
<!DOCTYPE root_element PUBLIC "formal_public_identifier" "uri_of_dtd_file">
Combined DTD
We can use both an internal DTD and an external one at the same time. This is useful if
we need to adhere to a common DTD but also need to add our own definitions locally.
The following form uses both an external DTD and an internal one for the same XML document. The external
DTD resides in tutorials.dtd and is referenced first in the DOCTYPE declaration. The internal DTD
follows the external one but still resides within the DOCTYPE declaration:
<!DOCTYPE root_element SYSTEM "tutorials.dtd" [
internal_declarations
]>
When you use the PUBLIC keyword, you also need to use an FPI (which stands for Formal Public
Identifier).
FPI Syntax
An FPI is made up of 4 fields, each separated by double forward slashes (//):
registration//owner//class description//language
For example, the FPI of the XHTML 1.0 Strict DTD is -//W3C//DTD XHTML 1.0 Strict//EN.
DTD Elements
Creating a DTD is quite straight forward. It's really just a matter of defining your elements,
attributes, and/or entities.
To define an element in DTD, you use the <!ELEMENT> declaration. The actual contents of
your <!ELEMENT> declaration will depend on the syntax rules you need to apply to your
element.
Basic Syntax
The <!ELEMENT> declaration has the following syntax:
<!ELEMENT element_name content_model>
Here, element_name is the name of the element you're defining. The content model could
indicate a specific rule, data or another element.
The following examples show you how to use this syntax for defining your elements.
Plain Text
If an element should contain plain text, you define the element using #PCDATA. PCDATA stands
for Parsed Character Data and is the way you specify non-markup text in your DTDs.
Using this example - <name>XML Tutorial</name> — the XML Tutorial part is the PCDATA. The
other part consists of markup.
Syntax:
<!ELEMENT element_name (#PCDATA)>
Example:
<!ELEMENT name (#PCDATA)>
The above line in your DTD allows the name element to contain non-markup data in your XML
document:
<name>XML Tutorial</name>
Unrestricted Elements
If it doesn't matter what your element contains, you can create an element using the
content_model of ANY. Note that doing this removes all syntax checking, so you should avoid
using this if possible. You're better off defining a specific content model.
Syntax:
<!ELEMENT element_name ANY>
Example:
<!ELEMENT tutorials ANY>
Empty Elements
An empty element is one that has no content, so it can be written as a single self-closing tag. For example,
in XHTML, the <br /> and <img /> tags are empty elements. Here's how you define an empty
element:
Syntax:
<!ELEMENT element_name EMPTY>
Example:
<!ELEMENT header EMPTY>
The above line in your DTD defines the following empty element for your XML document:
<header />
Child Elements
You can specify that an element must contain another element, by providing the name of the
element it must contain. Here's how you do that:
Syntax:
<!ELEMENT element_name (child_element_name)>
Example:
<!ELEMENT tutorials (tutorial)>
The above line in your DTD allows the tutorials element to contain one instance of the tutorial
element in your XML document:
<tutorials>
<tutorial></tutorial>
</tutorials>
You can also provide a comma separated list of elements if it needs to contain more than one
element. This is referred to as a sequence. The XML document must contain the tags in the
same order that they're specified in the sequence.
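Syntax:
<!ELEMENT element_name (child_element_name, child_element_name, ...)>
Example:
<!ELEMENT tutorial (name, url)>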
The above line in your DTD allows the tutorial element to contain one instance of the name
element and one instance of the url element in your XML document:
<tutorials>
<tutorial>
<name></name>
<url></url>
</tutorial>
</tutorials>
This is fine if there only needs to be one instance of tutorial, but what if we don't want a limit? What
if the tutorials element should be able to contain any number of tutorial instances? Fortunately,
we can do that using DTD operators.
Here's a list of operators/syntax rules we can use when defining child elements:
Operator   Syntax         Description
+          a+             One or more occurrences of a
*          a*             Zero or more occurrences of a
?          a?             Either a or nothing
,          a, b           a followed by b
|          a|b            a or b
()         (expression)   An expression treated as a unit
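Zero or More
To allow zero or more of the same child element, use an asterisk (*):
Syntax:
<!ELEMENT element_name (child_element_name*)>
Example:
<!ELEMENT tutorials (tutorial*)>
One or More
To allow one or more of the same child element, use a plus sign (+):
Syntax:
<!ELEMENT element_name (child_element_name+)>
Example:
<!ELEMENT tutorials (tutorial+)>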
Zero or One
To allow either zero or one of the same child element, use a question mark (?):
Syntax:
<!ELEMENT element_name (child_element_name?)>
Example:
<!ELEMENT tutorials (tutorial?)>
Choices
You can define a choice between one or another element by using the pipe (|) operator. For
example, if the tutorial element requires a child called either name, title, or subject (but only
one of these), you can do the following:
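Syntax:
<!ELEMENT element_name (choice_1 | choice_2 | choice_3)>
Example:
<!ELEMENT tutorial (name | title | subject)>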
Mixed Content
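You can combine the pipe (|) operator with the asterisk (*) to specify that an element can contain
both PCDATA and other elements. #PCDATA must come first in the list:
Syntax:
<!ELEMENT element_name (#PCDATA | child_element_name)*>
Example:
<!ELEMENT tutorial (#PCDATA | name | title | subject)*>
DTD Operators with Sequences
You can apply any of the DTD operators to the items of a sequence:
Syntax:
<!ELEMENT element_name (child_element_name dtd_operator, child_element_name dtd_operator, ...)>
Example:
<!ELEMENT tutorial (name+, url?)>
The above example allows the tutorial element to contain one or more instances of the name
element, followed by zero or one instance of the url element.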
Subsequences
You can use parentheses to create a subsequence (i.e. a sequence within a sequence). This
enables you to apply DTD operators to a subsequence:
Syntax:
<!ELEMENT element_name ((sequence) dtd_operator)>
Example:
<!ELEMENT tutorial ((author, rating?)+)>
The above example specifies that the tutorial element can contain one or more author
elements, with each occurrence having an optional rating element.
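The DTD operators behave much like regular-expression operators, which suggests a way to experiment with content models. The sketch below is not a DTD validator; it is a simplified illustration in Python that translates an element-only content model into a regular expression and tests a list of child element names against it. The function names are made up, and mixed content (#PCDATA), ANY, and EMPTY are not handled.

```python
import re


def content_model_to_regex(model):
    """Translate an element-only DTD content model into a compiled regex (sketch)."""
    pattern = re.sub(r"\s+", "", model)  # whitespace is not significant
    # Each element name becomes a group that matches "name;" exactly once.
    pattern = re.sub(
        r"[A-Za-z_][\w.\-]*",
        lambda m: "(?:" + re.escape(m.group()) + ";)",
        pattern,
    )
    # A comma means "followed by", which is plain concatenation in a regex.
    # The operators *, +, ?, | and () already mean the same thing in both notations.
    return re.compile(pattern.replace(",", ""))


def children_match(model, child_names):
    """Return True if the sequence of child element names fits the content model."""
    sequence = "".join(name + ";" for name in child_names)
    return content_model_to_regex(model).fullmatch(sequence) is not None


print(children_match("(name+, url?)", ["name", "name", "url"]))
print(children_match("(name+, url?)", ["url", "name"]))
print(children_match("((author, rating?)+)", ["author", "rating", "author"]))
print(children_match("(name | title | subject)", ["name", "title"]))
```

The semicolon after each name only keeps names apart in the joined string, so that name cannot accidentally match the start of names.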
DTD Attributes
Just as we need to define all elements in a DTD, we also need to define any attributes they
use. We use the <!ATTLIST> declaration to define attributes in a DTD.
Syntax
You use a single <!ATTLIST> declaration to declare all attributes for a given element. In other
words, for each element (that contains attributes), you only need one <!ATTLIST> declaration.
<!ATTLIST element_name
attribute_name TYPE DEFAULT_VALUE
attribute_name TYPE DEFAULT_VALUE
attribute_name TYPE DEFAULT_VALUE
...>
Here, element_name refers to the element that you're defining attributes for, attribute_name
is the name of the attribute that you're declaring, TYPE is the attribute type, and
DEFAULT_VALUE is its default value.
Example
<!ATTLIST tutorial
published CDATA "No">
Here, we are defining an attribute called published for the tutorial element. The attribute's type
is CDATA and its default value is No.
The example above gives the attribute a default value of No. The following sections look at the various
options for defining default values for your attributes.
Default Values
The attribute DEFAULT_VALUE field can be set to one of the following values:
Value            Description
"value"          A simple text value, enclosed in quotes.
#IMPLIED         There is no default value for this attribute, and the attribute is optional.
#REQUIRED        There is no default value for this attribute, but a value must be assigned.
#FIXED "value"   The attribute must always have the value provided. The #FIXED part marks the
                 value as fixed; the "value" part is the actual value.
Examples of these default values follow.
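value
You can provide an actual value to be the default value by placing it in quotes.
Syntax:
<!ATTLIST element_name
attribute_name CDATA "default_value">
Example:
<!ATTLIST tutorial
published CDATA "No">
#REQUIRED
The #REQUIRED keyword specifies that you won't be providing a default value, but that you
require that anyone using this DTD does provide one.
Syntax:
<!ATTLIST element_name
attribute_name CDATA #REQUIRED>
Example:
<!ATTLIST tutorial
published CDATA #REQUIRED>
#IMPLIED
The #IMPLIED keyword specifies that you won't be providing a default value, and that the
attribute is optional for users of this DTD.
Syntax:
<!ATTLIST element_name
attribute_name CDATA #IMPLIED>
Example:
<!ATTLIST tutorial
published CDATA #IMPLIED>
#FIXED
The #FIXED keyword specifies that you provide a value, and that it is the only value that can be
used by users of this DTD.
Syntax:
<!ATTLIST element_name
attribute_name CDATA #FIXED "value">
Example:
<!ATTLIST tutorial
published CDATA #FIXED "No">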
So far, all our examples for declaring attributes have used the CDATA attribute type. CDATA is probably
the most common attribute type, as it allows plain text to be used for the attribute's value. There
may, however, be cases where you need to use a different attribute type.
When setting attributes for your elements, the attribute TYPE field can be set to one of the following
values:
Type         Description
CDATA        Character data (text that doesn't contain markup).
ENTITY       The name of an unparsed entity (which must be declared in the DTD).
ENTITIES     A list of entity names, separated by whitespace. (All entities must be declared in the DTD.)
Enumerated   A list of values. The value of the attribute must be one from this list.
ID           A unique ID or name. Must be a valid XML name.
IDREF        The value of an ID attribute of another element.
IDREFS       Multiple IDs of other elements, separated by whitespace.
NMTOKEN      A name token: a string of XML name characters, with no restriction on the first character.
NMTOKENS     A list of name tokens, separated by whitespace.
NOTATION     A notation name (which must be declared in the DTD).
CDATA
As with all attribute types, the attribute type of CDATA is placed after the attribute name and
before the default value.
Syntax:
<!ATTLIST element_name
attribute_name CDATA default_value>
Example:
<!ATTLIST mountain
country CDATA "New Zealand">
Valid XML - The following XML document would be valid, as it conforms to the above DTD:
<mountains>
<mountain country="New Zealand">
<name>Mount Cook</name>
</mountain>
<mountain country="Australia">
<name>Cradle Mountain</name>
</mountain>
</mountains>
ENTITY
The attribute type of ENTITY is used for referring to the name of an entity you've declared in your DTD.
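Syntax:
<!ATTLIST element_name
attribute_name ENTITY default_value>
Example:
<!ATTLIST mountain
photo ENTITY #IMPLIED>
Valid XML - The following XML document would be valid, provided that mt_cook_1 is declared as an
unparsed entity in the DTD: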
<mountains>
<mountain photo="mt_cook_1">
<name>Mount Cook</name>
</mountain>
<mountain>
<name>Cradle Mountain</name>
</mountain>
</mountains>
Invalid XML - The following XML document would be invalid. This is because the photo attribute of the
second element contains a value that hasn't been declared as an entity:
<mountains>
<mountain photo="mt_cook_1">
<name>Mount Cook</name>
</mountain>
<mountain photo="None">
<name>Cradle Mountain</name>
</mountain>
</mountains>
ENTITIES
The attribute type of ENTITIES allows you to refer to multiple entity names, separated by a
space.
Syntax:
<!ATTLIST element_name
attribute_name ENTITIES default_value>
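Example:
<!ATTLIST mountain
photo ENTITIES #IMPLIED>
Valid XML - The following XML document would be valid, as it conforms to the above DTD (assuming
mt_cook_1 and mt_cook_2 are declared as unparsed entities):
<mountains>
<mountain photo="mt_cook_1 mt_cook_2">
<name>Mount Cook</name>
</mountain>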
<mountain>
<name>Cradle Mountain</name>
</mountain>
</mountains>
Invalid XML - The following XML document would be invalid. This is because in the first
element, a comma is being used to separate the two values of the photo attribute (a space
should be separating the two values):
<mountains>
<mountain photo="mt_cook_1,mt_cook_2">
<name>Mount Cook</name>
</mountain>
<mountain>
<name>Cradle Mountain</name>
</mountain>
</mountains>
Enumerated
The enumerated attribute type provides for a list of possible values. This enables the DTD user
to provide one value from the list of possible values.
The values must be surrounded by parentheses, and each value must be separated by a pipe
(|).
Syntax:
<!ATTLIST element_name
attribute_name (value_1 | value_2 | ...) default_value>
Example:
<!ATTLIST tutorial
published (yes | no) "no">
Valid XML - The following XML document would be valid, as it conforms to the above DTD:
<tutorials>
<tutorial published="yes">
<name>XML Tutorial</name>
</tutorial>
<tutorial published="no">
<name>HTML Tutorial</name>
</tutorial>
<tutorial>
<name>CSS Tutorial</name>
</tutorial>
</tutorials>
Invalid XML - The following XML document would be invalid because the value of the first
attribute does not match one of the options of the ATTLIST declaration:
<tutorials>
<tutorial published="true">
<name>XML Tutorial</name>
</tutorial>
<tutorial published="no">
<name>HTML Tutorial</name>
</tutorial>
<tutorial>
<name>CSS Tutorial</name>
</tutorial>
</tutorials>
ID
The attribute type of ID is used specifically to identify elements. Because of this, no two
elements can contain the same value for attributes of type ID. Also, you
can only give an element one attribute of type ID. The value that is assigned to an attribute of
type ID must be a valid XML name.
Syntax:
<!ATTLIST element_name
attribute_name ID default_value>
Example:
<!ATTLIST mountain
mountain_id ID #REQUIRED>
Valid XML - The following XML document would be valid, as it conforms to the above DTD:
<mountains>
<mountain mountain_id="m10001">
<name>Mount Cook</name>
</mountain>
<mountain mountain_id="m10002">
<name>Cradle Mountain</name>
</mountain>
</mountains>
Invalid XML - The following XML document would be invalid because the value of the
mountain_id attribute is the same for both elements:
<mountains>
<mountain mountain_id="m10001">
<name>Mount Cook</name>
</mountain>
<mountain mountain_id="m10001">
<name>Cradle Mountain</name>
</mountain>
</mountains>
IDREF
The attribute type of IDREF is used for referring to an ID value of another element in the
document.
Syntax:
<!ATTLIST element_name
attribute_name IDREF default_value>
Example:
<!ATTLIST employee
employee_id ID #REQUIRED
manager_id IDREF #IMPLIED>
Valid XML - The following XML document would be valid, as it conforms to the above DTD:
<employees>
<employee employee_id="e10001" manager_id="e10002">
<first_name>Homer</first_name>
<last_name>Flinstone</last_name>
</employee>
<employee employee_id="e10002">
<first_name>Fred</first_name>
<last_name>Burns</last_name>
</employee>
</employees>
Invalid XML - The following XML document would be invalid. This is because the manager_id
attribute of the second element contains a value that doesn't match the ID value of any
element in the document:
<employees>
<employee employee_id="e10001" manager_id="e10002">
<first_name>Homer</first_name>
<last_name>Flinstone</last_name>
</employee>
<employee employee_id="e10002" manager_id="e10003">
<first_name>Fred</first_name>
<last_name>Burns</last_name>
</employee>
</employees>
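The ID and IDREF rules can be sketched as two passes over the document: collect every ID value, then confirm that every reference points at one of them. The code below is only an illustration in Python using the standard xml.etree.ElementTree module. That parser does not read attribute types from a DTD, so the attribute names are passed in explicitly; the function name check_ids and the sample documents are assumptions.

```python
import xml.etree.ElementTree as ET


def check_ids(root, id_attr, idref_attr):
    """Return a list of problems with ID uniqueness and IDREF targets."""
    problems = []
    seen = set()
    # First pass: every ID value must be unique within the document.
    for element in root.iter():
        value = element.get(id_attr)
        if value is None:
            continue
        if value in seen:
            problems.append(f"duplicate ID {value}")
        seen.add(value)
    # Second pass: every IDREF must match an ID found in the first pass.
    for element in root.iter():
        target = element.get(idref_attr)
        if target is not None and target not in seen:
            problems.append(f"IDREF {target} matches no ID")
    return problems


# Illustrative documents: the second one refers to an ID that does not exist.
VALID = """<employees>
  <employee employee_id="e10001" manager_id="e10002"/>
  <employee employee_id="e10002"/>
</employees>"""
INVALID = """<employees>
  <employee employee_id="e10001" manager_id="e10002"/>
  <employee employee_id="e10002" manager_id="e10003"/>
</employees>"""

print(check_ids(ET.fromstring(VALID), "employee_id", "manager_id"))
print(check_ids(ET.fromstring(INVALID), "employee_id", "manager_id"))
```

A validating parser performs both checks automatically once the DTD declares the attribute types.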
IDREFS
The attribute type of IDREFS is used for referring to the ID values of more than one other
element in the document. Each value is separated by a space.
Syntax:
<!ATTLIST element_name
attribute_name IDREFS default_value>
Example:
<!ATTLIST individual
individual_id ID #REQUIRED
parent_id IDREFS #IMPLIED>
Valid XML - The following XML document would be valid, as it conforms to the above DTD:
<individuals>
<individual individual_id="e10001" parent_id="e10002 e10003">
<first_name>Bart</first_name>
<last_name>Simpson</last_name>
</individual>
<individual individual_id="e10002">
<first_name>Homer</first_name>
<last_name>Simpson</last_name>
</individual>
<individual individual_id="e10003">
<first_name>Marge</first_name>
<last_name>Simpson</last_name>
</individual>
</individuals>
NMTOKEN
An NMTOKEN (name token) is any mixture of Name characters. It cannot contain whitespace
(although leading or trailing whitespace will be trimmed/ignored).
While Names have restrictions on the initial character (the first character of a Name cannot
include digits, diacritics, the full stop and the hyphen), the NMTOKEN doesn't have these
restrictions.
Syntax:
<!ATTLIST element_name
attribute_name NMTOKEN default_value>
Example:
<!ATTLIST mountain
country NMTOKEN #REQUIRED>
Valid XML - The following XML document would be valid, as it conforms to the above DTD:
<mountains>
<mountain country="NZ">
<name>Mount Cook</name>
</mountain>
<mountain country="AU">
<name>Cradle Mountain</name>
</mountain>
</mountains>
Invalid XML - The following XML document would be invalid because the value of the first
country attribute contains internal whitespace:
<mountains>
<mountain country="New Zealand">
<name>Mount Cook</name>
</mountain>
<mountain country="Australia">
<name>Cradle Mountain</name>
</mountain>
</mountains>
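The difference between a Name and an NMTOKEN can be sketched with two regular expressions. This is a deliberately simplified illustration in Python: it assumes ASCII-only values, whereas the XML specification allows a much larger set of Unicode name characters. The function names are made up.

```python
import re

# ASCII-only simplifications of the XML Name and Nmtoken productions.
NAME = re.compile(r"[A-Za-z_:][A-Za-z0-9_:.\-]*")  # restricted first character
NMTOKEN = re.compile(r"[A-Za-z0-9_:.\-]+")         # any name character may come first


def is_name(value):
    """Return True if value is a Name under the ASCII simplification."""
    return NAME.fullmatch(value) is not None


def is_nmtoken(value):
    """Return True if value is a name token; surrounding whitespace is ignored."""
    return NMTOKEN.fullmatch(value.strip()) is not None


print(is_nmtoken("NZ"))           # a plain token
print(is_nmtoken("New Zealand"))  # internal whitespace is not allowed
print(is_nmtoken("2nd-peak"))     # a token may start with a digit
print(is_name("2nd-peak"))        # a Name may not
```

A value such as 2nd-peak is therefore acceptable for an NMTOKEN attribute but not for an ID attribute, which requires a Name.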
NMTOKENS
The attribute type of NMTOKENS allows the attribute value to be made up of multiple
NMTOKENs, separated by a space.
Syntax:
<!ATTLIST element_name
attribute_name NMTOKENS default_value>
Example:
<!ATTLIST mountains
country NMTOKENS #REQUIRED>
Valid XML - The following XML document would be valid, as it conforms to the above DTD:
<mountains country="NZ AU">
<mountain>
<name>Mount Cook</name>
</mountain>
<mountain>
<name>Cradle Mountain</name>
</mountain>
</mountains>
NOTATION
The attribute type of NOTATION allows you to use a value that has been declared as a notation
in the DTD.
A notation is used to specify the format of non-XML data. A common use of notations is to
describe MIME types such as image/gif, image/jpeg etc.
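Syntax:
To declare a notation:
<!NOTATION notation_name SYSTEM "external_id">
To declare an attribute of type NOTATION:
<!ATTLIST element_name
attribute_name NOTATION (notation_name | notation_name | ...) default_value>
Example:
<!NOTATION GIF SYSTEM "image/gif">
<!NOTATION JPG SYSTEM "image/jpeg">
<!NOTATION PNG SYSTEM "image/png">
<!ATTLIST mountain
photo_type NOTATION (GIF | JPG | PNG) #IMPLIED>
In the DTD, we have specified that the value of the photo_type attribute can be one of the
three values supplied. The following XML document would be valid, as it conforms to the above
DTD:
<mountains>
<mountain photo_type="JPG">
<name>Mount Cook</name>
</mountain>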
<mountain>
<name>Cradle Mountain</name>
</mountain>
</mountains>
XML Schema
XML Schema is an XML-based language used to create XML-based languages and data models.
An XML schema defines element and attribute names for a class of XML documents. The
schema also specifies the structure that those documents must adhere to and the type of
content that each element can hold.
XML documents that attempt to adhere to an XML schema are said to be instances of that
schema. If they correctly adhere to the schema, then they are valid instances. This is not the
same as being well formed. A well-formed XML document follows all the syntax rules of XML,
but it does not necessarily adhere to any particular schema. So, an XML document can be well
formed without being valid, but it cannot be valid unless it is well formed.
As a means of understanding the power of XML Schema, let's look at the limitations of DTD.
-DTDs allow only limited control over cardinality (the number of occurrences of an element
within its parent).
-DTDs do not support Namespaces or any simple way of reusing or importing other schemas.
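Syntax
A schema is declared with an xs:schema root element bound to the XML Schema namespace:
<xs:schema xmlns:xs = "http://www.w3.org/2001/XMLSchema">
Example
<?xml version = "1.0" encoding = "UTF-8"?>
<xs:schema xmlns:xs = "http://www.w3.org/2001/XMLSchema">
<xs:element name = "contact">
<xs:complexType>
<xs:sequence>
<xs:element name = "name" type = "xs:string" />
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>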
Elements
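As we saw earlier in this chapter, elements are the building blocks of an XML document.
An element can be defined within an XSD as follows −
<xs:element name = "element_name" type = "type_name" />
Definition Types
Simple Type
A simple-type element can contain only text; it cannot have child elements or attributes. Some of
the predefined simple types are: xs:integer, xs:boolean, xs:string, xs:date. For example −
<xs:element name = "name" type = "xs:string" />
XML Schema defines 19 primitive data types: string, boolean, decimal, float, double, duration,
dateTime, time, date, gYearMonth, gYear, gMonthDay, gDay, gMonth, hexBinary, base64Binary,
anyURI, QName, and NOTATION.
The other 25 built-in data types are derived from one of the primitive types listed above.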
Code Sample:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="Author">
<xs:complexType>
<xs:sequence>
<xs:element name="FirstName" type="xs:string"/>
<xs:element name="LastName" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
Notice the FirstName and LastName elements in the code sample above. They are not explicitly
defined as simple type elements. Instead, the type is defined with the type attribute. Because
the value (string in both cases) is a simple type, the elements themselves are simple-type
elements.
Complex Type
A complex type is a container for other element definitions. This allows you to specify which
child elements an element can contain and to provide some structure within your XML
documents. For example −
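<xs:element name = "Address">
<xs:complexType>
<xs:sequence>
<xs:element name = "name" type = "xs:string" />
<xs:element name = "company" type = "xs:string" />
<xs:element name = "phone" type = "xs:string" />
</xs:sequence>
</xs:complexType>
</xs:element>
In the above example, the Address element consists of child elements. It is a container for other
<xs:element> definitions, which makes it possible to build a simple hierarchy of elements in the XML
document.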
XML Attributes:
An attribute provides extra information within an element. Attributes have name and type
properties and are defined within an XSD as follows:
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<!-- code omitted -->
<xs:element name="HomePage">
<xs:complexType>
<xs:attribute name="URL" type="xs:anyURI"/>
</xs:complexType>
</xs:element>
<!-- code omitted -->
</xs:schema>
Attributes are optional by default. To specify that the attribute is required, use the "use"
attribute:
<xs:attribute name="lang" type="xs:string" use="required"/>
Indicators
Content models indicate the structure and order in which child elements can
appear within their parent element. Content models are made up of model groups. The
three types of model groups are listed below.
Order indicators:
All
Choice
Sequence
All
The <all> indicator specifies that the child elements can appear in any order, and that each child
element must occur only once:
<xs:element name="person">
<xs:complexType>
<xs:all>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:all>
</xs:complexType>
</xs:element>
Choice Indicator
The <choice> indicator specifies that either one child element or another can occur:
<xs:element name="person">
<xs:complexType>
<xs:choice>
<xs:element name="employee" type="employee"/>
<xs:element name="member" type="member"/>
</xs:choice>
</xs:complexType>
</xs:element>
Sequence Indicator
The <sequence> indicator specifies that the child elements must appear in a specific order:
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
Occurrence Indicators
Occurrence indicators are used to define how often an element can occur.
maxOccurs Indicator
The maxOccurs indicator is an attribute that specifies the maximum number of times an element can occur (the default is 1):
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="full_name" type="xs:string"/>
<xs:element name="child_name" type="xs:string" maxOccurs="10"/>
</xs:sequence>
</xs:complexType>
</xs:element>
minOccurs Indicator
The minOccurs indicator is an attribute that specifies the minimum number of times an element can occur (the default is 1):
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="full_name" type="xs:string"/>
<xs:element name="child_name" type="xs:string"
maxOccurs="10" minOccurs="0"/>
</xs:sequence>
</xs:complexType>
</xs:element>
The example above indicates that the "child_name" element can occur a minimum of zero
times and a maximum of ten times in the "person" element.
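The effect of the occurrence indicators can be sketched as a simple count of child elements. The code below is not schema validation; it is a small Python illustration using the standard xml.etree.ElementTree module. The function name occurs_within and the sample person data are made up, and the special value maxOccurs="unbounded" is not handled.

```python
import xml.etree.ElementTree as ET


def occurs_within(parent, child_tag, min_occurs=1, max_occurs=1):
    """Check that parent has between min_occurs and max_occurs direct children named child_tag.

    The defaults of 1 mirror the XML Schema defaults for minOccurs and maxOccurs.
    """
    count = len(parent.findall(child_tag))  # findall with a tag name matches direct children
    return min_occurs <= count <= max_occurs


# Hypothetical instance documents for the person element declared above.
person = ET.fromstring(
    "<person><full_name>Ram Nepal</full_name>"
    "<child_name>Sita</child_name><child_name>Hari</child_name></person>"
)
childless = ET.fromstring("<person><full_name>Ram Nepal</full_name></person>")

print(occurs_within(person, "full_name"))                                   # exactly one
print(occurs_within(person, "child_name", min_occurs=0, max_occurs=10))     # two of at most ten
print(occurs_within(childless, "child_name", min_occurs=0, max_occurs=10))  # zero is allowed
print(occurs_within(person, "child_name"))                                  # two exceeds the default of one
```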
Restrictions are used to define acceptable values for XML elements or attributes. Restrictions
on XML elements are called facets.
The following example defines an element called "age" with a restriction. The value of age
cannot be lower than 0 or greater than 120:
<xs:element name="age">
<xs:simpleType>
<xs:restriction base="xs:integer">
<xs:minInclusive value="0"/>
<xs:maxInclusive value="120"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
To limit the content of an XML element to a set of acceptable values, we would use the
enumeration constraint.
The example below defines an element called "car" with a restriction. The only acceptable
values are: Audi, Golf, BMW:
<xs:element name="car">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:enumeration value="Audi"/>
<xs:enumeration value="Golf"/>
<xs:enumeration value="BMW"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
To limit the content of an XML element to define a series of numbers or letters that can be
used, we would use the pattern constraint.
The example below defines an element called "letter" with a restriction. The only acceptable
value is ONE of the LOWERCASE letters from a to z:
<xs:element name="letter">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="[a-z]"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
The next example defines an element called "initials" with a restriction. The only acceptable
value is THREE of the UPPERCASE letters from a to z:
<xs:element name="initials">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="[A-Z][A-Z][A-Z]"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
The next example also defines an element called "initials" with a restriction. The only
acceptable value is THREE of the LOWERCASE OR UPPERCASE letters from a to z:
<xs:element name="initials">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="[a-zA-Z][a-zA-Z][a-zA-Z]"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
The next example defines an element called "choice" with a restriction. The only acceptable
value is ONE of the following letters: x, y, OR z:
<xs:element name="choice">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="[xyz]"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
The next example defines an element called "prodid" with a restriction. The only acceptable
value is FIVE digits in a sequence, and each digit must be in a range from 0 to 9:
<xs:element name="prodid">
<xs:simpleType>
<xs:restriction base="xs:integer">
<xs:pattern value="[0-9][0-9][0-9][0-9][0-9]"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
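A pattern facet must match the whole value, not just part of it. The sketch below imitates that behaviour in Python with re.fullmatch. It relies on the assumption that these particular patterns mean the same thing in Python's regular-expression syntax as in XML Schema's; that holds for simple character classes like these, but not for every XML Schema pattern. Of the two initials declarations above, the uppercase-only one is used.

```python
import re

# The pattern values from the schema examples above, keyed by element name.
PATTERNS = {
    "letter": "[a-z]",
    "initials": "[A-Z][A-Z][A-Z]",
    "choice": "[xyz]",
    "prodid": "[0-9][0-9][0-9][0-9][0-9]",
}


def satisfies_pattern(element_name, value):
    """Return True if the whole value matches the element's pattern."""
    return re.fullmatch(PATTERNS[element_name], value) is not None


print(satisfies_pattern("letter", "q"))
print(satisfies_pattern("letter", "qq"))      # two letters: too long
print(satisfies_pattern("initials", "RKN"))
print(satisfies_pattern("initials", "rkn"))   # lowercase is rejected
print(satisfies_pattern("prodid", "04460"))
print(satisfies_pattern("prodid", "4460"))    # only four digits
```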
To specify how whitespace characters should be handled, we would use the whiteSpace
constraint.
This example defines an element called "address" with a restriction. The whiteSpace constraint
is set to "preserve", which means that the XML processor WILL NOT remove any white space
characters:
<xs:element name="address">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:whiteSpace value="preserve"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
This example also defines an element called "address" with a restriction. The whiteSpace
constraint is set to "replace", which means that the XML processor WILL REPLACE all white
space characters (line feeds, tabs, spaces, and carriage returns) with spaces:
<xs:element name="address">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:whiteSpace value="replace"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
This example also defines an element called "address" with a restriction. The whiteSpace
constraint is set to "collapse", which means that the XML processor WILL COLLAPSE white
space: line feeds, tabs, and carriage returns are replaced with spaces, leading
and trailing spaces are removed, and multiple spaces are reduced to a single space:
<xs:element name="address">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:whiteSpace value="collapse"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
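The three whiteSpace settings can be compared side by side with a short sketch. The following Python function is an illustration of the behaviour described above, not part of any schema processor; the function name and the sample address are made up.

```python
import re


def apply_white_space(value, mode):
    """Normalize value the way the whiteSpace facet describes for the given mode."""
    if mode == "preserve":
        return value  # nothing is changed
    # "replace": every tab, line feed, and carriage return becomes a space.
    replaced = re.sub(r"[\t\n\r]", " ", value)
    if mode == "replace":
        return replaced
    if mode == "collapse":
        # After replacing, shrink runs of spaces and trim both ends.
        return re.sub(r" +", " ", replaced).strip(" ")
    raise ValueError(f"unknown whiteSpace mode: {mode}")


raw = "  12 Main\tStreet\n  Kathmandu "
for mode in ("preserve", "replace", "collapse"):
    print(mode, repr(apply_white_space(raw, mode)))
```

With "replace" the value keeps its length, because each whitespace character is swapped for exactly one space; only "collapse" shortens it.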
Restrictions on Length
To limit the length of a value in an element, we would use the length, maxLength, and
minLength constraints.
This example defines an element called "password" with a restriction. The value must be
exactly eight characters:
<xs:element name="password">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:length value="8"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
This example defines another element called "password" with a restriction. The value must be
minimum five characters and maximum eight characters:
<xs:element name="password">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:minLength value="5"/>
<xs:maxLength value="8"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
Constraint       Description
enumeration      Defines a list of acceptable values
fractionDigits   Specifies the maximum number of decimal places allowed. Must be equal to or greater than zero
length           Specifies the exact number of characters or list items allowed. Must be equal to or greater than zero
maxExclusive     Specifies the upper bounds for numeric values (the value must be less than this value)
maxInclusive     Specifies the upper bounds for numeric values (the value must be less than or equal to this value)
maxLength        Specifies the maximum number of characters or list items allowed. Must be equal to or greater than zero
minExclusive     Specifies the lower bounds for numeric values (the value must be greater than this value)
minInclusive     Specifies the lower bounds for numeric values (the value must be greater than or equal to this value)
minLength        Specifies the minimum number of characters or list items allowed. Must be equal to or greater than zero
pattern          Defines the exact sequence of characters that are acceptable
totalDigits      Specifies the maximum number of digits allowed. Must be greater than zero
whiteSpace       Specifies how white space (line feeds, tabs, spaces, and carriage returns) is handled
XSL
XSL stands for EXtensible Stylesheet Language. XSL is to XML what CSS is to HTML.
XSL is a language for expressing style sheets. An XSL style sheet is, like a CSS style sheet, a file
that describes how to display an XML document of a given type. XSL shares the functionality of
CSS2 and is compatible with it (although it uses a different syntax).
In an HTML document, tags such as table, div, and span are predefined, and the browser
knows how to style and display them using CSS. In an XML document, however, tags are not
predefined. In order to understand and style an XML document, the World
Wide Web Consortium (W3C) developed XSL, which acts as an XML-based stylesheet language.
An XSL document specifies how a browser should render an XML document.
XSL consists of the following main parts −
XSLT − used to transform an XML document into various other types of document.
XPath − used to navigate an XML document.
XSL-FO − used to format an XML document.
What is XSLT?
XSLT, Extensible Stylesheet Language Transformations, provides the ability to transform XML
data from one format to another automatically.
How XSLT Works
An XSLT stylesheet defines the transformation rules to be applied to the target XML
document. The stylesheet is itself written in XML. An XSLT processor takes the stylesheet,
applies the transformation rules to the target XML document, and generates a
formatted document in XML, HTML, or text format. This formatted document is
then used by an XSLT formatter to generate the actual output that is displayed to the
end user.
Advantages
Let’s suppose we have the following sample XML file, students.xml, which is required to be
transformed into a well-formatted HTML document.
students.xml
<?xml version = "1.0"?>
<class>
<student rollno = "393">
<firstname>Dinkar</firstname>
<lastname>Kad</lastname>
<nickname>Dinkar</nickname>
<marks>85</marks>
</student>
<student rollno = "493">
<firstname>Vaneet</firstname>
<lastname>Gupta</lastname>
<nickname>Vinni</nickname>
<marks>95</marks>
</student>
<student rollno = "593">
<firstname>Jasvir</firstname>
<lastname>Singh</lastname>
<nickname>Jazz</nickname>
<marks>90</marks>
</student>
</class>
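We need to define an XSLT style sheet document for the above XML document that renders the
student records as an HTML table, with one row per student.
students.xsl
<xsl:stylesheet version = "1.0" xmlns:xsl = "http://www.w3.org/1999/XSL/Transform">
<xsl:template match = "/">
<html>
<body>
<h2>Students</h2>
<table border = "1">
<xsl:for-each select = "class/student">
<tr>
<td><xsl:value-of select = "@rollno"/></td>
<td><xsl:value-of select = "firstname"/></td>
<td><xsl:value-of select = "lastname"/></td>
<td><xsl:value-of select = "nickname"/></td>
<td><xsl:value-of select = "marks"/></td>
</tr>
</xsl:for-each>
</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>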
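a. <xsl:template> element
<xsl:template> defines a reusable template that generates the desired output for
nodes of a particular type/context.
Declaration
Following is the syntax declaration of <xsl:template> element.
<xsl:template
name = QName
match = Pattern
priority = number
mode = QName >
</xsl:template>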
Attributes
Sr.No Name & Description
1 name: Name of the template, by which it can be invoked with <xsl:call-template>.
2 match: Pattern which signifies the element(s) on which the template is to be applied.
3 priority: Priority number of the template. When several templates match the same node, the one with the higher priority is used.
4 mode: Allows an element to be processed multiple times, producing a different result each time.
Elements
Number of occurrences: Unlimited
Parent elements: xsl:stylesheet, xsl:transform
b. <xsl:value-of> tag puts the value of the node selected by the XPath expression into the output, as text.
Declaration
Following is the syntax declaration of <xsl:value-of> element.
<xsl:value-of
select = Expression
disable-output-escaping = "yes" | "no" >
</xsl:value-of>
Attributes
Sr.No Name & Description
1 select: XPath expression to be evaluated in the current context.
2 disable-output-escaping: Default is "no". If "yes", XML special characters in the output text are not escaped.
c. <xsl:for-each> tag applies a template repeatedly for each node.
Declaration
Following is the syntax declaration of <xsl:for-each> element
<xsl:for-each
select = Expression >
</xsl:for-each>
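Attributes
Sr.No Name & Description
1 select: XPath expression to be evaluated in the current context to determine the set of nodes to iterate over.
This example creates a table of <student> elements, showing each student's rollno attribute and its child
elements <firstname>, <lastname>, <nickname>, and <marks>, by iterating over each student.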
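students.xsl
<xsl:stylesheet version = "1.0" xmlns:xsl = "http://www.w3.org/1999/XSL/Transform">
<xsl:template match = "/">
<html>
<body>
<h2>Students</h2>
<table border = "1">
<xsl:for-each select = "class/student">
<tr>
<td><xsl:value-of select = "@rollno"/></td>
<td><xsl:value-of select = "firstname"/></td>
<td><xsl:value-of select = "lastname"/></td>
<td><xsl:value-of select = "nickname"/></td>
<td><xsl:value-of select = "marks"/></td>
</tr>
</xsl:for-each>
</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>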
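For readers more used to procedural code, the row-building loop above can be compared with an ordinary Python loop. This is only an analogue written with the standard xml.etree.ElementTree module, not an XSLT processor; the function name student_rows is hypothetical, and the data is the students.xml sample.

```python
import xml.etree.ElementTree as ET

STUDENTS_XML = """<class>
  <student rollno="393">
    <firstname>Dinkar</firstname><lastname>Kad</lastname>
    <nickname>Dinkar</nickname><marks>85</marks>
  </student>
  <student rollno="493">
    <firstname>Vaneet</firstname><lastname>Gupta</lastname>
    <nickname>Vinni</nickname><marks>95</marks>
  </student>
  <student rollno="593">
    <firstname>Jasvir</firstname><lastname>Singh</lastname>
    <nickname>Jazz</nickname><marks>90</marks>
  </student>
</class>"""

def student_rows(xml_text):
    """Build one list of cell values per <student>, like one <tr> per iteration."""
    root = ET.fromstring(xml_text)            # root is the <class> element
    rows = []
    for student in root.findall("student"):   # plays the role of xsl:for-each
        rows.append([
            student.get("rollno"),            # plays the role of select = "@rollno"
            student.findtext("firstname"),    # plays the role of select = "firstname"
            student.findtext("lastname"),
            student.findtext("nickname"),
            student.findtext("marks"),
        ])
    return rows

for row in student_rows(STUDENTS_XML):
    print(row)
```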
d. <xsl:sort> tag specifies the sort criteria for the nodes.
Declaration
Following is the syntax declaration of <xsl:sort> element.
<xsl:sort
select = string-expression
lang = { nmtoken }
data-type = { "text" | "number" | QName }
order = { "ascending" | "descending" }
case-order = { "upper-first" | "lower-first" } >
</xsl:sort>
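Attributes
Sr.No Name & Description
1 select: Sorting key of the node.
2 lang: Language whose alphabet is used to determine the sort order.
3 data-type: Data type of the sort key ("text" or "number").
4 order: Sorting order. Default is "ascending".
5 case-order: Whether upper-case or lower-case letters are sorted first.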
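The effect of <xsl:sort> with a numeric key can be pictured with Python's built-in sorted(). The sketch below is an analogue only, assuming a sort on marks with data-type "number"; the function name sorted_by_marks is hypothetical.

```python
import xml.etree.ElementTree as ET

STUDENTS_XML = """<class>
  <student rollno="393"><firstname>Dinkar</firstname><marks>85</marks></student>
  <student rollno="493"><firstname>Vaneet</firstname><marks>95</marks></student>
  <student rollno="593"><firstname>Jasvir</firstname><marks>90</marks></student>
</class>"""

def sorted_by_marks(xml_text, descending=True):
    """Return roll numbers ordered by marks, compared as numbers."""
    root = ET.fromstring(xml_text)
    students = sorted(
        root.findall("student"),
        key=lambda s: float(s.findtext("marks")),  # like data-type = "number"
        reverse=descending,                         # like order = "descending"
    )
    return [s.get("rollno") for s in students]

print(sorted_by_marks(STUDENTS_XML))
```

Converting the key with float() matters: compared as text, "100" would sort before "85", which is the difference between data-type "text" and "number".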
e. <xsl:if> tag specifies a conditional test against the content of nodes.
Declaration
Following is the syntax declaration of <xsl:if> element.
<xsl:if
test = boolean-expression >
</xsl:if>
Attributes
Sr.No Name & Description
1 test: The condition in the xml data to test.
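An <xsl:if> inside a loop behaves like a plain if statement that decides whether output is produced for the current node. A minimal Python analogue, assuming a test such as marks > 90 (the function name and threshold are hypothetical):

```python
import xml.etree.ElementTree as ET

STUDENTS_XML = """<class>
  <student rollno="393"><firstname>Dinkar</firstname><marks>85</marks></student>
  <student rollno="493"><firstname>Vaneet</firstname><marks>95</marks></student>
  <student rollno="593"><firstname>Jasvir</firstname><marks>90</marks></student>
</class>"""

def names_with_marks_above(xml_text, threshold):
    """Return first names of students whose marks exceed the threshold."""
    root = ET.fromstring(xml_text)
    names = []
    for student in root.findall("student"):
        # Plays the role of <xsl:if test = "marks > threshold">: nothing is
        # emitted for students that fail the test, and there is no "else" branch.
        if float(student.findtext("marks")) > threshold:
            names.append(student.findtext("firstname"))
    return names

print(names_with_marks_above(STUDENTS_XML, 90))
```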
f. <xsl:choose> tag specifies multiple conditional tests against the content of nodes, in
conjunction with the <xsl:when> and <xsl:otherwise> elements.
Declaration
Following is the syntax declaration of <xsl:choose> element.
<xsl:choose >
</xsl:choose>
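Where <xsl:if> has no "else" branch, <xsl:choose> works like an if / elif / else chain: the first <xsl:when> whose test succeeds is used, and <xsl:otherwise> is the fallback. The following Python analogue is a sketch under assumed thresholds; the grade labels and cut-off values are invented for illustration.

```python
import xml.etree.ElementTree as ET

STUDENTS_XML = """<class>
  <student rollno="393"><firstname>Dinkar</firstname><marks>85</marks></student>
  <student rollno="493"><firstname>Vaneet</firstname><marks>95</marks></student>
  <student rollno="593"><firstname>Jasvir</firstname><marks>90</marks></student>
</class>"""

def grade(marks):
    """Pick exactly one label, testing the conditions in order."""
    if marks > 90:        # first xsl:when
        return "High"
    elif marks > 85:      # second xsl:when, only reached if the first failed
        return "Medium"
    else:                 # xsl:otherwise
        return "Low"

def grades_by_name(xml_text):
    """Map each student's first name to the label chosen for their marks."""
    root = ET.fromstring(xml_text)
    return {
        s.findtext("firstname"): grade(float(s.findtext("marks")))
        for s in root.findall("student")
    }

print(grades_by_name(STUDENTS_XML))
```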
XQuery
XQuery is a query language for retrieving data stored in the form of XML. XQuery is to XML
what SQL is to a relational database.
What is XQuery?
XQuery is a functional language that is used to retrieve information stored in XML format.
XQuery can be used on XML documents, relational databases containing data in XML format,
or XML databases. XQuery 3.0 has been a W3C recommendation since April 8, 2014.
“XQuery is a standardized language for combining documents, databases, Web pages and almost anything
else. It is very widely implemented. It is powerful and easy to learn. XQuery is replacing proprietary
middleware languages and Web Application development languages. XQuery is replacing complex Java or C++
programs with a few lines of code. XQuery is simpler to work with and easier to maintain than many other
alternatives.”
Characteristics
Functional language − XQuery is a functional language for retrieving and querying XML-based data.
Analogous to SQL − XQuery is to XML what SQL is to relational databases.
XPath based − XQuery uses XPath expressions to navigate through XML documents.
Widely supported − XQuery is supported by most major databases.
W3C Standard − XQuery is a W3C standard.
Benefits of XQuery
Using XQuery, both hierarchical and tabular data can be retrieved.
XQuery can be used to query tree and graph structures.
XQuery can be directly used to query webpages.
XQuery can be directly used to build webpages.
XQuery can be used to transform XML documents.
XQuery is ideal for XML-based databases and object-based databases. Object databases are much
more flexible and powerful than purely tabular databases.
books.xml
<books>
<book category="JAVA">
<author>Robert</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="DOTNET">
<author>Peter</author>
<year>2011</year>
<price>40.50</price>
</book>
<book category="XML">
<author>Robert</author>
<author>Peter</author>
<year>2013</year>
<price>50.00</price>
</book>
<book category="XML">
<author>Jay Ban</author>
<year>2010</year>
<price>16.50</price>
</book>
</books>
books.xqy
for $x in doc("books.xml")/books/book
where $x/price>30
return $x/title
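Output
The query returns the title of each book whose price is greater than 30; in books.xml these
are the books priced 40.50 and 50.00.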
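For comparison, the for / where / return logic of the query above can be approximated in Python with the standard xml.etree.ElementTree module. This is a sketch of the same filtering idea, not an XQuery engine. The function name is hypothetical, and the title values in the sample data are placeholders invented for illustration.

```python
import xml.etree.ElementTree as ET

# Sample data shaped like books.xml; the titles are made-up placeholders.
BOOKS_XML = """<books>
  <book category="JAVA"><title>Title A</title><price>30.00</price></book>
  <book category="DOTNET"><title>Title B</title><price>40.50</price></book>
  <book category="XML"><title>Title C</title><price>50.00</price></book>
  <book category="XML"><title>Title D</title><price>16.50</price></book>
</books>"""

def titles_priced_above(xml_text, limit):
    """Return the titles of all books whose price is greater than limit."""
    root = ET.fromstring(xml_text)                   # the <books> element
    return [
        book.findtext("title")                       # return $x/title
        for book in root.findall("book")             # for $x in .../books/book
        if float(book.findtext("price")) > limit     # where $x/price > 30
    ]

print(titles_priced_above(BOOKS_XML, 30))
```

Note that the book priced exactly 30.00 is excluded, because the comparison is strictly "greater than".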
XPath
XPath is a query language used for traversing an XML document. It is commonly used
to search for particular elements or attributes that match a given pattern.
What is XPath?
XPath is an official recommendation of the World Wide Web Consortium (W3C). It defines a
language to find information in an XML file. It is used to traverse elements and attributes of an
XML document. XPath provides various types of expressions which can be used to query
relevant information from the XML document.
Structure Definitions − XPath defines the parts of an XML document, such as element, attribute, text,
namespace, processing-instruction, comment, and document nodes.
Path Expressions − XPath provides powerful path expressions to select nodes or lists of nodes in XML
documents.
Standard Functions − XPath provides a rich library of standard functions for manipulating string
values, numeric values, dates and times, nodes and QNames, sequences, Boolean values, etc.
Major part of XSLT − XPath is one of the major elements of the XSLT standard, and knowledge of it is
essential in order to work with XSLT documents.
W3C recommendation − XPath is an official recommendation of World Wide Web Consortium
(W3C).
One should keep the following points in mind, while working with XPath −
An XPath expression generally defines a pattern in order to select a set of nodes. These patterns are
used by XSLT to perform transformations or by XPointer for addressing purposes.
The XPath specification defines seven types of nodes which can be the output of evaluating an XPath
expression.
Root
Element
Text
Attribute
Comment
Processing Instruction
Namespace
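XPath uses a path expression to select a node or a list of nodes from an XML document.
Following is the list of useful paths and expressions to select any node or list of nodes from an XML
document.
Sr.No Expression Description
1 nodename Selects all nodes with the given name
2 / Selection starts from the root node
3 // Selects matching nodes anywhere below the current node, no matter where they are
4 . Selects the current node
5 .. Selects the parent of the current node
6 @ Selects attributes
7 student Example − Selects all nodes with the name "student"
8 class/student Example − Selects all student elements that are children of class
9 //student Example − Selects all student elements no matter where they are in the document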
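Several of these path expressions can be tried out with Python's standard xml.etree.ElementTree module, which supports a limited subset of XPath. The sketch below assumes a cut-down version of the students data. Paths are written relative to the root <class> element, and attribute values are read with get(), because this module does not select attribute nodes with @ on its own.

```python
import xml.etree.ElementTree as ET

STUDENTS_XML = """<class>
  <student rollno="393"><firstname>Dinkar</firstname><marks>85</marks></student>
  <student rollno="493"><firstname>Vaneet</firstname><marks>95</marks></student>
  <student rollno="593"><firstname>Jasvir</firstname><marks>90</marks></student>
</class>"""

root = ET.fromstring(STUDENTS_XML)   # root is the <class> element

# class/student: student elements that are direct children of class.
children = root.findall("student")

# //student: student elements at any depth below the starting element.
anywhere = root.findall(".//student")

# A predicate on an attribute value: the student whose rollno is 493.
vaneet = root.find("student[@rollno='493']")

# Attribute values themselves are read with get() in this module.
rollnos = [s.get("rollno") for s in children]

print(len(children), len(anywhere), vaneet.findtext("firstname"), rollnos)
```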
students.xml
<?xml version = "1.0"?>
<class>
<student rollno = "393">
<firstname>Dinkar</firstname>
<lastname>Kad</lastname>
<nickname>Dinkar</nickname>
<marks>85</marks>
</student>
<student rollno = "493">
<firstname>Vaneet</firstname>
<lastname>Gupta</lastname>
<nickname>Vinni</nickname>
<marks>95</marks>
</student>
<student rollno = "593">
<firstname>Jasvir</firstname>
<lastname>Singh</lastname>
<nickname>Jazz</nickname>
<marks>90</marks>
</student>
</class>
students.xsl
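<?xml version = "1.0" encoding = "UTF-8"?>
<xsl:stylesheet version = "1.0"
xmlns:xsl = "http://www.w3.org/1999/XSL/Transform">
<xsl:template match = "/">
<html>
<body>
<h2>Students</h2>
<table border = "1">
<tr>
<th>Roll No</th>
<th>First Name</th>
<th>Last Name</th>
<th>Nick Name</th>
<th>Marks</th>
</tr>
<xsl:for-each select = "class/student">
<tr>
<td><xsl:value-of select = "@rollno"/></td>
<td><xsl:value-of select = "firstname"/></td>
<td><xsl:value-of select = "lastname"/></td>
<td><xsl:value-of select = "nickname"/></td>
<td><xsl:value-of select = "marks"/></td>
</tr>
</xsl:for-each>
</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>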
SAX
SAX, also known as the Simple API for XML, is an API used to parse XML documents. It is based
on events generated while reading through the document. Callback methods receive those
events, and a custom handler contains those callback methods.
The API is efficient because it discards each event right after the callbacks have received it.
Therefore, SAX uses memory efficiently, unlike DOM, for example.
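The event-and-callback model can be illustrated with Python's standard xml.sax module. The sketch below is a minimal custom handler that collects the text of every <firstname> element; the class name FirstNameHandler and the sample data are assumptions made for this example.

```python
import xml.sax

class FirstNameHandler(xml.sax.ContentHandler):
    """Custom handler: its methods are the callbacks that receive SAX events."""

    def __init__(self):
        super().__init__()
        self.names = []            # results collected while parsing
        self._inside = False       # True while between <firstname> and </firstname>
        self._buffer = []

    def startElement(self, name, attrs):
        # Event: an opening tag was read.
        if name == "firstname":
            self._inside = True
            self._buffer = []

    def characters(self, content):
        # Event: character data was read (it may arrive in several chunks).
        if self._inside:
            self._buffer.append(content)

    def endElement(self, name):
        # Event: a closing tag was read.
        if name == "firstname":
            self.names.append("".join(self._buffer))
            self._inside = False

SAMPLE = (b"<class><student rollno='393'><firstname>Dinkar</firstname></student>"
          b"<student rollno='493'><firstname>Vaneet</firstname></student></class>")

handler = FirstNameHandler()
xml.sax.parseString(SAMPLE, handler)
print(handler.names)
```

The handler keeps only the names it is interested in; the parser never builds a tree of the whole document, which is where the memory saving comes from.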
SAX vs DOM
DOM stands for Document Object Model. The DOM parser does not rely on events. Instead,
it loads the whole XML document into memory to parse it. SAX is therefore more memory-efficient
than DOM.
DOM has its benefits, too. For example, DOM supports XPath. It also makes it easy to operate
on the whole document tree at once, since the document is loaded into memory.
DOM
The Document Object Model (DOM) is an application programming interface (API) for HTML
and XML documents. It defines the logical structure of documents and the way a document is
accessed and manipulated.
The DOM defines the objects, properties, and methods (interfaces) used to access all XML
elements. It is separated into three different parts, or levels −
Core DOM − standard model for any structured document
XML DOM − standard model for XML documents
HTML DOM − standard model for HTML documents
XML DOM is a standard object model for XML. XML documents have a hierarchy of
informational units called nodes; DOM is a standard programming interface for describing those
nodes and the relationships between them.
The XML DOM also provides an API that allows a developer to add, edit, move, or remove nodes
at any point on the tree in order to create an application.
The DOM has the following disadvantages −
It consumes more memory when the XML structure is large, because the whole document
remains in memory until it is explicitly released.
Due to this extensive use of memory, it is slower than SAX.
Now that we know what DOM means, let's see what a DOM structure is. A DOM document is a
collection of nodes or pieces of information, organized in a hierarchy. Some types
of nodes may have child nodes of various types and others are leaf nodes that cannot have
anything under them in the document structure. Following is a list of the node types, with a
list of node types that they may have as children −
Document − Element (maximum of one), ProcessingInstruction, Comment,
DocumentType (maximum of one)
DocumentFragment − Element, ProcessingInstruction, Comment, Text, CDATASection,
EntityReference
EntityReference − Element, ProcessingInstruction, Comment, Text, CDATASection,
EntityReference
Element − Element, Text, Comment, ProcessingInstruction, CDATASection,
EntityReference
Attr − Text, EntityReference
ProcessingInstruction − No children
Comment − No children
Text − No children
CDATASection − No children
Entity − Element, ProcessingInstruction, Comment, Text, CDATASection,
EntityReference
Notation − No children
Example
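The parent and child relationships listed above can be inspected with Python's standard xml.dom.minidom module. The following is a small sketch, using a made-up one-student document, that reports which node types appear as children of a given node; the helper child_types and the name table are hypothetical.

```python
from xml.dom import Node, minidom

# Readable names for the node type constants used in this example.
NODE_TYPE_NAMES = {
    Node.DOCUMENT_NODE: "Document",
    Node.ELEMENT_NODE: "Element",
    Node.TEXT_NODE: "Text",
    Node.COMMENT_NODE: "Comment",
    Node.PROCESSING_INSTRUCTION_NODE: "ProcessingInstruction",
}

def child_types(node):
    """Return the node type names of the direct children of node."""
    return [NODE_TYPE_NAMES.get(child.nodeType, "Other") for child in node.childNodes]

doc = minidom.parseString("<!-- roster --><class><student>Dinkar</student></class>")
class_element = doc.documentElement          # the single root Element
student_element = class_element.firstChild   # <student>
text_node = student_element.firstChild       # the text "Dinkar"

print(child_types(doc))              # children of the Document node
print(child_types(class_element))    # children of <class>
print(child_types(student_element))  # children of <student>
print(child_types(text_node))        # a Text node is a leaf
```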
DOM as an API contains interfaces that represent different types of information that can be
found in an XML document, such as elements and text. These interfaces include the methods
and properties necessary to work with these objects. Properties define the characteristic of
the node whereas methods give the way to manipulate the nodes.
Following table lists the DOM classes and interfaces −
Sr.No Interface & Description
1 DOMImplementation: It provides a number of methods for performing operations that are independent of any particular instance of the document object model.
2 DocumentFragment: It is a "lightweight" or "minimal" document object that can hold a portion of a document tree.
3 Document: It represents the XML document's top-level node, which provides access to all the nodes in the document, including the root element.
4 Node: It represents an XML node.
5 NodeList: It represents a read-only list of Node objects.
6 NamedNodeMap: It represents collections of nodes that can be accessed by name.
7 CharacterData: It extends Node with a set of attributes and methods for accessing character data in the DOM.
8 Attr: It represents an attribute in an Element object.
9 Element: It represents the element node. Derives from Node.
10 Text: It represents the text node. Derives from CharacterData.
11 Comment: It represents the comment node. Derives from CharacterData.
12 ProcessingInstruction: It represents a "processing instruction". It is used in XML as a way to keep processor-specific information in the text of the document.
13 CDATASection: It represents the CDATA section. Derives from Text.
14 Entity: It represents an entity. Derives from Node.
15 EntityReference: It represents an entity reference in the tree. Derives from Node.
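A few of these interfaces can be seen at work in Python's standard xml.dom.minidom module, which implements a subset of the DOM. The sketch below is an illustration under assumed sample data: it adds a student to a small document, lists the students, and then removes one.

```python
from xml.dom import minidom

doc = minidom.parseString(
    "<class><student rollno='393'><firstname>Dinkar</firstname></student></class>"
)                                            # doc is a Document
root = doc.documentElement                   # root is an Element (<class>)

# Add a node: build <student rollno="493"><firstname>Vaneet</firstname></student>.
new_student = doc.createElement("student")
new_student.setAttribute("rollno", "493")    # stored as an Attr on the Element
firstname = doc.createElement("firstname")
firstname.appendChild(doc.createTextNode("Vaneet"))   # a Text node
new_student.appendChild(firstname)
root.appendChild(new_student)

# Read nodes: getElementsByTagName returns a NodeList.
students = doc.getElementsByTagName("student")
rollnos_after_add = [s.getAttribute("rollno") for s in students]

# Remove a node: drop the first student from its parent.
root.removeChild(students[0])
rollnos_after_remove = [
    s.getAttribute("rollno") for s in doc.getElementsByTagName("student")
]

print(rollnos_after_add, rollnos_after_remove)
```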
Parser
A parser is a software application designed to analyze a document, in our case an XML
document, and do something specific with the information. Some of the DOM-based parsers
are listed in the following table −
Sr.No Parser & Description
1 JAXP: Sun Microsystems’ Java API for XML Parsing (JAXP)
2 XML4J: IBM’s XML Parser for Java (XML4J)
3 msxml: Microsoft’s XML parser (msxml) version 2.0 is built into Internet Explorer 5.5
4 4DOM: 4DOM is a parser for the Python programming language
5 XML::DOM: XML::DOM is a Perl module to manipulate XML documents using Perl
6 Xerces: Apache’s Xerces Java Parser