XML Documents - Xquery Xpath
Start Declaration − Begin the XML declaration with the following statement.
<?xml version = "1.0" encoding = "UTF-8" standalone = "yes" ?>
DTD − Immediately after the XML header, the document type declaration follows,
commonly referred to as the DOCTYPE −
<!DOCTYPE address [
The DOCTYPE declaration has an exclamation mark (!) at the start of the element
name. The DOCTYPE informs the parser that a DTD is associated with this XML
document.
DTD Body − The DOCTYPE declaration is followed by the body of the DTD, where you
declare elements, attributes, entities, and notations.
<!ELEMENT address (name,company,phone)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT company (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
Several elements are declared here that make up the vocabulary of the address
document. <!ELEMENT name (#PCDATA)> declares that the element name contains
"#PCDATA". Here #PCDATA means parsed character data, that is, text that the
parser reads and checks for markup.
End Declaration − Finally, the declaration section of the DTD is closed using a closing
bracket and a closing angle bracket (]>). This effectively ends the definition, and
thereafter, the XML document follows immediately.
Rules
The document type declaration must appear at the start of the document
(preceded only by the XML header) − it is not permitted anywhere else within
the document.
Similar to the DOCTYPE declaration, the element declarations must start with
an exclamation mark.
The Name in the document type declaration must match the element type of
the root element.
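The last rule can be checked programmatically. The following is a minimal sketch in Python using the standard-library xml.dom.minidom parser; the choice of Python and the variable names are assumptions made for illustration. Note that this parser reads the DOCTYPE but does not validate the document against the DTD, so the comparison is done by hand.

```python
from xml.dom import minidom

# An internal DTD in the style of this section, followed by the document itself.
DOCUMENT = b"""<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE address [
  <!ELEMENT address (name,company,phone)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT company (#PCDATA)>
  <!ELEMENT phone (#PCDATA)>
]>
<address>
  <name>Tanmay Patil</name>
  <company>TutorialsPoint</company>
  <phone>(011) 123-4567</phone>
</address>"""

dom = minidom.parseString(DOCUMENT)
doctype_name = dom.doctype.name           # the name written after <!DOCTYPE
root_name = dom.documentElement.tagName   # the name of the root element

# minidom does not validate against the DTD, so compare the two names explicitly.
names_match = doctype_name == root_name
print(doctype_name, root_name, names_match)
```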
External DTD
In an external DTD, elements are declared outside the XML file. The DTD is
referenced through a system identifier, which may be either the name of a .dtd file
or a valid URL. To refer to an external DTD, the standalone attribute in the XML
declaration should be set to no. This indicates that the document relies on
declarations from an external source.
Syntax
Following is the syntax for external DTD −
<!DOCTYPE root-element SYSTEM "file-name">
where file-name is the file with .dtd extension.
Example
The following example shows external DTD usage −
<?xml version = "1.0" encoding = "UTF-8" standalone = "no" ?>
<!DOCTYPE address SYSTEM "address.dtd">
<address>
<name>Tanmay Patil</name>
<company>TutorialsPoint</company>
<phone>(011) 123-4567</phone>
</address>
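As a rough illustration of how a parser sees the SYSTEM identifier, the sketch below reads it back with Python's standard-library xml.dom.minidom (an assumed tool, not one named in these notes). The file address.dtd is not opened here: minidom only records the identifier and does not fetch or validate against the external DTD.

```python
from xml.dom import minidom

DOCUMENT = b"""<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE address SYSTEM "address.dtd">
<address>
  <name>Tanmay Patil</name>
  <company>TutorialsPoint</company>
  <phone>(011) 123-4567</phone>
</address>"""

dom = minidom.parseString(DOCUMENT)

# The quoted file name or URL that follows the SYSTEM keyword.
system_id = dom.doctype.systemId
print(system_id)
```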
XML elements
XML elements can be defined as the building blocks of an XML document. Elements can
behave as containers to hold text, elements, attributes, media objects or all of these.
Each XML document contains one or more elements, the scope of which is either
delimited by start and end tags, or for empty elements, by an empty-element tag.
Syntax
Following is the syntax to write an XML element −
<element-name attribute1 attribute2>
....content
</element-name>
where,
element-name is the name of the element. The name and its case in the start and
end tags must match.
attribute1, attribute2 are attributes of the element separated by white spaces.
An attribute defines a property of the element. It associates a name with a
value, which is a string of characters. An attribute is written as −
name = "value"
name is followed by an = sign and a string value inside double(" ") or single(' ')
quotes.
Empty Element
An empty element (an element with no content) has the following syntax −
<name attribute1 attribute2.../>
Following is an example of an XML document using various XML elements −
<?xml version = "1.0"?>
<contact-info>
<address category = "residence">
<name>Tanmay Patil</name>
<company>TutorialsPoint</company>
<phone>(011) 123-4567</phone>
</address>
</contact-info>
XML Elements Rules
The following rules apply to XML elements −
An element name can contain any alphanumeric characters. The only
punctuation marks allowed in names are the hyphen (-), underscore (_) and
period (.). A name must not begin with a digit, a hyphen, or a period.
Names are case sensitive. For example, Address, address, and ADDRESS are
different names.
The names in the start and end tags of an element must be identical.
An element, which is a container, can contain text or elements as seen in the
above example.
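The case-sensitivity and matching-tag rules can be observed with any XML parser. The sketch below uses Python's standard-library xml.etree.ElementTree; the helper name is_well_formed is hypothetical and introduced only for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Start and end tag names match exactly.
matching = is_well_formed("<address><name>Tanmay Patil</name></address>")

# Names are case sensitive, so <Address> is not closed by </address>.
mismatched = is_well_formed("<Address><name>Tanmay Patil</name></address>")

print(matching, mismatched)
```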
XML - Attributes
Attributes are part of XML elements. An element can have multiple attributes, each
with a unique name. Attributes give more information about XML elements. To be more
precise, they define properties of elements. An XML attribute is always a name-value
pair.
Syntax
<garden>
<plants category = "flowers" />
<plants category = "shrubs">
</plants>
</garden>
Attributes are used to distinguish among elements of the same name, when you do
not want to create a new element for every situation. Hence, the use of an attribute
can add a little more detail in differentiating two or more similar elements.
In the above example, we have categorized the plants by including the attribute
category and assigning a different value to each of the elements. Hence, we have two
categories of plants, one flowers and the other shrubs. Thus, we have two plants
elements with different attribute values.
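To see how the attribute distinguishes the two elements in practice, here is a minimal sketch using Python's standard-library xml.etree.ElementTree (an assumed tool for illustration). It reads the category value from each plants element.

```python
import xml.etree.ElementTree as ET

garden = ET.fromstring("""
<garden>
  <plants category="flowers" />
  <plants category="shrubs"></plants>
</garden>
""")

# Both elements share the name "plants"; only the attribute value differs.
categories = [plants.get("category") for plants in garden.findall("plants")]
print(categories)
```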
Attribute Types
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
<book>
<title lang="en">Harry Potter</title>
<price>29.99</price>
</book>
<book>
<title lang="en">Learning XML</title>
<price>39.95</price>
</book>
</bookstore>
Selecting Nodes
XPath uses path expressions to select nodes in an XML document. The node is selected by
following a path or steps. The most useful path expressions are listed below:
Expression    Description
//            Selects nodes in the document from the current node that match the
              selection, no matter where they are
@             Selects attributes
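The two expressions can be tried against the bookstore document shown earlier. This is a sketch using Python's standard-library xml.etree.ElementTree, which implements only a subset of XPath: paths are written relative to an element (so //title becomes .//title), and attribute nodes cannot be returned directly, so the attribute value is read from each matched element instead.

```python
import xml.etree.ElementTree as ET

bookstore = ET.fromstring("""
<bookstore>
  <book>
    <title lang="en">Harry Potter</title>
    <price>29.99</price>
  </book>
  <book>
    <title lang="en">Learning XML</title>
    <price>39.95</price>
  </book>
</bookstore>
""")

# //title : every title element, wherever it appears below the root.
titles = [title.text for title in bookstore.findall(".//title")]

# @lang : ElementTree can test for an attribute in a predicate, and the
# value is then read with get() on each matched element.
langs = [title.get("lang") for title in bookstore.findall(".//title[@lang]")]

print(titles)
print(langs)
```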
XQuery
XQuery is a language for finding and extracting elements and attributes from XML
documents.
XQuery is the language for querying XML data
XQuery for XML is like SQL for databases
XQuery is built on XPath expressions
XQuery is supported by many major database systems
XQuery is a W3C Recommendation
XQuery Example
for $x in doc("books.xml")/bookstore/book
where $x/price>30
order by $x/title
return $x/title
The XQuery examples in this section use the following XML document, "books.xml":
<bookstore>
<book category="COOKING">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="CHILDREN">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="WEB">
<title lang="en">XQuery Kick Start</title>
<author>James McGovern</author>
<author>Per Bothner</author>
<author>Kurt Cagle</author>
<author>James Linn</author>
<author>Vaidyanathan Nagarajan</author>
<year>2003</year>
<price>49.99</price>
</book>
<book category="WEB">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
</bookstore>
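For readers more familiar with a general-purpose language, the for / where / order by / return query shown before the document can be approximated in Python. This is only a sketch using the standard-library xml.etree.ElementTree, not an XQuery engine; the document is embedded as a string with authors and years omitted for brevity, and the result is a list of title strings rather than title elements.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of books.xml (authors and years left out for brevity).
BOOKS_XML = """
<bookstore>
  <book category="COOKING"><title lang="en">Everyday Italian</title><price>30.00</price></book>
  <book category="CHILDREN"><title lang="en">Harry Potter</title><price>29.99</price></book>
  <book category="WEB"><title lang="en">XQuery Kick Start</title><price>49.99</price></book>
  <book category="WEB"><title lang="en">Learning XML</title><price>39.95</price></book>
</bookstore>
"""

bookstore = ET.fromstring(BOOKS_XML)

selected = [
    book
    for book in bookstore.findall("book")        # for $x in .../bookstore/book
    if float(book.findtext("price")) > 30        # where $x/price > 30
]
selected.sort(key=lambda book: book.findtext("title"))          # order by $x/title
result_titles = [book.findtext("title") for book in selected]   # return $x/title

print(result_titles)
```

The book priced at exactly 30.00 is excluded because the comparison is strictly greater than 30.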
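Functions
XQuery uses functions to extract data from XML documents. The doc() function opens
the "books.xml" file:
doc("books.xml")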
Path Expressions
XQuery uses path expressions to navigate through elements in an XML document.
In the table below we have listed some path expressions and the result of the expressions:
Path Expression    Result
//book             Selects all book elements no matter where they are in the document
bookstore//book    Selects all book elements that are descendants of the bookstore
                   element, no matter where they are under the bookstore element
Predicates
Predicates are used to find a specific node or a node that contains a specific value.
Predicates are always embedded in square brackets.
In the table below we have listed some path expressions with predicates and the result of the
expressions:
Path Expression                       Result
/bookstore/book[1]                    Selects the first book element that is the child of the
                                      bookstore element
/bookstore/book[last()]               Selects the last book element that is the child of the
                                      bookstore element
/bookstore/book[last()-1]             Selects the last but one book element that is the child of
                                      the bookstore element
/bookstore/book[position()<3]         Selects the first two book elements that are children of the
                                      bookstore element
//title[@lang]                        Selects all the title elements that have an attribute named
                                      lang
//title[@lang='en']                   Selects all the title elements that have a "lang" attribute
                                      with a value of "en"
/bookstore/book[price>35.00]          Selects all the book elements of the bookstore element that
                                      have a price element with a value greater than 35.00
/bookstore/book[price>35.00]/title    Selects all the title elements of the book elements of the
                                      bookstore element that have a price element with a value
                                      greater than 35.00
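Several of these predicates can be tried with Python's standard-library xml.etree.ElementTree, sketched below under a few assumptions. ElementTree supports positional predicates ([1], [last()], [last()-1]) and attribute predicates, but not position()<3 or numeric comparisons such as price>35.00, so those two are emulated with ordinary Python. Paths are relative to the bookstore root element, and the embedded document is a trimmed copy of books.xml.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of books.xml (authors and years left out for brevity).
bookstore = ET.fromstring("""
<bookstore>
  <book category="COOKING"><title lang="en">Everyday Italian</title><price>30.00</price></book>
  <book category="CHILDREN"><title lang="en">Harry Potter</title><price>29.99</price></book>
  <book category="WEB"><title lang="en">XQuery Kick Start</title><price>49.99</price></book>
  <book category="WEB"><title lang="en">Learning XML</title><price>39.95</price></book>
</bookstore>
""")


def title_of(book):
    """Return the text of a book's title element."""
    return book.findtext("title")


first = title_of(bookstore.find("book[1]"))                 # /bookstore/book[1]
last = title_of(bookstore.find("book[last()]"))             # /bookstore/book[last()]
second_last = title_of(bookstore.find("book[last()-1]"))    # /bookstore/book[last()-1]

# position()<3 is not supported by ElementTree; slice the list instead.
first_two = [title_of(b) for b in bookstore.findall("book")[:2]]

# //title[@lang='en']
english_titles = bookstore.findall(".//title[@lang='en']")

# price>35.00 is not supported by ElementTree; filter in Python instead.
expensive = [
    title_of(b)
    for b in bookstore.findall("book")
    if float(b.findtext("price")) > 35.00
]

print(first, last, second_last, first_two, len(english_titles), expensive)
```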
The following path expression is used to select all the title elements in the "books.xml" file:
doc("books.xml")/bookstore/book/title
(/bookstore selects the bookstore element, /book selects all the book elements under the
bookstore element, and /title selects all the title elements under each book element)
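The XQuery above will extract the following:
<title lang="en">Everyday Italian</title>
<title lang="en">Harry Potter</title>
<title lang="en">XQuery Kick Start</title>
<title lang="en">Learning XML</title>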
Predicates
XQuery uses predicates to limit the extracted data from XML documents.
The following predicate is used to select all the book elements under the bookstore element
that have a price element with a value that is less than 30:
doc("books.xml")/bookstore/book[price<30]
The XQuery above will extract the following:
<book category="CHILDREN">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>