
WSX UNIT I 2023-24

The document provides an overview of XML, highlighting its advantages over HTML, EDI, and databases, and explaining its syntax, characteristics, and structure. It emphasizes XML's role in data transport and storage, its flexibility in defining custom tags, and its compatibility across different platforms. Additionally, it discusses XML-based standards such as XPath, XSLT, and XQuery, and introduces the concept of XML namespaces to prevent element name conflicts.

IT-T72 Web Services and XML UNIT I

UNIT I

XML – benefits – Advantages of XML over HTML, EDI, Databases – XML based
standards – Structuring with schemas - DTD – XML Schemas – XML processing –
DOM –SAX – presentation technologies – XSL – XFORMS – XHTML – Transformation –
XSLT – XLINK – XPATH – XQuery.

1. Give the brief introduction about XML with its advantages.


Introduction
XML
 XML stands for Extensible Markup Language
 XML is a markup language much like HTML
 XML was designed to carry data, not to display data
 XML tags are not predefined. You must define your own tags
 XML is designed to be self-descriptive
 XML is a W3C Recommendation
XML is not a replacement for HTML.
 XML and HTML were designed with different goals:
 XML was designed to transport and store data, with focus on what data is
 HTML was designed to display data, with focus on how data looks
 HTML is about displaying information, while XML is about carrying information.
Example XML document:
An XML document is one that follows certain syntax rules (most of which we followed
for XHTML)

<?xml version="1.0"?>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>

IV Year/VII Sem 1
The Difference between XML and HTML
 XML is not a replacement for HTML. XML and HTML were designed with
different goals:
 XML was designed to transport and store data, with focus on what data is.
 HTML was designed to display data, with focus on how data looks.
 HTML is about displaying information, while XML is about carrying information.

XML Syntax
• An XML document consists of
– Markup
• Tags, which begin with < and end with >
• References, which begin with & and end with ;
– Character, e.g. &#x20;
– Entity, e.g. &lt;
» The entities lt, gt, amp, apos, and quot are
recognized in every XML document.
» Other XHTML entities, such as nbsp, are only
recognized in other XML documents if they are
defined in the DTD
– Character data: everything not markup
• Comments
– Begin with <!--
– End -->
• CDATA section
– Special section, the entire content of which is interpreted as character
data, even if it appears to be markup
– Begins with <![CDATA[
– Ends with ]]> (this sequence is illegal in character data except when ending a CDATA section)
• < and & must be represented by references except
– When beginning markup
– Within comments
– Within CDATA sections
• Element tags and elements
– Three types
• Start, e.g. <message>
• End, e.g. </message>
• Empty element, e.g. <br />
– Start and end tags must properly nest
– Corresponding pair of start and end element tags plus everything in
between them defines an element
– Character data may only appear within an element
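The reference and CDATA rules above can be checked with a small Java sketch. It is only an illustration: the class name ReferenceDemo and the helper rootText are made up for this example, and the parser is the standard JAXP DocumentBuilder.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class: shows how references and CDATA sections are read back.
class ReferenceDemo {
    // Parses an XML string and returns the text content of its root element.
    static String rootText(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse: " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        // Entity and character references stand in for the markup characters.
        System.out.println(rootText("<message>1 &lt; 2 &amp;&#x20;3 &gt; 2</message>"));
        // Inside a CDATA section, < and & are plain character data.
        System.out.println(rootText("<message><![CDATA[if (a < b && b < c)]]></message>"));
    }
}
```

In both cases the application receives the literal characters; the references and the CDATA delimiters exist only in the serialized document.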
XML Characteristics: (Benefits)
XML Does Not DO Anything
 Maybe it is a little hard to understand, but XML does not DO anything. XML
was created to structure, store, and transport information.
 The following example is a note to Tove, from Jani, stored as XML:
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
With XML You Invent Your Own Tags
» The tags in the example above (like <to> and <from>) are not defined in any
XML standard. These tags are "invented" by the author of the XML document.
» That is because the XML language has no predefined tags.
» The tags used in HTML are predefined. HTML documents can only use tags
defined in the HTML standard (like <p>, <h1>, etc.).
» XML allows the author to define his/her own tags and his/her own document
structure.


XML is Not a Replacement for HTML


XML is a complement to HTML.
 It is important to understand that XML is not a replacement for HTML. In most
web applications, XML is used to transport data, while HTML is used to format
and display the data.
 My best description of XML is this:
XML is a software- and hardware-independent tool for carrying information.
XML is a W3C Recommendation
 XML became a W3C Recommendation on February 10, 1998.
XML is Everywhere
 XML is now as important for the Web as HTML was to the foundation of the
Web.
 XML is the most common tool for data transmissions between all sorts of
applications.
XML is used in many aspects of web development, often to simplify data storage
and sharing
XML Separates Data from HTML
» If you need to display dynamic data in your HTML document, it will take a lot of
work to edit the HTML each time the data changes.
» With XML, data can be stored in separate XML files. This way you can
concentrate on using HTML/CSS for display and layout, and be sure that
changes in the underlying data will not require any changes to the HTML.
» With a few lines of JavaScript code, you can read an external XML file and
update the data content of your web page.
XML Simplifies Data Sharing
» In the real world, computer systems and databases contain data in
incompatible formats.
» XML data is stored in plain text format. This provides a software- and
hardware-independent way of storing data.
» This makes it much easier to create data that can be shared by different
applications.
XML Simplifies Data Transport
» One of the most time-consuming challenges for developers is to exchange data
between incompatible systems over the Internet.

» Exchanging data as XML greatly reduces this complexity, since the data can be
read by different incompatible applications.
XML Simplifies Platform Changes
» Upgrading to new systems (hardware or software platforms), is always time
consuming. Large amounts of data must be converted and incompatible data is
often lost.
» XML data is stored in text format. This makes it easier to expand or upgrade to
new operating systems, new applications, or new browsers, without losing data.
XML Makes Your Data More Available
» Different applications can access your data, not only in HTML pages, but also
from XML data sources.
» With XML, your data can be available to all kinds of "reading machines"
(Handheld computers, voice machines, news feeds, etc), and make it more
available for blind people, or people with other disabilities.
XML is Used to Create New Internet Languages
A lot of new Internet languages are created with XML.
Here are some examples:
 XHTML
 WSDL for describing available web services
• WML as the markup language for handheld devices that use WAP
 RSS languages for news feeds
 RDF and OWL for describing resources and ontology
 SMIL for describing multimedia for the web
XML documents form a tree structure that starts at "the root" and branches to
"the leaves".
XML Documents Form a Tree Structure
 XML documents must contain a root element. This element is "the parent" of all
other elements.
 The elements in an XML document form a document tree. The tree starts at the
root and branches to the lowest level of the tree.
 All elements can have sub elements (child elements):
<root>
<child>
<subchild>.....</subchild>
</child>
</root>
 The terms parent, child, and sibling are used to describe the relationships
between elements. Parent elements have children. Children on the same level
are called siblings (brothers or sisters).
 All elements can have text content and attributes (just like in HTML).

Example:

The image above represents one book in the XML below:


<bookstore>
<book category="COOKING">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="CHILDREN">
<title lang="en">Harry Potter</title>
<author>J. K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="WEB">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
</bookstore>
 The root element in the example is <bookstore>. All <book> elements in the
document are contained within <bookstore>.
• The <book> element has 4 children: <title>, <author>, <year>, <price>.
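The parent and child relationships described above can be inspected with the DOM API. The following is a minimal sketch, assuming a shortened copy of the bookstore document (two books only); the class TreeDemo and its helpers parseRoot and childNames are hypothetical names used for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class: walks the element tree of a small bookstore document.
class TreeDemo {
    // Shortened copy of the bookstore example (assumption: two books are enough here).
    static final String XML =
        "<bookstore>"
      + "<book category=\"COOKING\"><title lang=\"en\">Everyday Italian</title>"
      + "<author>Giada De Laurentiis</author><year>2005</year><price>30.00</price></book>"
      + "<book category=\"WEB\"><title lang=\"en\">Learning XML</title>"
      + "<author>Erik T. Ray</author><year>2003</year><price>39.95</price></book>"
      + "</bookstore>";

    // Parses the XML text and returns its root element.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the names of the element children of a parent, separated by ", ".
    static String childNames(Element parent) {
        StringBuilder names = new StringBuilder();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                if (names.length() > 0) {
                    names.append(", ");
                }
                names.append(child.getNodeName());
            }
        }
        return names.toString();
    }

    public static void main(String[] args) {
        Element root = parseRoot(XML);
        System.out.println("root: " + root.getNodeName());
        Element firstBook = (Element) root.getElementsByTagName("book").item(0);
        System.out.println("children of first book: " + childNames(firstBook));
        System.out.println("category attribute: " + firstBook.getAttribute("category"));
    }
}
```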
Advantages of XML over HTML
XML syntax closely resembles HTML; data is enclosed between opening and closing
tags. However, XML is more flexible than HTML:
 XML encodes data in tightly-validated tree structures. Data is easy to locate
since its context is well defined by tags and rules of structure.
 HTML attempts to control the appearance and presentation of data, while XML
does not. XML defines data separately from its presentation. This makes XML
data easier to locate and manipulate.
 XML is a standard data format that permits applications to exchange
information across platforms and operating systems. HTML is markup used to
display information in a web browser.
 XML is open and extensible. XML authors can create their own tags. HTML is
limited by a fixed vocabulary that browser developers have agreed to support.
 XML is universally compatible. The XML file format is not tied to any particular
program, operating system, database, or network. XML can be used by non-
web applications to store data.
• XML files can be transformed into other types of documents. Transformation is
controlled using XSL style sheets. (XSL stands for Extensible Stylesheet Language).
Advantages of XML over EDI
• EDI adoption has been fairly widespread, though mainly among larger-sized
businesses.
• The cost of EDI implementation and ongoing maintenance can be measured in
the billions in aggregate.
• Millions of dollars in transactions occur on a daily basis using EDI-mediated
messages. It would be very difficult, if not impossible, to uproot all this activity
and replace it with exclusively XML-based transactions.
• These businesses have so much money and time invested in ANSI X12/EDI
that they will be fairly slow to adopt a new standard, which would necessitate
new processing technology, mapping software, and back-end integration.
• For them, it would seem that they would need to discard their existing, working
technology in favor of an unproven and still immature technology.
1) XML is a good replacement for EDI because it uses the Internet for the data
exchange.
2) Compared to EDI and other electronic commerce and data-interchange
standards, XML offers serious cost savings and efficiency enhancements that
make implementation of XML good for the bottom line.
3) XML's built-in validity checking, low-cost parsers and processing tools,
Extensible Stylesheet Language (XSL) based mapping, and use of the Internet
keep down much of the commerce chain cost.
4) The use of the Internet itself greatly lowers the barrier for small and medium-
sized companies that have found EDI too costly to implement.
5) The idea that XML represents a new, fresh approach to solving many lingering
problems in a flexible manner appeals to many in senior management.
6) XML syntax allows for international characters that follow the Unicode standard
to be included as content in any XML element.
Advantages of XML over Databases
• Relational and object-oriented databases and formats can represent data as
well as metadata, but for the most part, their formats are not text based.
 Most databases use a proprietary binary format to represent their information.
There are other text-based formats that include metadata regarding information
and are structured in a hierarchical representation, but they have not caught
on in popularity nearly to the extent that XML or even SGML has.
2. Explain in detail about the XML based standards and XML Namespace. (Nov
16)
XML BASED STANDARDS:
1) XPATH
• XPath is a syntax for defining parts of an XML document. XPath uses path
expressions to navigate in XML documents. XPath contains a library of
standard functions. XPath is a major element in XSLT. XPath is a W3C
Recommendation.
2) XSD
• An XML Schema Definition (XSD) defines the elements and attributes that can
appear in a document. It defines which elements are child elements, the
order of child elements, and the number of child elements. It defines
whether an element is empty or can include text. It defines data types for
elements and attributes. It defines default and fixed values for elements and
attributes.
3) XSL

 XSL describes how the XML document should be displayed! XSL consists of
three parts: XSLT - a language for transforming XML documents, XPath - a
language for navigating in XML documents, XSL-FO - a language for formatting
XML documents
4) XSLT
• A common way to describe the transformation process is to say that XSLT
transforms an XML source-tree into an XML result-tree. XSLT stands for XSL
Transformations. XSLT is the most important part of XSL. XSLT transforms an
XML document into another XML document. XSLT uses XPath to navigate in
XML documents. XSLT is a W3C Recommendation.
5) XFORMS:
• XForms is the next generation of HTML forms. XForms is richer and more
flexible than HTML forms. XForms was designed to be the forms standard in XHTML 2.0.
XForms is platform and device independent.
• XForms separates data and logic from presentation. XForms uses XML to define
form data.
• XForms stores and transports data in XML documents.
• XForms contains features like calculations and validations of forms. XForms
reduces or eliminates the need for scripting. XForms is a W3C
Recommendation. The XForms model is used to describe
the data.
6) XQuery:
 XQuery is the language for querying XML data.
 XQuery for XML is like SQL for databases.
 XQuery is built on XPath expressions.
 XQuery is supported by all the major database engines (IBM, Oracle,
Microsoft, etc.).
 XQuery is a W3C Recommendation.
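As an illustration of the XPath item in the list above, the sketch below evaluates a few path expressions with the javax.xml.xpath API that ships with Java. The class name XPathDemo, the helper eval, and the shortened bookstore document are assumptions made for this example.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class: evaluates XPath expressions against a small document.
class XPathDemo {
    // Shortened bookstore document (assumption: only the fields used below are kept).
    static final String XML =
        "<bookstore>"
      + "<book category=\"COOKING\"><title>Everyday Italian</title>"
      + "<author>Giada De Laurentiis</author><price>30.00</price></book>"
      + "<book category=\"CHILDREN\"><title>Harry Potter</title>"
      + "<author>J. K. Rowling</author><price>29.99</price></book>"
      + "<book category=\"WEB\"><title>Learning XML</title>"
      + "<author>Erik T. Ray</author><price>39.95</price></book>"
      + "</bookstore>";

    // Evaluates an XPath expression and returns the string value of the result.
    static String eval(String expression, String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Select by attribute value.
        System.out.println(eval("/bookstore/book[@category='WEB']/title", XML));
        // Select by position (XPath positions start at 1).
        System.out.println(eval("/bookstore/book[1]/author", XML));
        // Select by a numeric comparison on a child element.
        System.out.println(eval("/bookstore/book[price>35]/title", XML));
    }
}
```

The same path expressions are what XSLT and XQuery build on when they select nodes.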
XML Namespaces
XML Namespaces provide a method to avoid element name conflicts.
Name Conflicts
In XML, element names are defined by the developer. This often results in a conflict
when trying to mix XML documents from different XML applications.
This XML carries HTML table information:

<table>
<tr>
<td>Apples</td>
<td>Bananas</td>
</tr>
</table>
This XML carries information about a table (a piece of furniture):
<table>
<name>African Coffee Table</name>
<width>80</width>
<length>120</length>
</table>
If these XML fragments were added together, there would be a name conflict. Both
contain a <table> element, but the elements have different content and meaning.
An XML parser will not know how to handle these differences.
Solving the Name Conflict Using a Prefix
Name conflicts in XML can easily be avoided using a name prefix.
This XML carries information about an HTML table, and a piece of furniture:
<h:table>
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr>
</h:table>
<f:table>
<f:name>African Coffee Table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>
In the example above, there will be no conflict because the two <table> elements have
different names.
XML Namespaces - The xmlns Attribute
When using prefixes in XML, a so-called namespace for the prefix must be defined.
The namespace is defined by the xmlns attribute in the start tag of an element.
The namespace declaration has the following syntax. xmlns:prefix="URI".

<root>
<h:table xmlns:h="http://www.w3.org/TR/html4/">
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr>
</h:table>

<f:table xmlns:f="http://www.w3schools.com/furniture">
<f:name>African Coffee Table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>
</root>
• In the example above, the xmlns attribute in the <table> tag gives the h: and f:
prefixes a qualified namespace.
 When a namespace is defined for an element, all child elements with the same
prefix are associated with the same namespace.
 Namespaces can be declared in the elements where they are used or in the XML
root element:
<root xmlns:h="http://www.w3.org/TR/html4/"
xmlns:f="http://www.w3schools.com/furniture">

<h:table>
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr>
</h:table>
<f:table>
<f:name>African Coffee Table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>
</root>
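A namespace-aware parser can tell the two table elements apart by their namespace, not by their prefix. The sketch below is one way to see this in Java; the class NamespaceDemo and the helper expandedName are hypothetical, and the document is a shortened version of the example with the namespace URIs http://www.w3.org/TR/html4/ and http://www.w3schools.com/furniture.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class: resolves prefixed element names to namespace URIs.
class NamespaceDemo {
    static final String HTML_NS = "http://www.w3.org/TR/html4/";
    static final String FURNITURE_NS = "http://www.w3schools.com/furniture";

    // Shortened version of the two-table document (assumption for brevity).
    static final String XML =
        "<root xmlns:h=\"" + HTML_NS + "\" xmlns:f=\"" + FURNITURE_NS + "\">"
      + "<h:table><h:tr><h:td>Apples</h:td></h:tr></h:table>"
      + "<f:table><f:name>African Coffee Table</f:name></f:table>"
      + "</root>";

    // Parses with namespace processing switched on (it is off by default in JAXP).
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Formats an element name as {namespace URI}localName.
    static String expandedName(Element element) {
        return "{" + element.getNamespaceURI() + "}" + element.getLocalName();
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        Element htmlTable = (Element) doc.getElementsByTagNameNS(HTML_NS, "table").item(0);
        Element furniture = (Element) doc.getElementsByTagNameNS(FURNITURE_NS, "table").item(0);
        System.out.println(htmlTable.getNodeName() + " -> " + expandedName(htmlTable));
        System.out.println(furniture.getNodeName() + " -> " + expandedName(furniture));
    }
}
```

Both elements have the local name table, but each lookup returns only the element in the requested namespace.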

Default Namespaces
The default namespace for the elements of a document is specified using a form of the xmlns
attribute (xmlns="URI", with no prefix).

• Another form of the xmlns attribute, known as a namespace declaration, can be used
to associate a namespace prefix with a namespace name (xmlns:prefix="URI").

[Figure: a namespace declaration with its namespace prefix labelled, followed by an example use of the namespace prefix]

• In a namespace-aware XML application, all element and attribute names are
considered qualified names
– A qualified name has an associated expanded name that consists of a
namespace name and a local name
– Ex: item is a qualified name with expanded name <null, item>
– Ex: xhtml:a is a qualified name with expanded name
<http://www.w3.org/1999/xhtml, a>
3. Explain in detail about the structuring with schema. (XML Schema, DTD). (Apr
17)
XML document includes the following
• The xml declaration
• The document type declaration
• The element data
• The attribute data
• The character data or XML content
STRUCTURING WITH SCHEMAS:
RULES FOR XML STRUCTURE:
1. All XML elements must have a closing tag.

2. XML tags are case sensitive, and all XML elements must be properly nested.
3. All XML Documents must contain a single root element.
4. Attribute values must be quoted.
5. Attributes may only appear once in the same start tag.
6. Attribute values cannot contain references to external entities.
7. All entities except amp, lt, gt, apos and quot must be declared before they
are used.
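The rules above can be tried out by handing small documents to a parser and seeing which ones it rejects. This is a minimal sketch with the standard JAXP parser; the class WellFormedDemo and the helper isWellFormed are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: checks individual well-formedness rules.
class WellFormedDemo {
    // Returns true if the parser accepts the text as well-formed XML.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to the console.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the parser reported a well-formedness error
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] samples = {
            "<a><b></b></a>",          // well formed
            "<a><b></a>",              // rule 1: missing closing tag
            "<a><b></B></a>",          // rule 2: tags are case sensitive
            "<a><b><c></b></c></a>",   // rule 2: improper nesting
            "<a/><b/>",                // rule 3: more than one root element
            "<a x=1/>",                // rule 4: unquoted attribute value
            "<a x=\"1\" x=\"2\"/>",    // rule 5: attribute repeated in one start tag
            "<a>&nbsp;</a>"            // rule 7: undeclared entity
        };
        for (String sample : samples) {
            System.out.println(isWellFormed(sample) + "  " + sample);
        }
    }
}
```

Only the first sample is accepted; each of the others breaks the rule named in its comment.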
XML Schema:
An XML schema is used to represent the structure of an XML document. The
purpose of an XML schema is to define the building blocks of an XML document. It
can be used as an alternative to an XML DTD. The schema language is called the XML
Schema Definition (XSD) language.
• The purpose of an XML Schema is to define the legal building blocks of an XML
document, just like a DTD. XML Schemas are often preferred over DTDs for these reasons:
• XML Schemas are extensible to future additions
• XML Schemas are richer and more powerful than DTDs
• XML Schemas are written in XML
An XML schema defines elements, attributes, which elements are child elements, and the order of
child elements. It also defines fixed and default values of elements and attributes.
TWO TYPES OF SCHEMAS:
1. SIMPLE TYPE,
2. COMPLEX TYPE
SIMPLE TYPE:
XML Schema has a lot of built-in data types. The most common types are:
– xs:string
– xs:decimal
– xs:integer
– xs:boolean
– xs:date
– xs:time
Example:
Here are some XML elements:
<lastname>Refsnes</lastname>
<age>36</age>
<dateborn>1970-03-27</dateborn>
And here are the corresponding simple element definitions:
<xs:element name="lastname" type="xs:string"/>
<xs:element name="age" type="xs:integer"/>
<xs:element name="dateborn" type="xs:date"/>

COMPLEX TYPE:
A complex element is an XML element that contains other elements and/or attributes.
Look at this simple XML document called "note.xml":
<?xml version="1.0"?>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget to submit the assignment this Monday!</body>
</note>
The following example is a DTD file called "note.dtd" that defines the elements of the
XML document above ("note.xml"):
<!ELEMENT note (to, from, heading, body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
The following example is an XML Schema file called "note.xsd" that defines the
elements of the XML document above ("note.xml").

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.w3schools.com"
xmlns="http://www.w3schools.com"
elementFormDefault="qualified">
<xs:element name="note">
<xs:complexType>
<xs:sequence>
<xs:element name="to" type="xs:string"/>
<xs:element name="from" type="xs:string"/>
<xs:element name="heading" type="xs:string"/>
<xs:element name="body" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
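Validation against a schema can be sketched in Java with the javax.xml.validation API. To keep the example self-contained, it assumes a simplified copy of note.xsd without a target namespace, held in a string; the class SchemaDemo and the helper isValid are hypothetical names.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical demo class: validates note documents against a small XML Schema.
class SchemaDemo {
    // Simplified note schema (assumption: no targetNamespace, so plain <note> documents match).
    static final String XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
      + "<xs:element name=\"note\"><xs:complexType><xs:sequence>"
      + "<xs:element name=\"to\" type=\"xs:string\"/>"
      + "<xs:element name=\"from\" type=\"xs:string\"/>"
      + "<xs:element name=\"heading\" type=\"xs:string\"/>"
      + "<xs:element name=\"body\" type=\"xs:string\"/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    // Returns true if the document is valid according to XSD.
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            // validate() throws SAXException on the first validity error.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String good = "<note><to>Tove</to><from>Jani</from>"
                    + "<heading>Reminder</heading><body>Hello</body></note>";
        String wrongOrder = "<note><from>Jani</from><to>Tove</to>"
                          + "<heading>Reminder</heading><body>Hello</body></note>";
        System.out.println("good: " + isValid(good));
        System.out.println("wrong order: " + isValid(wrongOrder));
    }
}
```

Because the schema uses xs:sequence, a note whose children appear in a different order is well formed but not valid.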
Advantages
 Supports namespaces
 XML schemas support a set of data types, similar to the ones used in most
common programming languages
 Provides the ability to define custom data types
 Easy to describe allowable content in the document
 Easy to validate the correctness of data
 Object oriented approach like inheritance and encapsulation can be used in
creating the document
 Easier to convert data between different data types
 Easy to define data formats
 Easy to define restrictions on data
• It is written in XML, so any XML editor can be used to edit an XML schema
• An XML parser can be used to parse the schema file
 XML schemas are more powerful than DTDs. Everything that can be defined by
the DTD can also be defined by schemas, but not vice versa.
Disadvantages
 Complex to design and learn
 Maintaining XML schema for large XML document slows down the processing of
XML document
DTD
 The Document type Definition is used to define the basic building block
of any XML document.
 Using DTD we can specify the various element types, attributes and their
relationship with one another.
 Basically DTD is used to specify the set of rules for structuring data in
any XML file.
Various building blocks of XML are:
 Elements.
 Attribute.
 CDATA
 PCDATA
Elements:

• The basic building block is the element. Elements are used for defining the tags. An
element typically consists of an opening and a closing tag. Usually, one element declaration
is used to define a single tag.
<!ELEMENT student (name,address,std,marks)>

Attribute:
• Attributes are generally used to specify additional values for an element. Attribute values are
specified within quotes.
<flag type="true">
CDATA:
• CDATA stands for character data. This character data will not be parsed by the
parser.
• The term CDATA is used for text data that should not be parsed by the XML
parser.
• Characters like "<" and "&" are illegal as literal text inside XML elements.
• "<" will generate an error because the parser interprets it as the start of a new
element.
• "&" will generate an error because the parser interprets it as the start of an
entity reference.
• Some text, like JavaScript code, contains a lot of "<" or "&" characters. To avoid
errors, script code can be defined as CDATA.
PCDATA:
• It stands for Parsed Character Data.
• Parsed character data must not contain literal markup characters such as "<" and "&"; they are
written as entity references instead.
• XML parsers normally parse all the text in an XML document.
• When an XML element is parsed, the text between the XML tags is also parsed:
<message>This text is also parsed</message>
• Parsed Character Data (PCDATA) is a term used for text data that will be
parsed by the XML parser.
<!ELEMENT name (#PCDATA)>
Types of DTD:
1. internal DTD
2. External DTD
Internal DTD:
• An internal DTD is declared inside the DOCTYPE declaration of the XML file itself:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE student [
<!ELEMENT student (name,address,place)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT address (#PCDATA)>
<!ELEMENT place (#PCDATA)> ]>
<student>
<name>ARUN</name>
<address>KK NAGAR</address>
<place>Villupuram</place>
</student>
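Checking a document against its DTD requires a validating parser. The sketch below switches validation on with setValidating(true) and treats any validity error as a failure; the class DtdDemo, the helper isValid, and the constant DTD (an internal subset for the student document) are names assumed for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: validates a document against its internal DTD.
class DtdDemo {
    // Internal DTD subset for the student document.
    static final String DTD =
        "<!DOCTYPE student ["
      + "<!ELEMENT student (name,address,place)>"
      + "<!ELEMENT name (#PCDATA)>"
      + "<!ELEMENT address (#PCDATA)>"
      + "<!ELEMENT place (#PCDATA)>"
      + "]>";

    // Returns true if the document is well formed and valid against its DTD.
    static boolean isValid(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // enable DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler() {
                // Validity errors are recoverable by default, so rethrow them explicitly.
                @Override
                public void error(SAXParseException e) throws SAXException {
                    throw e;
                }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String valid = DTD + "<student><name>ARUN</name><address>KK NAGAR</address>"
                     + "<place>Villupuram</place></student>";
        String missingAddress = DTD + "<student><name>ARUN</name>"
                              + "<place>Villupuram</place></student>";
        System.out.println("valid: " + isValid(valid));
        System.out.println("missing address: " + isValid(missingAddress));
    }
}
```

The second document is well formed, but it is rejected because the DTD requires name, address, and place in that order.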

External DTD:
• An external DTD file is created and its name must be specified in the
corresponding XML file.
DTD File: (student.dtd)
<!ELEMENT student (name,address,place)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT address (#PCDATA)>
<!ELEMENT place (#PCDATA)>
XML File: (externaldtd.xml)
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE student SYSTEM "student.dtd">
<student>
<name>ARUN</name>
<address>KK NAGAR</address>
<place>Villupuram</place>
</student>

Merits of DTD:
 It is used to define the structural components of XML documents.
 These are relatively simple and compact.
 DTD can be defined inline and external in the XML documents.
Demerits of DTD:
• DTDs are very basic and hence cannot express detailed constraints for complex
documents.
• DTDs are not aware of the namespace concept.
• A DTD cannot define the type of data contained within an XML document.
Hence, using a DTD we cannot specify whether an element holds numeric or string
data.
• Some XML processors do not understand DTD declarations.
4. Explain DOM and SAX based XML processing with example. (Nov 16, Apr 17)
DOM based XML Processing
 The primary goal of any XML processor is to parse the given XML document.
• Java has a rich set of built-in APIs for parsing the given XML document.
 It is parsed in two ways:
1. Tree based Parsing (DOM)
2. Event based Parsing (SAX)

 The XML DOM contains methods (functions) to traverse XML trees, access,
insert, and delete nodes.
 However, before an XML document can be accessed and manipulated, it must
be loaded into an XML DOM object.

 An XML parser reads XML, and converts it into an XML DOM object that can be
accessed with JavaScript.
 Most browsers have a built-in XML parser.
Example:

ParsingDomDemo.java

import java.io.*;
import javax.xml.parsers.*;
import org.w3c.dom.*;
import org.xml.sax.*;

public class ParsingDomDemo {
    public static void main(String[] arg) {
        try {
            // Read the file name from the command prompt
            System.out.println("Enter the name of XML document");
            BufferedReader input = new BufferedReader(new InputStreamReader(System.in));
            String file_name = input.readLine();
            File fp = new File(file_name);
            if (fp.exists()) {
                try {
                    // Obtain a DOM parser from the factory
                    DocumentBuilderFactory Factory_obj = DocumentBuilderFactory.newInstance();
                    DocumentBuilder builder = Factory_obj.newDocumentBuilder();
                    // parse() builds the DOM tree and throws an exception
                    // if the document is not well-formed
                    Document doc = builder.parse(fp);
                    System.out.println(file_name + " is well-formed!!!");
                } catch (SAXException | ParserConfigurationException e) {
                    System.out.println(file_name + " isn't well-formed!");
                    System.exit(1);
                }
            } else {
                System.out.print("File not found");
            }
        } catch (IOException ex) {
            ex.printStackTrace();
        }
    }
}

XML: student.xml
<?xml version="1.0"?>
<student>
<Roll_No>10</Roll_No>
<Personal_Info>
<Name>Parth</Name>
<Address>Pune</Address>
<Phone>123456</Phone>
</Personal_Info>
<Class>Second</Class>
<Subject>Mathematics</Subject>
<Marks>100 <!-- Purposely made this statement like this! (It is not well formed) -->

<Roll_No>20</Roll_No>
<Personal_Info>
<Name>Anuradha</Name>
<Address>Banglore</Address>
<Phone>156438</Phone>
</Personal_Info>
<Class>Fifth</Class>
<Subject>English</Subject>
<Marks>90</Marks>

<Roll_No>30</Roll_No>
<Personal_Info>
<Name>Anandh</Name>
<Address>Mumbai</Address>
<Phone>7678453</Phone>
</Personal_Info>
<Class>Fifth</Class>
<Subject>English</Subject>
<Marks>90</Marks>
</student>


Explanation:
1. The javax.xml.parsers.* package provides the classes that allow XML documents to be
processed, such as DocumentBuilder and DocumentBuilderFactory.
2. The package org.w3c.dom.* provides the interfaces for the Document Object Model,
which is a component API of the Java API for XML Processing.
3. The package org.xml.sax.* provides the classes and interfaces for the Simple API for
XML (SAX), which is also a component API of the Java API for XML Processing.
4. The name of the XML document is read from the command prompt by wrapping the
standard input stream in a BufferedReader.
5. DocumentBuilderFactory is a factory API through which an application obtains a parser.
This parser produces a DOM object tree from the given XML document.
Using the DocumentBuilderFactory object, an instance of DocumentBuilder
is created, and the DocumentBuilder object is used to invoke the parse method.
This method takes the XML document as input and parses it. If the XML document is
well formed, an appropriate message is displayed in the command
prompt. A document is well formed when it follows the XML syntax rules, for example
every start tag has a matching end tag and the elements are properly nested.
6. The parse method is called inside the try block.
If the XML document is not well formed, parse throws an exception and control passes
to the catch block.
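Beyond checking well-formedness, the DOM tree can be changed once it is loaded. The following sketch inserts and deletes nodes on a small student record; the class DomEditDemo and the helpers addChild and removeFirst are hypothetical, and the record is a shortened version of student.xml.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class: inserts and deletes nodes in a DOM tree.
class DomEditDemo {
    // Loads the XML text into a DOM Document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Appends a new child element with the given text and returns it.
    static Element addChild(Element parent, String name, String text) {
        Element child = parent.getOwnerDocument().createElement(name);
        child.setTextContent(text);
        parent.appendChild(child);
        return child;
    }

    // Removes the first descendant with the given name; returns false if none exists.
    static boolean removeFirst(Element parent, String name) {
        Node node = parent.getElementsByTagName(name).item(0);
        if (node == null) {
            return false;
        }
        node.getParentNode().removeChild(node);
        return true;
    }

    public static void main(String[] args) {
        Document doc = parse("<student><Roll_No>10</Roll_No><Name>Parth</Name>"
                           + "<Phone>123456</Phone></student>");
        Element root = doc.getDocumentElement();
        addChild(root, "Marks", "100");   // insert a node
        removeFirst(root, "Phone");       // delete a node
        System.out.println("Phone elements left: "
                + root.getElementsByTagName("Phone").getLength());
        System.out.println("Marks: "
                + root.getElementsByTagName("Marks").item(0).getTextContent());
    }
}
```

The changes affect only the in-memory tree; writing them back to a file would need a separate serialization step.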
Event-oriented Parsing: SAX
SAX Parser
• The Simple API for XML (SAX) is a serial access parser API for XML. It is used to
read an XML document sequentially; unlike DOM, it does not build a tree in memory, so it
cannot be used on its own to update or create a document.
• As the parser reads the XML document, it recognizes and
responds to each XML structure, taking some specified action based on the
structure type.
• It is an event-driven model for processing XML: the application registers a
handler, and the parser invokes the handler's callback methods whenever an
event is generated. An event is generated when the parser encounters a new XML
tag, encounters an error, or has anything else to report.

Example:
EmployeeDetails.java
import javax.xml.parsers.*;
import org.xml.sax.*;
import org.xml.sax.helpers.*;
import java.io.*;
public class EmployeeDetails
{
public static void main(String[] args) throws IOException
{
BufferedReader bf = new BufferedReader(new
InputStreamReader(System.in));
System.out.print("Enter XML file name:");
String xmlFile = bf.readLine();
EmployeeDetails detail = new EmployeeDetails(xmlFile);
}

public EmployeeDetails(String str)


{
try
{
File file = new File(str);
if (file.exists())
{
SAXParserFactory parserFact = SAXParserFactory.newInstance();
SAXParser parser = parserFact.newSAXParser();
System.out.println("XML Data: ");
DefaultHandler dHandler = new DefaultHandler()
{
boolean id;
boolean name;
boolean mail;

IV Year/VII Sem 22
IT-T72 Web Services and XML UNIT I

public void startElement(String uri, String localName, String


element_name, Attributes attributes)throws SAXException
{
if (element_name.equals("Emp_Id"))
{
id = true;
}
if (element_name.equals("Emp_Name"))
{
name = true;
}
if (element_name.equals("Emp_E-mail"))
{
mail = true;
}
}

public void characters(char[] ch, int start, int len) throws SAXException
{
String str = new String (ch, start, len);
if (id)
{
System.out.println("Emp_Id: "+str);
id = false;
}
if (name)
{
System.out.println("Name: "+str);
name = false;
}
if (mail)
{
System.out.println("E-mail: "+str);
mail = false;
}
}
};

parser.parse(str, dHandler);
}
else
{
System.out.println("File not found!");
}
}

catch (Exception e)
{
System.out.println("Error while parsing the XML file");
e.printStackTrace();
}
}
}
Employee-Detail.xml
<?xml version = "1.0" ?>
<Employee-Detail>
<Employee>
<Emp_Id> 11032 </Emp_Id>
<Emp_Name> Hari </Emp_Name>
<Emp_E-mail> [email protected] </Emp_E-mail>
</Employee>
<Employee>
<Emp_Id> 11022 </Emp_Id>
<Emp_Name> Ashok kumar </Emp_Name>
<Emp_E-mail> [email protected] </Emp_E-mail>
</Employee>
<Employee>
<Emp_Id> 11011 </Emp_Id>
<Emp_Name> Elavarasan </Emp_Name>
<Emp_E-mail> [email protected] </Emp_E-mail>
</Employee>
</Employee-Detail>
Output:
H:\WT LAB\Programs\Ex8>java EmployeeDetails
Enter XML file name:Employee-Detail.xml
XML Data:
Emp_Id: 11032
Name: Hari
E-mail: [email protected]
Emp_Id: 11022
Name: Ashok kumar
E-mail: [email protected]
Emp_Id: 11011
Name: Elavarasan
E-mail: [email protected]
Description of program:
 In this example you need a well-formed XML file that has some data (Emp_Id,
Emp_Name and Emp_E-mail in our case).
 Create a Java program (EmployeeDetails.java) that retrieves data from it. When
you run the program, it prompts with the message "Enter XML file name:" at
the command line and checks that the file exists through the exists() method.
 If the given file exists, the instance of the SAXParser class parses the file using the
parse() method.
 The startElement() method sets a flag (id, name or mail) to true when it meets
the matching tag. The characters() method prints the data while a flag is set
and then resets that flag to false.
 If the file doesn't exist, the program displays the message "File not found!".
 The characters(char[] ch, int start, int len) method receives the character
data. The parser calls this method to report each chunk of character data it
encounters.
If any error occurs it throws a SAXException. This method takes the following
parameters:
ch: The characters of the XML document.
start: The starting position in the array.
len: The number of characters to read from the array.
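One point worth illustrating: a SAX parser is allowed to deliver the text of a single element in several characters() calls, so printing inside characters() can split a value. The sketch below shows a common alternative, assuming a hypothetical class name EmployeeSaxReader and an inline XML string: text is buffered and stored when endElement() arrives.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical variant of the example: collects each Employee into a map.
public class EmployeeSaxReader {

    public static List<Map<String, String>> read(String xml) throws Exception {
        List<Map<String, String>> employees = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();
            private Map<String, String> current;   // the Employee being read, if any

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                text.setLength(0);                  // start collecting text for this element
                if (qName.equals("Employee")) {
                    current = new LinkedHashMap<>();
                }
            }

            @Override
            public void characters(char[] ch, int start, int len) {
                text.append(ch, start, len);        // may be called more than once per element
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (qName.equals("Employee")) {
                    employees.add(current);
                    current = null;
                } else if (current != null) {
                    current.put(qName, text.toString().trim());
                }
                text.setLength(0);
            }
        };

        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return employees;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Employee-Detail><Employee>"
                + "<Emp_Id> 11032 </Emp_Id><Emp_Name> Hari </Emp_Name>"
                + "</Employee></Employee-Detail>";
        System.out.println(read(xml));
    }
}
```

The boolean flags of the original program are replaced by the element name passed to endElement(), which keeps the handler the same size however many fields an Employee has.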
Advantages of SAX:
1. SAX is an event-based parsing method used to parse the given XML
document.
2. The parsing is driven by a sequence of events that are reported to
handler functions.
3. The XML document is parsed node by node, so this
method does not consume much memory: the
complete XML document need not be stored in memory.
Limitation: SAX is read-only and forward-only, so insertion and deletion of
nodes are not possible; they require a tree-based API such as DOM.
5. Explain in detail about the presentation technologies in XML.
PRESENTATION TECHNOLOGIES:
1) XSL
2) XFORMS
3) XHTML
XSL & XSLT:
 XSL stands for EXtensible Stylesheet Language.
What is XSLT?
 XSLT stands for XSL Transformations. XSLT is the most important part of XSL.
XSLT transforms an XML document into another XML document. XSLT uses
XPath to navigate in XML documents. XSLT is a W3C Recommendation.
We want to transform the following XML document ("cdcatalog.xml") into XHTML:
<?xml version="1.0" encoding="ISO-8859-1"?>
<catalog>
<cd>
<title>Empire Burlesque</title>
<artist>Bob Dylan</artist>
<country>USA</country>
<company>Columbia</company>
<price>10.90</price>
<year>1985</year>
</cd> . . .
</catalog>
Then you create an XSL Style Sheet ("cdcatalog.xsl") with a transformation template:
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html>
<body>
<h2>My CD Collection</h2>
<table border="1">
<tr bgcolor="#9acd32">
<th align="left">Title</th>
<th align="left">Artist</th>
</tr>
<xsl:for-each select="catalog/cd">
<tr>
<td><xsl:value-of select="title"/></td>
<td><xsl:value-of select="artist"/></td>
</tr>
</xsl:for-each>
</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
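The result is an HTML page with the heading "My CD Collection" and a two-column
table (Title, Artist) that contains one row for each cd element.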
XFORMS:
 XForms was designed as the next generation of HTML forms. XForms is richer and more
flexible than HTML forms. XForms was intended to be the forms standard in XHTML 2.0,
a specification that the W3C later discontinued.
XForms is platform and device independent.
 XForms separates data and logic from presentation. XForms uses XML to define
form data.
 XForms stores and transports data in XML documents.
 XForms contains features like calculations and validations of forms. XForms
reduces or eliminates the need for scripting. XForms is a W3C
Recommendation.
The XForms Model
The XForms model is used to describe the data. The data model is an instance (a
template) of an XML document.
The XForms model defines a data model inside a <model> element:
<model>
<instance>
<person>
<fname/>
<lname/>
</person>
</instance>
<submission id="form1" action="submit.asp" method="get"/>
</model>
All together it looks as below
<xforms>
<model>
<instance>
<person><fname/><lname/></person>
</instance>
<submission id="form1" action="submit.asp" method="get"/>
</model>
<input ref="fname">
<label>First Name</label>
</input>
<input ref="lname">
<label>Last Name</label>
</input>
<submit submission="form1">
<label>Submit</label>
</submit>
</xforms>
Output: an XForms-capable processor renders two text fields labelled "First Name"
and "Last Name", followed by a "Submit" button.

XHTML:
 XHTML stands for EXtensible HyperText Markup Language.
 XHTML was intended to replace HTML. XHTML is almost identical to HTML 4.01.
XHTML is a stricter and cleaner version of HTML.
 XHTML is HTML defined as an XML application.
 XHTML is a W3C Recommendation. XHTML elements must be properly nested.
XHTML elements must always be closed. XHTML elements must be in
lowercase.
XHTML documents must have one root element.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>simple document</title>
</head>
<body><p>a simple paragraph</p></body>
</html>
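Because XHTML is an XML application, the nesting, closing and single-root rules listed above are ordinary XML well-formedness rules, and any XML parser can check them. The following is a minimal sketch under that assumption. The class name WellFormedCheck is hypothetical, the sample strings omit the DOCTYPE so that no DTD is fetched, and the lowercase rule is not covered because it is enforced by the DTD rather than by well-formedness.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: reports whether markup is well-formed XML.
public class WellFormedCheck {

    public static boolean isWellFormed(String markup) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // DefaultHandler rethrows fatal errors, so nothing is printed to stderr by the parser.
        builder.setErrorHandler(new DefaultHandler());
        try {
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            return false;   // unclosed tag, bad nesting, several roots, ...
        }
    }

    public static void main(String[] args) throws Exception {
        String good = "<html><head><title>t</title></head><body><p>ok</p></body></html>";
        String unclosed = "<html><body><p>line one<br></p></body></html>";
        String badNesting = "<p><b>text</p></b>";
        String twoRoots = "<p>a</p><p>b</p>";

        System.out.println(isWellFormed(good));
        System.out.println(isWellFormed(unclosed));
        System.out.println(isWellFormed(badNesting));
        System.out.println(isWellFormed(twoRoots));
    }
}
```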
Document Type Definitions:
1) A DTD specifies the syntax of a web page.
2) DTDs are used by SGML applications, such as HTML, and by XML applications, such as
XHTML, to specify rules that apply to the
markup of documents of a particular type, including a set of element and entity
declarations.
3) XHTML is specified in an XML document type definition or 'DTD'.
An XHTML DTD describes in precise, computer-readable language, the allowed syntax
and grammar of XHTML markup.
XHTML 1.0 specifies three XML document types that correspond to three DTDs:
i. Strict
ii. Transitional
iii. Frameset

XHTML 1.0 Strict:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
Use this when you want really clean markup, free of presentational clutter. Use
it together with Cascading Style Sheets.
XHTML 1.0 Transitional:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
Use this when you need to take advantage of HTML's presentational features
and when you want to support browsers that don't understand Cascading Style
Sheets.
XHTML 1.0 Frameset:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">
Use this when you want to use HTML Frames to partition the browser window
into two or more frames.
6. Explain in detail about the Transformation of XML Documents.
TRANSFORMATION:
• XSLT
• XLINK
• XPATH
• XQuery
XSLT:
 XSLT stands for XSL Transformations
 XSLT is the most important part of XSL
 XSLT transforms an XML document into another XML document
 XSLT uses XPath to navigate in XML documents
 XSLT is a W3C Recommendation.
 XSLT is used to transform an XML document into another XML document, or
another type of document that is recognized by a browser, like HTML and
XHTML. Normally XSLT does this by transforming each XML element into an
(X)HTML element.
 With XSLT you can add/remove elements and attributes to or from the output
file. You can also rearrange and sort elements, perform tests and make
decisions about which elements to hide and display, and a lot more.
 A common way to describe the transformation process is to say that XSLT
transforms an XML source-tree into an XML result-tree.
Explanation:
 The XSLT processor takes two input documents: one is the XML source document and
the other is the XSLT stylesheet.
 The XSLT stylesheet acts as the program and the XML document is
the input data on which this program works.
 The processor selects some or all parts of the XML document, modifies them and
merges them with the literal content of the stylesheet's templates in order to
produce a new document, called the result document.
 The result document (for example HTML or XHTML) is then displayed by the web
browser in the desired
manner.
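The two-input model can be tried from Java with the standard javax.xml.transform API. This is a sketch only: the class name XsltDemo, the shortened catalog and the text-output stylesheet are assumptions chosen to keep the example small, not the stylesheet used elsewhere in these notes.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo: the processor receives a stylesheet (program) and an XML document (data).
public class XsltDemo {

    public static String transform(String xml, String xslt) throws Exception {
        // Input 1: the stylesheet is compiled into a Transformer.
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));

        // Input 2: the XML source document; the result is written to a string.
        StringWriter result = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(result));
        return result.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<catalog><cd><title>Empire Burlesque</title>"
                + "<artist>Bob Dylan</artist></cd></catalog>";

        String xslt =
              "<xsl:stylesheet version=\"1.0\""
            + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:output method=\"text\"/>"
            + "<xsl:template match=\"/\">"
            + "<xsl:for-each select=\"catalog/cd\">"
            + "<xsl:value-of select=\"title\"/> by <xsl:value-of select=\"artist\"/>"
            + "<xsl:text>&#10;</xsl:text>"
            + "</xsl:for-each>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";

        System.out.print(transform(xml, xslt));
    }
}
```

Changing only the stylesheet string changes the output format, while the XML data stays untouched.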
How does it Work?
In the transformation process, XSLT uses XPath to define parts of the source
document that should match one or more predefined templates. When a match is
found, XSLT will transform the matching part of the source document into the result
document.
Example:
Emp.xml
<?xml version = "1.0" ?>
<?xml-stylesheet href="emp.xsl" type="text/xsl"?>
<Employee-Detail>
<Employee>
<Emp_Id> 11032 </Emp_Id>
<Emp_Name> Sashini </Emp_Name>
<Emp_E-mail> [email protected] </Emp_E-mail>
</Employee>
<Employee>
<Emp_Id> 11022 </Emp_Id>
<Emp_Name> Mathi</Emp_Name>
<Emp_E-mail> [email protected] </Emp_E-mail>
</Employee>
<Employee>
<Emp_Id> 11011 </Emp_Id>
<Emp_Name> Arun </Emp_Name>
<Emp_E-mail> [email protected] </Emp_E-mail>
</Employee>
</Employee-Detail>
Emp.xsl
<?xml version="1.0" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:output method="html" indent="yes"/>
<xsl:template match="/">
<html>
<head><title>XSLT Style Sheet</title></head>
<body>
<h1 align="center">Employee Details</h1>
<xsl:apply-templates/>
</body>
</html>
</xsl:template>
<xsl:template match="Employee-Detail">
<table border="2" width="50%" align="center">
<tr bgcolor="LIGHTBLUE">
<td><b>Emp_Id</b></td>
<td><b>Emp_Name</b></td>
<td><b>Emp_E-mail</b></td>
</tr>
<xsl:for-each select="Employee">
<tr>
<td><i><xsl:value-of select="Emp_Id"/></i></td>
<td><xsl:value-of select="Emp_Name"/></td>
<td><xsl:value-of select="Emp_E-mail"/></td>
</tr>
</xsl:for-each>
</table>
</xsl:template>
</xsl:stylesheet>
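Output: a page headed "Employee Details" with a centered table whose header row
is Emp_Id, Emp_Name, Emp_E-mail, followed by one row for each Employee element.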
XLINK:
XLink Syntax:
 In HTML, we know (and all the browsers know!) that the <a> element defines a
hyperlink. However, this is not how it works with XML.
 In XML documents, you can use whatever element names you want - therefore
it is impossible for browsers to predict what hyperlink elements will be called in
XML documents.
The solution for creating links in XML documents was to put a marker on elements
that should act as hyperlinks.
Example:
<?xml version="1.0"?>
<homepages xmlns:xlink="http://www.w3.org/1999/xlink">
<homepage xlink:type="simple" xlink:href="http://www.w3schools.com">
Visit W3Schools
</homepage>
<homepage xlink:type="simple" xlink:href="http://www.w3.org">
Visit W3C
</homepage>
</homepages>
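The "marker" is the pair of attributes in the XLink namespace, so a program recognizes a link by its namespace and not by the element name. A minimal sketch, assuming a hypothetical class name XLinkReader and the standard DOM API:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: lists the targets of all simple XLinks in a document.
public class XLinkReader {

    static final String XLINK_NS = "http://www.w3.org/1999/xlink";

    public static List<String> simpleLinks(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);   // required for getAttributeNS to work
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        List<String> targets = new ArrayList<>();
        NodeList all = doc.getElementsByTagName("*");   // every element, whatever its name
        for (int i = 0; i < all.getLength(); i++) {
            Element element = (Element) all.item(i);
            if ("simple".equals(element.getAttributeNS(XLINK_NS, "type"))) {
                targets.add(element.getAttributeNS(XLINK_NS, "href"));
            }
        }
        return targets;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<homepages xmlns:xlink=\"http://www.w3.org/1999/xlink\">"
                + "<homepage xlink:type=\"simple\" xlink:href=\"http://www.w3.org\">Visit W3C</homepage>"
                + "<note>no link here</note>"
                + "</homepages>";
        System.out.println(simpleLinks(xml));
    }
}
```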
XPATH
 XPath is a syntax for defining parts of an XML document
 XPath uses path expressions to navigate in XML documents
 XPath contains a library of standard functions
 XPath is a major element in XSLT
 XPath is a W3C recommendation
XPath Terminology
Nodes
 In XPath, there are seven kinds of nodes: element, attribute, text,
namespace, processing-instruction, comment, and document
nodes.
 XML documents are treated as trees of nodes. The topmost
element of the tree is called the root element.
Look at the following XML document:
<?xml version="1.0" encoding="ISO-8859-1"?>
<bookstore>
<book>
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
</bookstore>
Example of nodes in the XML document above:
<bookstore> (root element node)
<author>J K. Rowling</author> (element node)
lang="en" (attribute node)
Items : Items are atomic values or nodes.
Relationship of Nodes
Parent: Each element and attribute has one parent.
In the above example, the book element is the parent of the title, author, year, and
price elements.
Children: Element nodes may have zero, one or more children.
In the above example, the title, author, year, and price elements are all children of the
book element.
Siblings: Nodes that have the same parent.
In the above example, the title, author, year, and price elements are all siblings.
Ancestors: A node's parent, parent's parent, etc.
In the above example, the ancestors of the title element are the book element and the
bookstore element.
Descendants: A node's children, children's children, etc.
In the above example, the descendants of the bookstore element are the book, title,
author, year, and price elements.
The XML Example Document
We will use the following XML document in the examples below.
<?xml version="1.0" encoding="ISO-8859-1"?>
<bookstore>
<book>
<title lang="eng">Harry Potter</title>
<price>29.99</price>
</book>
<book>
<title lang="eng">Learning XML</title>
<price>39.95</price>
</book>
</bookstore>
Selecting Nodes
XPath uses path expressions to select nodes in an XML document. The node is
selected by following a path or steps. The most useful path expressions are listed
below:

Expression    Description
nodename      Selects all nodes with the name "nodename"
/             Selects from the root node
//            Selects nodes in the document from the current node that match the
              selection no matter where they are
.             Selects the current node
..            Selects the parent of the current node
@             Selects attributes

In the table below we have listed some path expressions and the result of the
expressions:

Path Expression    Result
bookstore          Selects all nodes with the name "bookstore"
/bookstore         Selects the root element bookstore
                   Note: If the path starts with a slash ( / ) it always represents an
                   absolute path to an element!
bookstore/book     Selects all book elements that are children of bookstore
//book             Selects all book elements no matter where they are in the document
bookstore//book    Selects all book elements that are descendants of the bookstore
                   element, no matter where they are under the bookstore element
//@lang            Selects all attributes that are named lang
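
The path expressions above can be evaluated from Java with the standard javax.xml.xpath API. The sketch below is illustrative: the class name XPathSelect is an assumption, and the two-book example document is embedded as a string so that the program is self-contained.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: evaluates a path expression and returns the text of each selected node.
public class XPathSelect {

    static final String BOOKS =
          "<bookstore>"
        + "<book><title lang=\"eng\">Harry Potter</title><price>29.99</price></book>"
        + "<book><title lang=\"eng\">Learning XML</title><price>39.95</price></book>"
        + "</bookstore>";

    public static List<String> select(String xml, String path) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
                path, new InputSource(new StringReader(xml)), XPathConstants.NODESET);

        List<String> values = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            values.add(nodes.item(i).getTextContent());   // element text or attribute value
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(select(BOOKS, "/bookstore/book/title"));
        System.out.println(select(BOOKS, "//@lang"));
        System.out.println(select(BOOKS, "bookstore//price"));
    }
}
```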

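Predicates:
Predicates are written in square brackets and are used to find a specific node or a
node that contains a specific value.
/bookstore/book[1]              Selects the first book element that is a child of bookstore
/bookstore/book[last()]         Selects the last book element that is a child of bookstore
//title[@lang='eng']            Selects all title elements whose lang attribute has the value 'eng'
/bookstore/book[price>35.00]    Selects all book elements of bookstore that have a price
                                element with a value greater than 35.00

Selecting Unknown Nodes:
XPath wildcards can be used to select unknown XML nodes.
*         Matches any element node
@*        Matches any attribute node
node()    Matches any node of any kind

Selecting several paths:
The | operator in an XPath expression selects several paths.
//book/title | //book/price     Selects all the title and price elements of all book elements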
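
The three features named in the headings above (predicates, wildcards for unknown nodes, and the | operator for several paths) can be checked against the two-book example document. This is a sketch under stated assumptions: the class name XPathPredicates and the specific expressions are chosen for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo of predicates, wildcards and the union operator.
public class XPathPredicates {

    static final String BOOKS =
          "<bookstore>"
        + "<book><title lang=\"eng\">Harry Potter</title><price>29.99</price></book>"
        + "<book><title lang=\"eng\">Learning XML</title><price>39.95</price></book>"
        + "</bookstore>";

    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static List<String> select(Document doc, String path) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(path, doc, XPathConstants.NODESET);
        List<String> values = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            values.add(nodes.item(i).getTextContent());
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(BOOKS);
        // Predicates: position and value tests inside square brackets.
        System.out.println(select(doc, "/bookstore/book[1]/title"));
        System.out.println(select(doc, "/bookstore/book[last()]/title"));
        System.out.println(select(doc, "/bookstore/book[price>35.00]/title"));
        // Unknown nodes: * matches any element, @* any attribute.
        System.out.println(select(doc, "/bookstore/*").size());
        System.out.println(select(doc, "//title/@*"));
        // Several paths: | combines two selections.
        System.out.println(select(doc, "//book/title | //book/price").size());
    }
}
```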

Example:
<!DOCTYPE html>
<html>
<body>
<script>
function loadXMLDoc(dname)
{
if (window.XMLHttpRequest)
{
xhttp=new XMLHttpRequest();
}
else
{
xhttp=new ActiveXObject("Microsoft.XMLHTTP");
}
xhttp.open("GET",dname,false);
xhttp.send("");
return xhttp.responseXML;
}

xml=loadXMLDoc("books.xml");
path="/bookstore/book/title"
// code for IE
if (window.ActiveXObject)
{
var nodes=xml.selectNodes(path);

for (i=0;i<nodes.length;i++)
{
document.write(nodes[i].childNodes[0].nodeValue);
document.write("<br>");
}
}

// code for Mozilla, Firefox, Opera, etc.
else if (document.implementation &&
document.implementation.createDocument)
{
var nodes=xml.evaluate(path, xml, null, XPathResult.ANY_TYPE, null);
var result=nodes.iterateNext();

while (result)
{
document.write(result.childNodes[0].nodeValue);
document.write("<br>");
result=nodes.iterateNext();
}
}
</script>
</body>
</html>
Output:
Everyday Italian
Harry Potter
XQuery Kick Start
Learning XML
XQuery:
 XQuery is the language for querying XML data.
 XQuery for XML is like SQL for databases.
 XQuery is built on XPath expressions.
 XQuery is supported by all the major database engines (IBM, Oracle,
Microsoft, etc.).
 XQuery is a W3C Recommendation.
The XQuery examples below use the following XML document ("books.xml"):
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
<book category="COOKING">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="CHILDREN">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="WEB">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
</bookstore>
Functions:
 XQuery uses functions to extract data from XML documents.
 The doc() function is used to open the "books.xml" file:
doc("books.xml")
Path Expressions:
 XQuery uses path expressions to navigate through elements in an XML
document.
 The following path expression is used to select all the title elements in the
"books.xml" file:
doc("books.xml")/bookstore/book/title
(/bookstore selects the bookstore element, /book selects all the book elements under
the bookstore element, and /title selects all the title elements under each book
element.)
The XQuery above will extract the following:
<title lang="en">Everyday Italian</title>
<title lang="en">Harry Potter</title>
<title lang="en">Learning XML</title>
Predicates:
 XQuery uses predicates to limit the extracted data from XML documents.
 The following predicate is used to select all the book elements under the
bookstore element that have a price element with a value that is less
than 30:
doc("books.xml")/bookstore/book[price<30]
 The XQuery above will extract the following:
<book category="CHILDREN">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
With FLWOR:
 FLWOR is an acronym for "For, Let, Where, Order by, Return".
Example: the following path expression selects all the title elements under the
book elements whose price is greater than 30:
doc("books.xml")/bookstore/book[price>30]/title
The following FLWOR expression will select exactly the same as the path
expression above:
for $x in doc("books.xml")/bookstore/book
where $x/price>30
return $x/title
The result will be:
<title lang="en">Learning XML</title>
With FLWOR you can sort the result:
for $x in doc("books.xml")/bookstore/book
where $x/price>30
order by $x/title
return $x/title
 The for clause selects all book elements under the bookstore element into a
variable called $x.
 The where clause selects only book elements with a price element with a value
greater than 30.
 The order by clause defines the sort order. Here the result is sorted by the title element.
 The return clause specifies what should be returned. Here it returns the title
elements.
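
The JDK does not ship an XQuery engine, so the following is only an analogue of the FLWOR expression written with XPath and plain Java, not XQuery itself. The class name FlworAnalogue and the method titlesOver are hypothetical, and the sketch returns the title text and not the title elements.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical analogue of: for ... where ... order by ... return
public class FlworAnalogue {

    static final String BOOKS =
          "<bookstore>"
        + "<book><title>Everyday Italian</title><price>30.00</price></book>"
        + "<book><title>Harry Potter</title><price>29.99</price></book>"
        + "<book><title>Learning XML</title><price>39.95</price></book>"
        + "</bookstore>";

    public static List<String> titlesOver(String xml, double minPrice) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();

        // for $x in /bookstore/book  where $x/price > minPrice
        NodeList books = (NodeList) xpath.evaluate(
                "/bookstore/book[price>" + minPrice + "]",
                new InputSource(new StringReader(xml)), XPathConstants.NODESET);

        // return $x/title (as text)
        List<String> titles = new ArrayList<>();
        for (int i = 0; i < books.getLength(); i++) {
            titles.add(xpath.evaluate("title", books.item(i)));
        }

        // order by $x/title
        Collections.sort(titles);
        return titles;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(titlesOver(BOOKS, 30));
        System.out.println(titlesOver(BOOKS, 29));
    }
}
```

Each clause of the FLWOR expression maps to one step of the method, which is a convenient way to remember what the clauses do.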
