XML Unit 2
Applications of XML
1. Cell phones - XML data is sent to some cell phones.
The data is formatted according to the specifications of the phone's
software designer to display text or images, or even to play sounds.
2. File converters - Many applications have been written to convert
existing documents into the XML standard.
An example is a PDF to XML converter.
3. VoiceXML - Converts XML documents into an audio format so that
you can listen to an XML document.
4. Microsoft Office also uses XML-based file formats (for example
.docx and .xlsx).
Difference between HTML and XML:
XML syntax
• The syntax rules are used to create a well-formed XML document.
CDATA:
1. CDATA means character data.
2. CDATA is text that will NOT be parsed by a parser. Tags inside
the text will NOT be treated as markup and entities will not be
expanded
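A minimal sketch of this behaviour, using Python's standard xml.etree.ElementTree module. The element name "snippet" and its content are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: "snippet" is an illustrative element name.
doc = "<snippet><![CDATA[<b>5 < 6 &amp; more</b>]]></snippet>"

root = ET.fromstring(doc)

# The CDATA text arrives verbatim: <b> is not treated as a child
# element and &amp; is not expanded to &.
cdata_text = root.text
child_count = len(root)

print(cdata_text)
print(child_count)
```

Without the CDATA wrapper the same text would have to be written with entity references for "<" and "&".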
1) XML Element: An XML element is everything from the element's
start tag to the element's end tag, inclusive, together with any
text data in between.
Example:
<employee>
<empno>16</empno>
<name>Goutham</name>
<salary>45000</salary>
</employee>
Here employee, empno, name and salary are elements.
An element can contain:
• Child elements
• attributes
• Text-data
• or a mix of all of the above...
2) XML Attributes:
• Attributes provide additional information about an element.
• XML attribute values must be quoted.
<city state="ap">     (state is an attribute)
The same data can be stored either as a child element or as an
attribute.
gender is an element:
<person>
<gender>female</gender>
<firstname>Anna</firstname>
<lastname>Smith</lastname>
</person>
gender is an attribute:
<person gender="female">
<firstname>Anna</firstname>
<lastname>Smith</lastname>
</person>
Entity reference: Some characters have a special meaning in XML.
<person>
<name>Ramesh</name>
<age>age is <18</age>
</person>
• In the example above, the less-than symbol has a special meaning.
• "<" opens a tag, so the parser reports an error at "<18".
• Use the entity reference "&lt;" to write a literal less-than sign:
<person>
<name>Ramesh</name>
<age>age is &lt;18</age>
</person>
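The following sketch shows the same idea in Python, assuming the standard library helpers xml.sax.saxutils.escape and xml.etree.ElementTree. It checks that the raw text is rejected and that the escaped form parses back to the original characters.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "age is <18"

# A bare "<" inside element content is not well-formed XML.
try:
    ET.fromstring("<age>" + raw + "</age>")
    raw_is_well_formed = True
except ET.ParseError:
    raw_is_well_formed = False

# escape() replaces &, < and > with their entity references.
escaped = escape(raw)

doc = "<person><name>Ramesh</name><age>" + escaped + "</age></person>"
root = ET.fromstring(doc)

# The parser turns the entity reference back into the real character.
age_text = root.find("age").text

print(raw_is_well_formed, escaped, age_text)
```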
User defined Entity ref:
Syntax:
&entity reference name;
XML tree:
• XML documents form a tree structure that starts at "the root"
and branches to "the leaves".
• An XML document must contain exactly one top-level element.
• That single element is called the root element; all other elements
are nested inside it.
<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
<to> Rakesh</to>
<from> Jani </from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
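As a rough illustration of the tree structure, the note document above can be loaded with Python's standard xml.etree.ElementTree module and walked from the root to its leaves. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# The note document from the text, passed as bytes so that the
# encoding declaration is honoured by the parser.
note_xml = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
<to> Rakesh</to>
<from> Jani </from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>"""

root = ET.fromstring(note_xml)

# The root element is the single top-level node of the tree.
root_tag = root.tag

# Its children are the branches; here each child is a leaf holding text.
child_tags = [child.tag for child in root]
body_text = root.find("body").text

print(root_tag, child_tags, body_text)
```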
1. Each XML file can carry a description of its own format (a DTD).
<!ELEMENT element-name (content-model)>
Example:
• <!ELEMENT employee (empno, empname, sal)>
• <!ELEMENT empno (#PCDATA)>
• <!ELEMENT empname (#PCDATA)>
• <!ELEMENT sal (#PCDATA)>
The Building Blocks of XML Documents :- All XML documents
(and HTML documents) are made up of the following building
blocks:
1. Elements
2. Attributes
3. Entities
4. PCDATA
5. CDATA
1. DTD ELEMENTS: XML elements are the building blocks of an XML
document. An element can act as a container for text, other elements,
attributes, media objects, or a mix of these.
A DTD element is declared with an ELEMENT declaration. When an
XML file is validated against a DTD, the parser first checks the root
element and then validates the child elements.
Syntax:- <!ELEMENT elementname (content)>
In the above syntax:
ELEMENT tells the parser that an element is about to be defined.
elementname is the name of the element (also called the generic
identifier) being defined.
content defines what content (if any) can go within the element.
Element Content Types:- The content of an element declaration in a
DTD falls into one of the following types:
i. Empty content
ii. Element content/Elements with Parsed Character Data
iii. Mixed content
iv. Any content
i. Empty Content: In this type of element declaration, the element
does not contain any content. Such elements are declared with the
keyword EMPTY.
Syntax:
<!ELEMENT element-name EMPTY>
Eg:
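ii. Element Content: The element contains one or more child elements,
listed in parentheses. A suffix after a child name controls how often
that child may occur.
a. Declaring Only One Occurrence of an Element
<!ELEMENT note (message)>
The example above declares that the child element "message" must occur once,
and only once inside the "note" element.
b. Declaring One or More Occurrences of an Element
<!ELEMENT note (message+)>
The + sign in the example above declares that the child element "message" must
occur one or more times inside the "note" element.
c. Declaring Zero or More Occurrences of an Element
<!ELEMENT note (message*)>
The * sign in the example above declares that the child element "message" can
occur zero or more times inside the "note" element.
d. Declaring Zero or One Occurrence of an Element
<!ELEMENT note (message?)>
The ? sign in the example above declares that the child element "message" can
occur zero or one time inside the "note" element.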
iii) Mixed Element Content:- A combination of #PCDATA and child
elements. In a mixed content model, text can appear by itself or
interspersed between child elements. #PCDATA must come first in the
list, and the order and number of the child elements cannot be
restricted.
Syntax:- <!ELEMENT elementname (#PCDATA|child1|child2)*>
Value            Explanation
====================================
value            The default value of the attribute
#REQUIRED        The attribute is required
#IMPLIED         The attribute is not required
#FIXED value     The attribute value is fixed
#REQUIRED:-
Use the #REQUIRED keyword if you don't have an option for a default value, but
still want to force the attribute to be present.
#IMPLIED
Syntax
<!ATTLIST element-name attribute-name attribute-type #IMPLIED>
Example
DTD: <!ATTLIST contact fax CDATA #IMPLIED>
Valid XML: <contact fax="555-667788" />
Valid XML: <contact />
Use the #IMPLIED keyword if you don't want to force the author to
include an attribute, and you don't have an option for a default value.
#FIXED
Syntax
<!ATTLIST element-name attribute-name attribute-type #FIXED "value">
Example
Use the #FIXED keyword when you want an attribute to have a fixed
value without allowing the author to change it. If an author includes
another value, the XML parser will return an error.
• Attribute Types
• When declaring attributes, you can specify how the processor
should handle the data that appears in the value.
• We can categorize attribute types in three main categories −
• String type
• Tokenized types
• Enumerated types
2. Enumerated Attribute Values:- Used to specify a list of permitted
values. The attribute may take any one value from the specified list.
Syntax:-
Use enumerated attribute values when you want the attribute value to be one of a
fixed set of legal values.
EG:-att1.xml
2. ID Attribute Values:- The value must be unique within the document. It must
start with a letter (A-Z or a-z) or an underscore (_), not with a digit.
Eg:-att2.xml
3.IDREF Attribute Values:-
example-1
<!DOCTYPE bookstore [
TYPES OF DTDs:
1. Internal DTD
2. External DTD
Internal DTD:
• If the DTD is declared inside the XML file, it must be
wrapped in a DOCTYPE definition with the following
syntax:
<!DOCTYPE root-element [element-declarations]>
• Internal DTDs are specific to one XML document.
• Internal DTDs are not reusable.
Example:
<!DOCTYPE employee [
<!ELEMENT employee (empno,empname,sal)>
<!ELEMENT empno (#PCDATA)>
<!ELEMENT empname (#PCDATA)>
<!ELEMENT sal (#PCDATA)>
]>
<employee>
<empno>1216</empno>
<empname>ram</empname>
<sal>34000</sal>
</employee>
External DTD Declaration: If the DTD is declared in an
external file, the XML document refers to it in a DOCTYPE
definition with one of the following forms:
<!DOCTYPE root-element SYSTEM "URL-of-DTD-file">
<!DOCTYPE root-element PUBLIC "public-identifier" "URL-of-DTD-file">
• External DTDs are of two types:
a. Private DTDs
b. Public DTDs
a. Private DTDs:
<!DOCTYPE root-element SYSTEM "URL-of-DTD-file">
Syntax:
– General entity references start with & and end with ;
– The entity reference is replaced by its true value when parsed.
– The characters < > & " ' require entity references to avoid
conflicts with the XML parser.
Types of Entity
1. Internal Entity: declared within the DTD.
Syntax:
<!ENTITY entity-name "entity-value">
DTD example:
<!ENTITY writer "Donald Duck.">
<!ENTITY copyright "Copyright W3Schools.">
XML example:
<author>&writer;&copyright;</author>
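A small sketch of internal entity expansion, assuming Python's standard xml.etree.ElementTree module. The writer and copyright entities are declared in an internal DTD subset so that the document is self-contained.

```python
import xml.etree.ElementTree as ET

# The two internal entities from the text, declared inside an
# internal DTD subset.
doc = """<!DOCTYPE author [
<!ENTITY writer "Donald Duck.">
<!ENTITY copyright "Copyright W3Schools.">
]>
<author>&writer;&copyright;</author>"""

root = ET.fromstring(doc)

# Each entity reference is replaced by its declared value.
author_text = root.text

print(author_text)
```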
2. External Entity: defined in a separate file and referenced from
the XML file.
DTD Example:
<!ENTITY writer SYSTEM "http://www.w3schools.com/entities.dtd">
or <!ENTITY writer SYSTEM "d:\test.txt">
<!ENTITY copyright SYSTEM "http://www.w3schools.com/entities.dtd">
XML example:
<author>&writer; &copyright;</author>
Disadvantages of DTD
1. A DTD does not use XML syntax; it requires a separate syntax.
2. Namespaces are not supported.
3. No data types.
4. No modularity and no reuse of elements.
5. No inheritance for elements or attributes.
6. DTD is an older technique.
Namespace Declaration:- A namespace is a set of unique names. It is
a mechanism by which element and attribute names can be assigned to
a group. A namespace is identified by a URI (Uniform Resource
Identifier).
<t:wt>
<t:unit-1>html</t:unit-1>
<t:unit-2>CSS</t:unit-2>
</t:wt>
<s:wt>
<s:unit-1>introduction to internet</s:unit-1>
<s:unit-2>html</s:unit-2>
</s:wt>
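The sketch below shows how a namespace-aware parser tells the two wt elements apart, using Python's standard xml.etree.ElementTree module. The wrapper element "courses" and the two example.com URIs are assumptions added so that the t and s prefixes are actually declared; they do not come from the original notes.

```python
import xml.etree.ElementTree as ET

# Hypothetical URIs and wrapper element, added for illustration only.
doc = """<courses xmlns:t="http://example.com/theory"
         xmlns:s="http://example.com/syllabus">
<t:wt><t:unit-1>html</t:unit-1><t:unit-2>CSS</t:unit-2></t:wt>
<s:wt><s:unit-1>introduction to internet</s:unit-1><s:unit-2>html</s:unit-2></s:wt>
</courses>"""

root = ET.fromstring(doc)

# Map each prefix to its URI so that searches can use the short form.
ns = {"t": "http://example.com/theory", "s": "http://example.com/syllabus"}

t_wt = root.find("t:wt", ns)
s_wt = root.find("s:wt", ns)

# Internally the parser stores the full name as {URI}local-name,
# so the two wt elements never collide.
t_tag = t_wt.tag
s_tag = s_wt.tag
t_unit1 = t_wt.find("t:unit-1", ns).text
s_unit1 = s_wt.find("s:unit-1", ns).text

print(t_tag, s_tag, t_unit1, s_unit1)
```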
XSD- XML Schema Definition:
1. XML Schema is an XML-based alternative to DTD.
XSDs can be extensible for future additions. XSD is richer and more
powerful than DTD.
What is an XML Schema?
The purpose of an XML Schema is to define the legal building
blocks of an XML document, just like a DTD.
An XML Schema:
1. defines elements that can appear in a document
2. defines attributes that can appear in a document
3. defines which elements are child elements
4. defines the order of child elements
5. defines the number of child elements
6. defines whether an element is empty or can include text
7. defines data types for elements and attributes
8. defines default and fixed values for elements and attributes
XSD Elements: There are two kinds of elements in XSD.
i. Simple Element (ii) Complex Element
======================================
i. Simple Element: A simple element is an XML element that can
contain only text data. It cannot contain any other elements or
attributes.
Syntax:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="element name" type="xs:datatype"/>
</xs:schema>
Example:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="EmpNo" type="xs:int"/>
<xs:element name="EmpName" type="xs:string"/>
</xs:schema>
Writing Simple XML Schema
Step 1: Write a Simple schema file to define the structure of XML file
and save it as .XSD extension .
Step 2: Write an XML Document for the Defined Schema .
Step 3: Execute the XML in Browser or XML Editor .
XSD - The <schema> Element: The <schema> element is the root
element of every XML Schema.
Syntax:- <xs:schema>
...
...
</xs:schema>
• The <schema> element may contain some attributes.
• Data Types in XSD:
• Primitive types (19): string, boolean, decimal, float, double,
duration, dateTime, time, date, gYearMonth, gYear, gMonthDay, gDay,
gMonth, hexBinary, base64Binary, anyURI, QName, NOTATION.
• Built-in derived data types: normalizedString, token, language,
NMTOKEN, NMTOKENS, Name, NCName, ID, IDREF, IDREFS, ENTITY,
ENTITIES, integer, nonPositiveInteger, negativeInteger, long, int,
short, byte, nonNegativeInteger, unsignedLong, unsignedInt,
unsignedShort, unsignedByte, positiveInteger.
example
<student>
<sname>sunny1</sname>
<rollno>5n5</rollno>
<marks>9.9</marks>
<mobileno>97045326</mobileno>
</student>
***********************************
<xs:element type="xs:string" name="sname"/>
<xs:element type="xs:string" name="rollno"/>
<xs:element type="xs:float" name="marks"/>
<xs:element type="xs:int" name="mobileno"/>
Syntax
<xs:attribute name="xxx" type="yyy"/>
---------------------------------------------------------------------
Fully qualified names (conceptual notation, not valid XML):
<http://preminfo.com:employee>
<http://preminfo.com:empno>1216</http://preminfo.com:empno>
<http://preminfo.com:empname>Ram</http://preminfo.com:empname>
<http://preminfo.com:salary>45000</http://preminfo.com:salary>
</http://preminfo.com:employee>
Writing the fully qualified name on every element is a burden.
To overcome this problem, we use xmlns.
<employee xmlns="http://preminfo.com">
<empno>1216</empno>
<empname>Ram</empname>
<salary>45000</salary>
</employee>
---------------------------------------------------------------
• It is also possible to create a prefix with xmlns:
<e:employee xmlns:e="http://preminfo.com">
<e:empno>1216</e:empno>
<e:empname>Ram</e:empname>
<e:salary>45000</e:salary>
</e:employee>
• In an XSD, only one targetNamespace declaration is possible.
• In XML, any number of xmlns declarations are possible.
• The XSD vocabulary itself (schema, complexType, sequence, etc.)
belongs to the namespace http://www.w3.org/2001/XMLSchema.
• It is also possible to declare xmlns in an XSD.
<xs:schema>
<xs:element name="element name">
<xs:complexType>
<xs:sequence>
<xs:element name="child1" type="xs:datatype"/>
<xs:element name="child2" type="xs:datatype"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
Example XSD:
<xs:schema>
<xs:element name="employee">
<xs:complexType>
<xs:sequence>
<xs:element name="EmpNo" type="xs:int"/>
<xs:element name="EmpName" type="xs:string"/>
<xs:element name="EmpSalary" type="xs:decimal"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
XML for above XSD:
<employee>
<EmpNo>1216</EmpNo>
<EmpName>Sam</EmpName>
<EmpSalary>35000</EmpSalary>
</employee>
2. An element can have a type attribute that refers to the name of
the complex type to use:
<xs:element name="employee" type="personinfo"/>
<xs:complexType name="personinfo">
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
• If you use the method described above, several elements can refer
to the same complex type, like this:
<xs:element name="employee" type="personinfo"/>
<xs:element name="student" type="personinfo"/>
<xs:element name="member" type="personinfo"/>
<xs:complexType name="personinfo">
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
There are four kinds of complex elements:
1. empty elements
2. elements that contain only other elements
3. elements that contain only text
4. elements that contain both other elements and text
Note: Each of these elements may contain attributes as well!
1. Empty elements:-
<product prodid="1345"/>
<xs:element name="product">
<xs:complexType>
<xs:attribute name="prodid" type="xs:positiveInteger"/>
</xs:complexType>
</xs:element>
2. Complex Types Containing Elements Only:-
XML
<person>
<firstname>John</firstname>
<lastname>Smith</lastname>
</person>
XML schema
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
Complex Text-Only Elements:- The element contains simple content
(text and attributes) only. The content is defined by extending or
restricting a base simple type.
<xs:element name="somename">
<xs:complexType>
<xs:simpleContent>
<xs:extension base="basetype">
....
....
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
OR
<xs:element name="somename">
<xs:complexType>
<xs:simpleContent>
<xs:restriction base="basetype">
....
....
</xs:restriction>
</xs:simpleContent>
</xs:complexType>
</xs:element>
Example - XML
<carcost Cname="swift">600000</carcost>
XML schema
<xs:element name="carcost">
<xs:complexType>
<xs:simpleContent>
<xs:extension base="xs:integer">
<xs:attribute name="Cname" type="xs:string" />
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
Complex Types with Mixed Content:- An XML element that
contains both text and other elements:
XML
<address>
To,<name>Sree Ram</name>
Flat-no-207<aptname>S.S.Heavens</aptname>
<city>hyderabad</city>
</address>
XML schema
<xs:element name="address">
<xs:complexType mixed="true">
<xs:sequence>
<xs:element name="name" type="xs:string"/>
<xs:element name="aptname" type="xs:string"/>
<xs:element name="city" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
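To see how mixed content looks to a program, the sketch below parses the address element with Python's standard xml.etree.ElementTree module. This module is an assumption of the example, not something the notes prescribe. It stores text that comes before the first child in .text and text that follows a child in that child's .tail. The line breaks of the original are dropped so the values are easy to read.

```python
import xml.etree.ElementTree as ET

# The address example from the text, written on one line so that no
# extra whitespace ends up in the text nodes.
doc = ("<address>To,<name>Sree Ram</name>"
       "Flat-no-207<aptname>S.S.Heavens</aptname>"
       "<city>hyderabad</city></address>")

root = ET.fromstring(doc)

# Text before the first child element.
leading_text = root.text

# Text that follows the <name> element, before <aptname> starts.
after_name = root.find("name").tail

# All text in document order, from the parent and its children alike.
all_text = "".join(root.itertext())

print(leading_text, after_name, all_text)
```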
XSD Indicators: Indicators control how elements are organized in an
XML document.
There are seven indicators, falling into three broad
categories.
Order Indicators:
all − Child elements can occur in any order.
choice − Only one of the child elements can occur.
sequence − Child elements must occur in the specified order.
Occurrence Indicators:
maxOccurs − The maximum number of times a child element can occur.
minOccurs − The minimum number of times a child element must occur.
Group Indicators:
group − Defines a related set of elements.
attributeGroup − Defines a related set of attributes.
Order indicators: <xs:all> specifies that the child elements described
in the schema can appear in the XML document in any order.
Occurrence indicators: Occurrence indicators define how often an
element can occur.
• Note: For all "Order" and "Group" indicators (any, all, choice,
sequence, group name and group reference), the default value of both
maxOccurs and minOccurs is 1.
• To allow an element to occur an unlimited number of times, use
maxOccurs="unbounded".
Group indicators:- <xs:group> groups a related set of elements and
<xs:attributeGroup> groups a related set of attributes. These groups
can then be referenced in the definition of complex types.
XML DOM: The Document Object Model (DOM) is a W3C standard.
The DOM presents an XML document as a tree-structure.It defines a
standard for accessing documents like HTML and XML.
Definition: The Document Object Model (DOM) is an application
programming interface (API) for HTML and XML documents.
• It defines the logical structure of documents and the way a
document is accessed and manipulated.
• DOM defines the objects and properties and methods (interface) to
access all XML elements.
• It is separated into 3 different parts / levels −
• Core DOM − standard model for any structured document
• HTML DOM − standard model for HTML documents
• XML DOM − standard model for XML documents
XML DOM is Defined For :
1. Loading the XML Files
2. Accessing the elements of XML Documents .
3. Deleting the Elements of XML Documents .
4. Changing the Elements of XML Documents.
Loading an XML File
• Step 1: Create an empty xmlDocument object (ActiveXObject is
available only in Internet Explorer).
Syntax : xmlDocument = new ActiveXObject("Microsoft.XMLDOM");
• Step 2: Set xmlDocument.async = false; so that the script waits
until the document is fully loaded before continuing.
• Step 3: Specify the name of the XML file to load.
Syntax : xmlDocument.load("xmldocument");
• It is possible to write both internal and external functions to load
an XML file using the DOM object.
Properties and Methods of XML DOM
• Properties are meant for accessing the XML elements and Methods
are used to perform some actions on the XML Elements .
XML DOM Nodes: In the DOM, everything in an XML document is a
node.
The entire document is a document node
Every XML element is an element node
The text in the XML elements are text nodes
Every attribute is an attribute node
Comments are comment nodes
XML DOM properties:-
1.nodeName ---> The name of the node
2.nodeValue ---> The value of the node
3.parentNode ---> The parent node of the node
4.childNodes ---> The child nodes of the node
5.attributes ---> The attribute nodes of the node
6.documentElement ---> The root element of the document
7.firstChild ---> The first child of the node
8.nextSibling ---> The node immediately following the node
9.nodeType ---> The type of the node:
1-element, 2-attribute, 3-text, 8-comment, 9-document.
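The properties listed above can be tried out with Python's standard xml.dom.minidom module, which follows the W3C DOM naming. This is a minimal sketch; the sample document is made up for illustration.

```python
from xml.dom import minidom

# Illustrative sample with an attribute, a comment and two children.
doc = minidom.parseString(
    '<employee id="16"><!-- sample --><name>Goutham</name>'
    '<salary>45000</salary></employee>'
)

root = doc.documentElement                       # the root element
name = root.getElementsByTagName("name")[0]
text = name.firstChild                           # text node inside <name>

doc_type = doc.nodeType                          # 9 = document
root_type = root.nodeType                        # 1 = element
attr_type = root.getAttributeNode("id").nodeType # 2 = attribute
text_type = text.nodeType                        # 3 = text
comment_type = root.firstChild.nodeType          # 8 = comment

root_name = root.nodeName
name_value = text.nodeValue
parent_name = name.parentNode.nodeName
sibling_name = name.nextSibling.nodeName
child_count = len(root.childNodes)
id_value = root.attributes["id"].value

print(root_name, name_value, parent_name, sibling_name, child_count, id_value)
```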
XML DOM Methods:-
1.getElementsByTagName(name) ---> Get all elements with the specified
tag name.
2.appendChild(node) ---> Insert a child node.
3.createElement("newNodeName") ---> Create an element node.
4.createTextNode("valuefornode") ---> Create a text node.
5.replaceChild(newnode, oldnode) ---> Replace a child node.
6.removeChild(node) ---> Remove a child node.
7.replaceData(offset, length, replacement) ---> Replace part of the
data in a text node.
8.getAttribute("attributename") ---> Get the value of an attribute.
9.setAttribute("attribute", "value") ---> Set the value of an attribute.
10.removeAttribute("attributename") ---> Remove an attribute.
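A short sketch exercising these methods with Python's standard xml.dom.minidom module, which uses the same method names. The employee document and the values used are illustrative only.

```python
from xml.dom import minidom

doc = minidom.parseString(
    "<employee><empno>16</empno><name>Goutham</name></employee>"
)
root = doc.documentElement

# createElement + createTextNode + appendChild: add <salary>45000</salary>.
salary = doc.createElement("salary")
salary.appendChild(doc.createTextNode("45000"))
root.appendChild(salary)

# setAttribute / getAttribute work on element nodes.
root.setAttribute("dept", "sales")
dept = root.getAttribute("dept")

# removeChild: drop the <empno> element.
empno = root.getElementsByTagName("empno")[0]
root.removeChild(empno)

# replaceData works on text nodes: "45000" becomes "50000".
salary.firstChild.replaceData(0, 2, "50")

# replaceChild: swap <name> for a new <empname> element.
old_name = root.getElementsByTagName("name")[0]
new_name = doc.createElement("empname")
new_name.appendChild(doc.createTextNode("Ram"))
root.replaceChild(new_name, old_name)

# removeAttribute: remove the attribute added earlier.
root.removeAttribute("dept")

result = root.toxml()
print(result)
```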
XML Parsers :- To read, create, update and manipulate an XML
document, you need an XML parser.
• An XML parser is a software library or package that provides
interfaces for client applications to work with an XML
document.
• The parser reads the XML and makes its content available to
programs.
• The parser checks that the document is well-formed; a validating
parser also checks it against a DTD or schema.
• XML processors are used to parse the given XML document.
There are two ways to parse an XML document:
1.Tree Based Parsing
2.Event Based Parsing
• There are two types of processors:
1.DOM Parser (Document Object Model)
2.SAX Parser (Simple API for XML)
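The difference between the two approaches can be sketched with Python's standard library, assuming xml.sax for event-based parsing and xml.dom.minidom for tree-based parsing. The class name TagCollector and the sample document are made up for illustration.

```python
import xml.sax
from xml.dom import minidom

XML = b"<employee><empno>16</empno><name>Goutham</name></employee>"


class TagCollector(xml.sax.ContentHandler):
    """Event-based: the parser calls back as each start tag is read."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def startElement(self, name, attrs):
        self.tags.append(name)


# SAX: nothing is kept in memory except what the handler stores.
handler = TagCollector()
xml.sax.parseString(XML, handler)
sax_tags = handler.tags

# DOM: the whole document is loaded into a tree that can be
# navigated in any order afterwards.
dom = minidom.parseString(XML)
dom_tags = [node.nodeName for node in dom.documentElement.childNodes]

print(sax_tags, dom_tags)
```

The SAX handler sees the root element as just another event, while the DOM code starts from the root and asks for its children.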
1.DOM Parser (Document Object Model):-A DOM document is an
object which contains all the information of an XML document. It is
composed like a tree structure. The DOM Parser implements a DOM
API. This API is very simple to use.
Features of DOM Parser:-
A DOM Parser creates an internal structure in memory which is a
DOM document object and the client applications get information
of the original XML document by invoking methods on this
document object.
DOM Parser has a tree based structure.
Advantages
1) It supports both read and write operations and the API is very
simple to use.
2) It is preferred when random access to widely separated parts of a
document is required.
Disadvantages
1) It is memory inefficient (it consumes more memory because the
whole XML document needs to be loaded into memory).
2) It is comparatively slower than other parsers.