Web Notes by Nandan U3
Unit -3
Introduction to XML
Introduction
XML stands for eXtensible Markup Language.
XML is a markup language, like HTML.
In XML you define your own tags for storing data.
XML only stores (describes) data, whether temporary or permanent; it does nothing more.
XML was designed for describing well-formed data, and it has strict rules
that every XML document must follow.
Data from any database can easily be transformed into XML. It is a reasonable storage
format for certain types of data and is easily transformed on the server side with XSL, etc.
Data can be inserted or updated in the database tables corresponding to the objects
using XML files.
The following are some key differences between XML and HTML.
Key point: Stands for
XML: eXtensible Markup Language.
HTML: HyperText Markup Language.
Key point: Derived from
XML: Derived from SGML (Standard Generalized Markup Language).
HTML: Also derived from SGML.
Key point: Purpose
XML: Designed to hold data. Used to transport data between an application and a database.
HTML: Designed to specify how data should be displayed on a web page.
Key point: Rules
XML: Follows strict rules. Processing terminates whenever a rule is broken.
HTML: Does not follow strict rules. Every browser tries to display the data as best it can.
Creating custom tags: XML makes it easy to create your own tags to describe data.
Exchanging data: XML data can be shared easily between different applications as well as
databases.
An XML structure starts from a parent root element at the top. Every XML document has
exactly one root element.
XML describes a tree structure of data. The tree has one root, child elements,
branches, attributes, and values.
Following that tree structure, the XML document below describes the information
for one employee.
<employee>
  <emp_info id="1">
    <name>
      <first_name>Opal</first_name>
      <middle_name>Venue</middle_name>
      <last_name>Kole</last_name>
    </name>
    <contact_info>
      <company_info>
        <comp_name>Odoo (formerly OpenERP)</comp_name>
        <comp_location>
          <street>Tower-1, Infocity</street>
          <city>GH</city>
          <phone>000-478-1414</phone>
        </comp_location>
        <designation>Junior Engineer</designation>
      </company_info>
      <phone>000-987-4745</phone>
      <email>[email protected]</email>
    </contact_info>
  </emp_info>
</employee>
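One way to see the tree in the document above is to load it with a parser and walk it. The following is a minimal sketch in Python using the standard library module xml.etree.ElementTree. Python is not used elsewhere in these notes, the document is shortened, and the helper name walk is only illustrative.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the employee document from the notes.
DOC = """
<employee>
  <emp_info id="1">
    <name>
      <first_name>Opal</first_name>
      <last_name>Kole</last_name>
    </name>
    <contact_info>
      <phone>000-987-4745</phone>
    </contact_info>
  </emp_info>
</employee>
"""


def walk(element, depth=0):
    """Return (depth, tag) pairs for element and all of its descendants."""
    rows = [(depth, element.tag)]
    for child in element:
        rows.extend(walk(child, depth + 1))
    return rows


root = ET.fromstring(DOC)  # the single root element: <employee>
tree_rows = walk(root)

# Print the tree, indenting each tag by its depth below the root.
for depth, tag in tree_rows:
    print("  " * depth + tag)
```

The root sits at depth 0 and every child element is one level deeper than its parent, which is the tree structure described above.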
Entity Support
In XML, as in HTML, some characters have a special meaning.
If you use a < sign (or a bare & sign) inside an XML element, it generates an error because the
parser assumes it starts a new element (or an entity reference). The > sign is normally escaped as well.
To avoid this error, use an entity reference in place of the special character.
The XML specification defines five predefined entities that represent special characters:
Entity reference    Character    Meaning
&lt;                <            less than
&gt;                >            greater than
&amp;               &            ampersand
&quot;              "            double quotation mark
&apos;              '            apostrophe (single quotation mark)
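As a rough illustration of entity references, the sketch below uses Python's standard xml.sax.saxutils helpers to escape text before placing it in an element. The sample string and the note element are made up for this example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, unescape

raw = "salary < 5000 & bonus > 100"  # hypothetical text with special characters

# escape() replaces &, < and > with &amp;, &lt; and &gt;.
safe = escape(raw)
print(safe)

# The escaped text can sit inside an element without confusing the parser,
# and the parser turns the entity references back into characters.
note = ET.fromstring("<note>" + safe + "</note>")
print(note.text)

# unescape() reverses the substitution.
restored = unescape(safe)
```

Placing raw directly inside the element would fail to parse, because the parser would treat the < as the start of a tag.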
If an element has no content (it is empty), you can write it in either of the following
two valid ways.
<element />
<element></element>
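A quick way to check that the two forms are equivalent is to parse both and compare the results. This is a small sketch with Python's xml.etree.ElementTree, not part of the original notes.

```python
import xml.etree.ElementTree as ET

short_form = ET.fromstring("<element />")
long_form = ET.fromstring("<element></element>")

# Both forms produce an element with the same tag, no text and no children.
same = (
    short_form.tag == long_form.tag
    and short_form.text is None
    and long_form.text is None
    and len(short_form) == 0
    and len(long_form) == 0
)
print(same)
```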
XML Attributes
An XML element can have attributes that identify it.
<emp_info id="1"> <!-- id is an attribute -->
  <name>
    <first_name>Opal</first_name>
    <last_name>Kole</last_name>
  </name>
</emp_info>
The XML standard allows an element to have multiple attributes, but each attribute name
must be unique within that element.
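The sketch below shows how a program might read attributes, and that repeating an attribute name on one element is rejected by the parser. It uses Python's xml.etree.ElementTree; the dept attribute is an assumption added only for this example.

```python
import xml.etree.ElementTree as ET

# "dept" is a hypothetical second attribute, not part of the notes' example.
emp = ET.fromstring('<emp_info id="1" dept="R&amp;D"><name>Opal</name></emp_info>')

emp_id = emp.get("id")       # attribute values are always strings
dept = emp.get("dept")       # entity references in values are decoded
missing = emp.get("grade")   # None, because the attribute is not present

# Using the same attribute name twice on one element is not well-formed.
try:
    ET.fromstring('<emp_info id="1" id="2" />')
    duplicate_ok = True
except ET.ParseError:
    duplicate_ok = False
```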
Namespaces in XML
The primary purpose of namespaces in XML is to distinguish between duplicate
element and attribute names.
The same tag name may have different meanings in different applications, which
creates confusion when documents are exchanged.
A prefix is bound to a namespace URI by placing an xmlns:prefix attribute on the prefixed element.
<r:student xmlns:r="http://www.w3c.org/xml/">
</r:student>
Example:
<student>
  <result>
    <name>Raju</name>
    <cgpa>8.4</cgpa>
  </result>
  <cv>
    <name>Raju</name>
    <cgpa>8.4</cgpa>
  </cv>
</student>
In the XML document above, both <result> and <cv> contain the same <name> and <cgpa> elements, so an
application cannot tell from the tag name alone which meaning is intended.
That is why XML namespaces are used: they map an element prefix to a URI.
A namespace URI does not have to point to information about the namespace; it only serves to
identify elements uniquely.
We specify a different prefix for each kind of element. Every prefix must be declared with an
xmlns attribute, as follows:
<s:student xmlns:s="http://www.w3c.org/some_url1"
           xmlns:r="http://www.w3c.org/some_url3"
           xmlns:res="http://www.w3c.org/some_url2">
  <r:result>
    <r:name>Raju</r:name>
    <r:cgpa>8.4</r:cgpa>
  </r:result>
  <res:cv>
    <res:name>Raju</res:name>
    <res:cgpa>8.4</res:cgpa>
  </res:cv>
</s:student>
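To see how a parser tells the two cgpa elements apart, here is a minimal sketch with Python's xml.etree.ElementTree. The example.com URIs are placeholders chosen for this sketch; as noted above, a namespace URI only needs to be unique.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs (assumptions for this sketch).
RESULT_NS = "http://example.com/result"
CV_NS = "http://example.com/cv"

DOC = f"""
<student xmlns:r="{RESULT_NS}" xmlns:res="{CV_NS}">
  <r:result><r:cgpa>8.4</r:cgpa></r:result>
  <res:cv><res:cgpa>8.4</res:cgpa></res:cv>
</student>
"""

root = ET.fromstring(DOC)

# The prefixes used in the search paths are mapped to URIs here;
# they do not have to match the prefixes written in the document.
ns = {"r": RESULT_NS, "res": CV_NS}
result_cgpa = root.find("r:result/r:cgpa", ns)
cv_cgpa = root.find("res:cv/res:cgpa", ns)

# ElementTree stores each tag as {namespace-uri}local-name.
print(result_cgpa.tag)
print(cv_cgpa.tag)
```

The two elements share the local name cgpa but carry different namespace URIs, so they are different elements as far as the parser is concerned.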
DTD Introduction
A DTD (Document Type Definition) is a type of document schema that defines the
structure of XML documents.
A DTD provides a framework for validating XML documents. You can create a DTD file
that is shared between different applications.
In XML you can define tags without declaring which tags are legal. But if you specify
a DTD, the structure of the XML document must conform to its rules.
A DTD by itself does not identify the root element. You name the root
element yourself in the document's <!DOCTYPE> declaration.
In short, a DTD contains a number of rules that the XML document must follow.
It specifies the tags and attributes that can be used to create the XML document.
Well-formed XML:
A well-formed file follows the general XML rules: every opening tag must be closed,
tags must be properly nested, an empty tag must end with '/>', attribute values must be
enclosed in either single or double quotes, etc.
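The rules above can be checked mechanically: a parser rejects any document that breaks them. The sketch below is one possible check using Python's xml.etree.ElementTree; the function name is_well_formed and the sample strings are assumptions for this example. It tests well-formedness only and does not validate against a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical samples, one per rule listed above.
samples = {
    "closed and nested": "<a><b>ok</b></a>",
    "missing end tag": "<a><b>ok</a>",
    "badly nested": "<a><b>ok</a></b>",
    "unquoted attribute": "<a id=1></a>",
    "two roots": "<a/><b/>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(label, ok)
```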
Valid XML
A valid XML file is well-formed and also conforms to a specific structure: it has a DTD that
specifies the tags that may be used and the attributes and content those tags may contain.
DTD declaration
In the DTD declarations section we can define rules for elements, attributes, entities, and
notations. A valid XML document (one that includes a DTD) follows the rules the DTD specifies.
Element declaration in DTD: Specifies the name of a tag that is used to build the XML document.
Every (general) XML element is declared in the following way:
<!ELEMENT element_name (inside_element)>
element_name specifies the general identifier, and inside_element specifies the content
allowed inside the element.
An element declared with the ANY keyword may contain any combination of parsable
data.
Empty Element
<!ELEMENT element_name EMPTY> <!-- Syntax -->
The EMPTY keyword specifies an empty tag: the element has no content.
The #PCDATA (parsed character data) keyword specifies that the element contains only character data, which the parser parses.
DTD attribute declaration: If an element has attributes, you have to declare the attribute
names in the DTD.
<!ATTLIST element_name attr_name attr_token_type attr_declaration>
Description
element_name specifies the element name.
attr_name specifies the name of the element's attribute.
attr_token_type specifies the type of the attribute value, such as CDATA for a character string.
attr_declaration specifies the default behavior of the attribute. It is one of the following.
#REQUIRED: the attribute must have a value.
Syntax:
<!ATTLIST element_name attribute_name CDATA #REQUIRED>
EX:
<!ATTLIST employee id CDATA #REQUIRED>
#IMPLIED: the attribute is optional; it does not have to be given a value.
default_value: specifies the value that is used when the attribute is omitted.
<!ATTLIST email domain CDATA "personal">
#FIXED default_value: the attribute value is fixed. If the attribute appears in the element it must
have exactly this value, and the same value is used as the default when it is omitted.
3. ENTITY Declaration
XML lets you define your own entities. You declare an entity in the
DTD (Document Type Definition) section of the document. Once an entity is declared, you can
use it in your XML document by writing its name between & and ; (for example, &ph; for the declaration below).
<!ENTITY ph "000-478-1414">
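The sketch below places the ph entity in an internal DTD and shows the parser replacing &ph; with the declared text. It uses Python's xml.etree.ElementTree; the surrounding contact_info document is an assumption built for this example.

```python
import xml.etree.ElementTree as ET

# The entity is declared in the internal DTD subset and referenced as &ph;.
DOC = """<?xml version="1.0"?>
<!DOCTYPE contact_info [
  <!ENTITY ph "000-478-1414">
]>
<contact_info>
  <phone>&ph;</phone>
</contact_info>
"""

root = ET.fromstring(DOC)
phone = root.find("phone").text  # the parser has already expanded &ph;
print(phone)
```

Changing the value in the single ENTITY declaration changes every place where &ph; is used.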
4. Notation declaration
External DTD
An external DTD is shared between multiple XML documents. Any change made in the
DTD file takes effect in all the XML documents that use it.
Example:
Save file name : external_dtd.dtd
…………………….
2. Stands For
DTD stands for Document Type Definition. XSD stands for XML Schema Definition.
5. Simplicity
DTD is harder to use than XSD. XSD is simpler than DTD.
Schemas Fundamentals
An XML schema describes the structure of an XML instance document by defining
what each element must or may contain.
Schema authors can define their own types or use the built-in types.
The following is a high-level overview of Schema types.
The document element of an XML schema is xs:schema. It takes the attribute xmlns:xs with
the value http://www.w3.org/2001/XMLSchema, indicating that the document
follows the rules of XML Schema.
Validating an XML Instance Document
The code sample below shows a valid XML instance of the above XML schema.
/xmlinstance.xml
<?xml version="1.0"?>
<Author xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:noNamespaceSchemaLocation="Author.xsd">
  <FirstName>Mark</FirstName>
  <LastName>Twain</LastName>
</Author>
The xmlns:xsi attribute of the document element indicates that this XML document is an
instance of an XML schema. The document is tied to a specific XML schema with
the xsi:noNamespaceSchemaLocation attribute.
simpleType
A simpleType describes a text-only element. Such an element cannot have attributes or child
elements; it holds only a text value, which the type can restrict.
complexType
A complexType can hold multiple attributes and elements. It can contain
additional sub-elements and can be left empty.
Password.xsd
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="Password">
    <xs:restriction base="xs:string">
      <xs:minLength value="6"/>
      <xs:maxLength value="12"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:element name="User">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="PW" type="Password"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
Password.xml
<?xml version="1.0"?>
<User xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="Password.xsd">
  <PW>MyPass</PW>
</User>
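Python's standard library does not include an XML Schema validator, so the sketch below only mirrors the minLength and maxLength restriction of the Password type by hand. It is an illustration of what the restriction means, not a substitute for schema validation; the function name pw_length_ok is an assumption.

```python
import xml.etree.ElementTree as ET

# Limits copied from the Password simpleType (minLength 6, maxLength 12).
MIN_LEN, MAX_LEN = 6, 12


def pw_length_ok(xml_text):
    """Return True if the <PW> text length is within the allowed range."""
    pw = ET.fromstring(xml_text).findtext("PW", default="")
    return MIN_LEN <= len(pw) <= MAX_LEN


accepted = pw_length_ok("<User><PW>MyPass</PW></User>")            # 6 characters
too_short = pw_length_ok("<User><PW>abc</PW></User>")              # 3 characters
too_long = pw_length_ok("<User><PW>ThisIsTooLong1</PW></User>")    # 14 characters
```

MyPass has exactly six characters, so it sits on the lower bound of the restriction and is accepted.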
<colleges>
  <college>
    <name>SJMP</name>
    <url>http://www.sjmpbirur.in</url>
  </college>
</colleges>
college.css
colleges {
margin:10px;
background-color:#ccff00;
font-family:verdana,helvetica,sans-serif;
}
name {
display:block;
font-weight:bold;
}
url {
display:block;
color:#636363;
font-size:small;
font-style:italic;
}