Unit-2 XML
INTRODUCTION TO XML
S.B.S. ARTS, COMMERCE AND SCIENCE COLLEGE FOR WOMEN BCA PROGRAMME Page 1
DECLARING ELEMENTS
• The element declarations of a DTD have a form related to the rules of a context-free grammar
• A DTD describes the syntactic structure of a particular set of documents
• Each element declaration in a DTD specifies the structure of one category of elements
• A document described by a DTD can be viewed as a tree in which each element is a node, either a leaf node or an internal node
• If the element is a leaf node, its syntactic description is its character pattern
• If the element is an internal node, its syntactic description is a list of its child elements
• The form of an element declaration for elements that contain elements is as follows:
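In general terms, such a declaration names the element and then lists its children in parentheses. The following is a minimal sketch; the memo element, its children, and the sample text are hypothetical names chosen for illustration and are not taken from these notes.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- General form: <!ELEMENT element_name (list of names of child elements)> -->
<!DOCTYPE memo [
  <!-- Hypothetical example: a memo has exactly these five children, in this order -->
  <!ELEMENT memo (from, to, date, re, body)>
  <!-- Each child is a leaf element that holds character data -->
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT to   (#PCDATA)>
  <!ELEMENT date (#PCDATA)>
  <!ELEMENT re   (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<memo>
  <from>Registrar</from>
  <to>BCA students</to>
  <date>2024-01-15</date>
  <re>Lab schedule</re>
  <body>The XML lab starts next week.</body>
</memo>
```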
• In many cases, it is necessary to specify the number of times that a child element may appear.
This can be done in a DTD declaration by adding a modifier to the child element specification.
These modifiers, described in Table 7.1, are borrowed from regular expressions.
• Any child element specification can be followed by one of the modifiers.
MODIFIER    MEANING
+           One or more occurrences
*           Zero or more occurrences
?           Zero or one occurrence
• As an example, a person element can be declared to have the following child elements: one or more
parent elements, one age element, possibly a spouse element, and zero or more sibling
elements.
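A declaration consistent with that description of the person element could be written as follows. This is a sketch: the leaf declarations and the sample names are assumptions added so that the document is complete.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE person [
  <!-- parent+ : one or more;  age : exactly one;
       spouse? : zero or one;  sibling* : zero or more -->
  <!ELEMENT person  (parent+, age, spouse?, sibling*)>
  <!ELEMENT parent  (#PCDATA)>
  <!ELEMENT age     (#PCDATA)>
  <!ELEMENT spouse  (#PCDATA)>
  <!ELEMENT sibling (#PCDATA)>
]>
<!-- A conforming instance: two parents, one age, no spouse, one sibling -->
<person>
  <parent>Asha</parent>
  <parent>Ravi</parent>
  <age>20</age>
  <sibling>Meera</sibling>
</person>
```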
• The leaf nodes of a DTD specify the data types of the content of their parent nodes, which are
elements.
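• In most cases, the content of an element is type PCDATA, for parsable character data. Parsable
character data is a string of any printable characters except “less than” (<) and the ampersand (&),
which must be written as the entity references &lt; and &amp;. The “greater than” character (>) is
usually written as &gt; as well.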
• Two other content types can be specified: EMPTY and ANY.
• The EMPTY type specifies that the element has no content; it is used for elements similar to the
XHTML img element.
• The ANY type is used when the element may contain literally any content.
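The three content types can be compared side by side in one small DTD. This is only a sketch; the element names note, title, photo, and extra are hypothetical.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE note [
  <!ELEMENT note  (title, photo, extra)>
  <!-- PCDATA: character data only, no child elements -->
  <!ELEMENT title (#PCDATA)>
  <!-- EMPTY: the element may have no content at all -->
  <!ELEMENT photo EMPTY>
  <!-- ANY: any mixture of text and declared elements -->
  <!ELEMENT extra ANY>
]>
<note>
  <title>Fish &amp; chips</title>
  <photo/>
  <extra>Free text with a nested <title>inner title</title></extra>
</note>
```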
DECLARING ATTRIBUTES
The attributes of an element are declared separately from the element declaration in a DTD. An
attribute declaration must include the name of the element to which the attribute belongs, the
attribute’s name, its type, and a default option. The general form of an attribute declaration is
as follows:
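The four parts named above appear in that order inside an ATTLIST declaration. A minimal sketch, assuming a hypothetical car element with a doors attribute:

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- General form: <!ATTLIST element_name attribute_name attribute_type default_option> -->
<!DOCTYPE car [
  <!ELEMENT car EMPTY>
  <!-- Element car, attribute doors, type CDATA, default value "4" -->
  <!ATTLIST car doors CDATA "4">
]>
<car doors="2"/>
```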
If more than one attribute is declared for a given element, the declarations can be combined, as
in the following form:
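A combined declaration lists each attribute's name, type, and default option in turn within a single ATTLIST. The sketch below again uses a hypothetical car element; the attribute names are assumptions.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- General form:
     <!ATTLIST element_name
               attribute_name_1 attribute_type default_option_1
               attribute_name_2 attribute_type default_option_2
               ...>
-->
<!DOCTYPE car [
  <!ELEMENT car EMPTY>
  <!-- Two attributes of car declared in one ATTLIST -->
  <!ATTLIST car
            doors CDATA "4"
            color CDATA #IMPLIED>
]>
<car doors="2" color="red"/>
```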
• The default option in an attribute declaration can specify either an actual value or a requirement
for the value of the attribute in the XML document.
For example, suppose the DTD included the following attribute specifications:
Then the following XML element would be valid for this DTD:
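As an illustration of the possible default options, the sketch below declares four attributes for a hypothetical airplane element and then shows an element that is valid against them. The attribute names and values are assumptions.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE airplane [
  <!ELEMENT airplane EMPTY>
  <!-- An actual value: used as the default when the attribute is omitted -->
  <!ATTLIST airplane places CDATA "4">
  <!-- #REQUIRED: every airplane element must supply this attribute -->
  <!ATTLIST airplane engine_type CDATA #REQUIRED>
  <!-- #IMPLIED: the attribute is optional and has no default value -->
  <!ATTLIST airplane price CDATA #IMPLIED>
  <!-- #FIXED: if the attribute is present, it must have exactly this value -->
  <!ATTLIST airplane manufacturer CDATA #FIXED "Cessna">
]>
<!-- Valid: engine_type is given, places overrides the default, price is omitted -->
<airplane places="10" engine_type="jet"/>
```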
DECLARING ENTITIES
Entities can be defined so that they can be referenced anywhere in the content of an XML
document, in which case they are called general entities. The predefined entities are all general
entities.
Entities can also be defined so that they can be referenced only in DTDs, in which case they are
called parameter entities.
The form of an entity declaration is
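The general form, and one entity of each kind, can be sketched as follows. All names here (report, author, college, authorDecl) are hypothetical. Note that in an internal DTD a parameter entity reference may appear only between declarations, so the parameter entity below holds a complete declaration.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- General form: <!ENTITY [%] entity_name "entity_value"> -->
<!DOCTYPE report [
  <!-- General entity: can be referenced in document content as &college; -->
  <!ENTITY college "S.B.S. College">
  <!-- Parameter entity (note the %): can be referenced only inside the DTD -->
  <!ENTITY % authorDecl "<!ELEMENT author (#PCDATA)>">
  <!ELEMENT report (author)>
  <!-- Expands to the element declaration stored in authorDecl -->
  %authorDecl;
]>
<report>
  <author>&college;</author>
</report>
```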
When the optional percent sign (%) is present in an entity declaration, it specifies that the entity
is a parameter entity rather than a general entity.
Example: <!ENTITY sbs "Santhosh B Suresh">
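Once declared, the entity is referenced in content by writing its name between an ampersand and a semicolon. A minimal sketch, assuming a hypothetical note element:

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE note [
  <!ELEMENT note (#PCDATA)>
  <!ENTITY sbs "Santhosh B Suresh">
]>
<!-- The parser replaces the reference to sbs with the entity's text -->
<note>Prepared by &sbs;</note>
```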
When an entity is longer than a few words, its text is usually defined outside the DTD, in a separate
file. In such cases, the entity is called an external text entity. The form of the declaration of an
external text entity is
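An external text entity uses the keyword SYSTEM followed by the location of the file that holds the text. In the sketch below, the file name address.txt and the note element are hypothetical; a parser that does not read external entities may leave the reference unexpanded.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- General form: <!ENTITY entity_name SYSTEM "file_location"> -->
<!DOCTYPE note [
  <!ELEMENT note (#PCDATA)>
  <!-- Hypothetical file: the entity's text is read from address.txt -->
  <!ENTITY address SYSTEM "address.txt">
]>
<note>Send replies to: &address;</note>
```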
• Some XML parsers check documents that have DTDs in order to ensure that the documents
conform to the structure specified in the DTDs. These parsers are called validating parsers.
• If an XML document specifies a DTD and is parsed by a validating XML parser, and the parser
determines that the document conforms to the DTD, the document is called valid.
• Handwritten XML documents often are not well formed, which means that they do not follow
XML’s syntactic rules.
• Any errors they contain are detected by all XML parsers, which must report them.
• XML parsers are not allowed to either repair or ignore errors.
• Validating XML parsers detect and report all inconsistencies in documents relative to their DTDs.
External DTD Example: [assuming that the DTD is stored in the file named planes.dtd]
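A document refers to an external DTD through a DOCTYPE declaration that names the root element and gives the file's location after the keyword SYSTEM. In this sketch the root element name planes_for_sale is an assumption, and the contents of planes.dtd are not shown, so the body is left as a placeholder.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- The declarations are read from the file planes.dtd -->
<!DOCTYPE planes_for_sale SYSTEM "planes.dtd">
<planes_for_sale>
  <!-- content permitted by planes.dtd goes here -->
</planes_for_sale>
```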
EXAMPLE: sampleDTD.xml
NAMESPACES
• One problem with using different markup vocabularies in the same document is that collisions
between names that are defined in two or more of those tag sets could result.
• An example of this situation is having a <table> tag for a category of furniture and a <table> tag
from XHTML for information tables.
• Clearly, software systems that process XML documents must be capable of unambiguously
recognizing the element names in those documents.
• To deal with this problem, the W3C has developed a standard for XML namespaces (at
http://www.w3.org/TR/REC-xml-names).
• An XML namespace is a collection of element and attribute names used in XML documents. The
name of a namespace usually has the form of a uniform resource identifier (URI).
• A namespace for the elements and attributes of the hierarchy rooted at a particular element is
declared as the value of the attribute xmlns.
• The form of a namespace declaration for an element is
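The general form, applied to the two kinds of table element mentioned earlier, might look like the sketch below. The furniture URI, the prefixes furn and h, and the element names other than table are hypothetical.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- General form: <element_name xmlns[:prefix] = "URI"> -->
<furn:catalog xmlns:furn="http://example.com/furniture"
              xmlns:h="http://www.w3.org/1999/xhtml">
  <!-- A table that is a piece of furniture -->
  <furn:table>
    <furn:legs>4</furn:legs>
  </furn:table>
  <!-- A table that is an XHTML information table -->
  <h:table>
    <h:tr><h:td>Price list</h:td></h:tr>
  </h:table>
</furn:catalog>
```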
• If the prefix is not included, the namespace is the default for the element on which it is declared
and for that element's descendants.
• A prefix is used for two reasons. First, most URIs are too long to be typed on every occurrence of
every name from the namespace. Second, a URI includes characters, such as the slash, that are not
allowed in XML names.
• Note that the element for which a namespace is declared is usually the root of a document.
• For example, XHTML documents declare the XHTML namespace with the xmlns attribute on the root
element, html:
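A minimal sketch of such a root element follows; the title and paragraph text are placeholders.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- No prefix is given, so the XHTML namespace is the default
     for html and all of its descendants -->
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Namespace example</title></head>
  <body><p>Every element here is in the XHTML namespace.</p></body>
</html>
```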
XML SCHEMAS
XML schemas are similar to DTDs in that both are used to define the structure of a document. DTDs have
several disadvantages:
• The syntax of DTDs is unrelated to XML, so DTDs cannot be analysed with an XML
processor
• It is difficult for programmers to deal with two different syntactic forms, one for documents and one for their DTDs
• DTDs do not allow the data type of an element's content to be specified; all content is treated as text
• Schemas were introduced to overcome these disadvantages
SCHEMA FUNDAMENTALS
• A schema can be compared to a class definition in object-oriented programming
• An XML document that conforms to the structure defined by a schema is similar to
an object of that class; such a document is called an instance of the schema
• XML schemas have two primary purposes:
  • A schema specifies the structure of its instance XML documents, including which
  elements and attributes may appear in an instance document, as well as where and
  how often they may appear
  • A schema specifies the data type of every element and attribute of its instance XML documents
• XML schemas are namespace-centric
DEFINING A SCHEMA
Schemas themselves are written with the use of a collection of tags, or a vocabulary, from a
namespace that is, in effect, a schema of schemas. The name of this namespace is:
http://www.w3.org/2001/XMLSchema
• Every schema has schema as its root element. The schema element names the schema-of-schemas
namespace, usually with the prefix xsd. This namespace specification appears as follows:
xmlns:xsd = "http://www.w3.org/2001/XMLSchema"
• The name of the namespace defined by a schema must be specified with the targetNamespace
attribute of the schema element.
targetNamespace = "http://cs.uccs.edu/planeSchema"
• If the elements and attributes that are not defined directly in the schema element are to be
included in the target namespace, schema’s elementFormDefault must be set to qualified, as
follows:
elementFormDefault = "qualified"
• The default namespace, which is the source of the unprefixed names in the schema, is given
with another xmlns specification, but this time without the prefix:
xmlns = "http://cs.uccs.edu/planeSchema"
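Put together, the four attributes described above form the opening tag of the schema element. The sketch below combines them; the schema body is left as a placeholder.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- xmlns:xsd           : the schema-of-schemas namespace, with prefix xsd
     targetNamespace     : the namespace this schema defines
     xmlns               : the default namespace for unprefixed names
     elementFormDefault  : puts nested elements in the target namespace -->
<xsd:schema
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://cs.uccs.edu/planeSchema"
    xmlns="http://cs.uccs.edu/planeSchema"
    elementFormDefault="qualified">
  <!-- element and type definitions go here -->
</xsd:schema>
```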
An alternative to the preceding opening tag would be to make the XMLSchema names the
default so that they do not need to be prefixed in the schema. Then the names in the target namespace
would need to be prefixed.
• Constant values are given with the fixed attribute, as in the following example
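A minimal sketch of the fixed attribute, assuming a hypothetical plane element whose content is always the same string:

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical element: the content of plane is always "single wing" -->
  <xsd:element name="plane" type="xsd:string" fixed="single wing"/>
</xsd:schema>
```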
• A simple user-defined data type is described in a simpleType element with the use of facets.
• Facets must be specified in the content of a restriction element, which gives the base type
name.
• The facets themselves are given in elements named for the facets: the value attribute specifies
the value of the facet.
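The structure described above (simpleType, then restriction with a base type, then facet elements with value attributes) can be sketched as follows. The type names firstName and modelYear and the facet values are hypothetical.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical type: a string of at most 10 characters -->
  <xsd:simpleType name="firstName">
    <xsd:restriction base="xsd:string">
      <xsd:maxLength value="10"/>
    </xsd:restriction>
  </xsd:simpleType>
  <!-- Hypothetical type: an integer from 1900 to 2100 inclusive -->
  <xsd:simpleType name="modelYear">
    <xsd:restriction base="xsd:integer">
      <xsd:minInclusive value="1900"/>
      <xsd:maxInclusive value="2100"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
```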
COMPLEX TYPES
Complex types are defined with the complexType tag. The elements that are the content of an
element-only element must be contained in an ordered group, an unordered group, a choice, or a
named group. The sequence element is used to contain an ordered group of elements. Example:
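An ordered group might be sketched as below, using the sports_car type that these notes refer to. The child element names and their types are assumptions.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- The children must appear in exactly this order -->
  <xsd:complexType name="sports_car">
    <xsd:sequence>
      <xsd:element name="make"   type="xsd:string"/>
      <xsd:element name="model"  type="xsd:string"/>
      <xsd:element name="engine" type="xsd:string"/>
      <xsd:element name="year"   type="xsd:decimal"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>
```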
• A complex type whose elements are an unordered group is defined in an all element. Elements
in all and sequence groups can include the minOccurs and maxOccurs attributes to specify the
numbers of occurrences.
• Example: <?xml version = "1.0" encoding = "utf-8"?>
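As a hedged sketch of how all, sequence, minOccurs, and maxOccurs fit together in a complete schema, consider the following. The elements contact, name, phone, planes, and make are hypothetical. In XML Schema 1.0, an element inside an all group may occur at most once, so only minOccurs is varied there.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Unordered group: name and phone may appear in either order -->
  <xsd:element name="contact">
    <xsd:complexType>
      <xsd:all>
        <xsd:element name="name"  type="xsd:string"/>
        <!-- minOccurs="0" makes phone optional -->
        <xsd:element name="phone" type="xsd:string" minOccurs="0"/>
      </xsd:all>
    </xsd:complexType>
  </xsd:element>
  <!-- Ordered group: one or more make elements -->
  <xsd:element name="planes">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="make" type="xsd:string"
                     minOccurs="1" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```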
• With the year element defined globally, the sports_car element can be defined with a reference
to the year with the ref attribute:
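A sketch of this arrangement follows: year is declared as a direct child of schema, which makes it global, and sports_car then refers to it with ref instead of declaring it again. The other child elements and the range of years are assumptions.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Global declaration: a child of schema can be referenced by name -->
  <xsd:element name="year">
    <xsd:simpleType>
      <xsd:restriction base="xsd:decimal">
        <xsd:minInclusive value="1900"/>
        <xsd:maxInclusive value="2100"/>
      </xsd:restriction>
    </xsd:simpleType>
  </xsd:element>
  <xsd:complexType name="sports_car">
    <xsd:sequence>
      <xsd:element name="make"   type="xsd:string"/>
      <xsd:element name="model"  type="xsd:string"/>
      <xsd:element name="engine" type="xsd:string"/>
      <!-- Reuses the global year declaration -->
      <xsd:element ref="year"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>
```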
An XSLT style sheet is an XML document whose root element is the special-purpose element
stylesheet. The stylesheet tag defines namespaces as its attributes and encloses the collection
of elements that defines its transformations. It also identifies the document as an XSLT
document.
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
In many XSLT documents, a template is included to match the root node of the XML document.
<xsl:template match="/">
In many cases, the content of an element of the XML document is to be copied to the output
document. This is done with the value-of element, which uses a select attribute to specify the
element of the XML document whose contents are to be copied.
<xsl:value-of select="name"/>
The select attribute can specify any node of the XML document. This is an advantage of XSLT
formatting over CSS, in which the order of data as stored is the only possible order of display.
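The three pieces above (stylesheet, a template matching the root node, and value-of) can be combined into one small style sheet. This is a sketch that assumes a hypothetical source document of the form <plane><year>...</year><make>...</make></plane>; the output lists make before year, the reverse of the stored order.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Template applied to the root node of the source document -->
  <xsl:template match="/">
    <html>
      <body>
        <h2>Airplane</h2>
        <!-- select may name any node, so the output order is free -->
        <p>Make: <xsl:value-of select="plane/make"/></p>
        <p>Year: <xsl:value-of select="plane/year"/></p>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```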
XML PROCESSORS
The XML processor takes the XML document and DTD and processes the information so that it
may then be used by applications requesting the information.
The processor is a software module that reads an XML document to determine its structure and
content.
The processor can derive the structure and content because XML documents are
self-describing: the markup itself labels the data.
• Finally, because the parser sees the whole document before any processing takes place, this
approach avoids any processing of a document that is later found to be invalid.
WEB SERVICES
• A Web service is a method that resides and is executed on a Web server, but that can be
called from any computer on the Web. The standard technologies to support Web services are
WSDL, UDDI, SOAP, and XML.
• WSDL (Web Services Description Language) is used to describe the specific operations provided by
a Web service, as well as the protocols for the messages the Web service can send and receive.
• UDDI (Universal Description, Discovery, and Integration) provides ways to query a Web services
registry to determine what specific services are available.
• SOAP was originally an acronym for Simple Object Access Protocol, designed to describe
data objects.
• XML provides a standard way for a group of users to define the structure of their data
documents, using a subject-specific markup language.