Introduction To XML Extensible Markup Language: Prof.N.Nalini AP (SR) VIT

1. XML allows for semi-structured data by allowing users to define their own elements and tags to describe data structures. 2. XML documents must be well-formed with a single root element and properly nested tags. They can also be validated against a DTD or schema. 3. XML provides advantages over structured and unstructured data by allowing flexible yet portable data representation and extensibility through user-defined elements and tags.


Introduction to XML

Extensible Markup Language


Prof.N.Nalini
AP(Sr)
VIT
Structured vs. unstructured data
• Relational databases are highly structured
• All data resides in tables
• You must define schema before entering any data
• Every row conforms to the table schema
• Changing the schema is hard and may break many things
• Texts are highly unstructured
• Data is free-form
• There is no pre-defined schema, and it’s hard to define one
• Readers need to infer structures and meanings
Semi-structured data
• Observation: most data have some structure, e.g.:
• Book: chapters, sections, titles, paragraphs, references,
index, etc.
• Item for sale: name, picture, price (range), ratings,
promotions, etc.
• Web page: HTML
• Ideas:
• Ensure data is “well-formatted”
• If needed, ensure data is also “well-structured”
• But make it easy to define and extend this structure
• Make data “self-describing”
HTML: language of the Web
XML

• Text-based
• Capture data (content), not presentation
• Data self-describes its structure
• Names and nesting of tags have meanings!
Features of XML
• Portability: Just like HTML, you can ship XML data
across platforms
• Relational data requires heavy-weight APIs
• Flexibility: You can represent any information
(structured, semi-structured, documents, …)
• Relational data is best suited for structured data
• Extensibility: Since data describes itself, you can
change the schema easily
• Relational schema is rigid and difficult to change
Well-formed XML documents

• XML is case sensitive.


• Attribute values must be enclosed in single or double quotes, and an
attribute name must not appear more than once in the same start-tag
(it must be unique within the tag)
sample
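As a rough illustration of the rules above, the sketch below feeds a few small documents to Python's standard-library parser and reports which ones it accepts. The sample documents and the helper name `is_well_formed` are made up for this example and do not come from the slides.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts `text` as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents, one per rule.
samples = {
    "ok": '<note id="1"><to>Alice</to></note>',
    "case mismatch": "<note><To>Alice</to></note>",      # <To> closed by </to>
    "unquoted attribute": "<note id=1></note>",
    "duplicate attribute": '<note id="1" id="2"></note>',
    "two roots": "<a></a><b></b>",
}

for label, text in samples.items():
    print(f"{label}: {is_well_formed(text)}")
```

Only the first sample should be accepted; each of the others breaks one well-formedness rule.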
XML Trees

• An XML document has a single root node.


• The tree is a general ordered tree.
– A parent node may have any number of
children.
– Child nodes are ordered, and may have
siblings.
• Preorder traversals are usually used for
getting information out of the tree.
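A preorder traversal visits a node before its children, in document order. The following is a minimal sketch using Python's standard library; the small address document is an assumed sample that borrows element names used later in these slides.

```python
import xml.etree.ElementTree as ET

# Assumed sample document (element names borrowed from the address DTD).
DOC = """<address>
  <name><first>Alice</first><last>Lee</last></name>
  <phone>123-45-6789</phone>
</address>"""


def preorder(node, depth=0):
    """Yield (depth, tag) for the node first, then for each child in order."""
    yield depth, node.tag
    for child in node:
        yield from preorder(child, depth + 1)


root = ET.fromstring(DOC)
visited = [tag for _, tag in preorder(root)]

for depth, tag in preorder(root):
    print("  " * depth + tag)
```

`Element.iter()` in the same module walks the tree in the same order, so the hand-written generator is only there to make the traversal explicit.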
Advantages of XML

• XML is text (Unicode) based.


– Takes up less space.
– Can be transmitted efficiently.
• One XML document can be displayed differently
in different media.
– HTML, video, CD, DVD
– You only have to change the XML document in order
to change all the rest.
• XML documents can be modularized. Parts can
be reused.
Valid XML Documents
• A well-formed document has a tree structure and
obeys all the XML rules.
• A particular application may add more rules in
either a DTD (document type definition) or in a
schema.
• Many specialized DTDs and schemas have
been created to describe particular areas.
• DTDs were developed first, so they are not as
comprehensive as schemas.
– DTD: structure
– Schema: structure and content
Validation
<Book>
<Title>Illusions The Adventures of a Reluctant Messiah</Title>
<Author>Richard Bach</Author>
<Date>1977</Date>
<ISBN>0-440-34319-4-ppp</ISBN>
<Publisher>Dell Publishing Co.</Publisher>
</Book>

[Diagram: the book data and the rules that indicate the valid structure of book data are fed to a validator, which reports: Error! Invalid ISBN!]
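The validation step can be sketched by hand in Python. This is not a DTD or schema validator: the expected child list and the ISBN pattern below are assumptions chosen for illustration, standing in for the "rules" in the diagram.

```python
import re
import xml.etree.ElementTree as ET

BOOK = """<Book>
  <Title>Illusions The Adventures of a Reluctant Messiah</Title>
  <Author>Richard Bach</Author>
  <Date>1977</Date>
  <ISBN>0-440-34319-4-ppp</ISBN>
  <Publisher>Dell Publishing Co.</Publisher>
</Book>"""

# Assumed rules: a fixed child order and a digits-and-hyphens ISBN shape.
EXPECTED_CHILDREN = ["Title", "Author", "Date", "ISBN", "Publisher"]
ISBN_PATTERN = re.compile(r"\d-\d{3}-\d{5}-\d")


def validate(text):
    """Return a list of rule violations (empty if the book passes)."""
    errors = []
    root = ET.fromstring(text)  # raises ParseError if not well-formed
    if root.tag != "Book":
        errors.append(f"unexpected root element: {root.tag}")
    children = [child.tag for child in root]
    if children != EXPECTED_CHILDREN:
        errors.append(f"unexpected child elements: {children}")
    isbn = root.findtext("ISBN", default="").strip()
    if not ISBN_PATTERN.fullmatch(isbn):
        errors.append(f"invalid ISBN: {isbn}")
    return errors


print(validate(BOOK))
```

The document is well-formed, so parsing succeeds; it is only the extra rule on the ISBN content that rejects it.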


Document Type Definitions
• A DTD describes the tree structure of a document and something about its
data.
• A DTD is optional
• A DTD specifies a grammar for the document
• Constraints on structures and values of elements, attributes,
etc.
• There are two data types, PCDATA and CDATA.
–PCDATA is parsed character data.
–CDATA is character data, not usually parsed.
• A DTD determines how many times a node may appear, and how child
nodes are ordered.
• Child elements can have modifiers, +, *, ?
<!ELEMENT person
(ID+, age, Lastname?, sibling*)>
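The modifiers behave like regular-expression operators: + means one or more, * means zero or more, and ? means optional. As a sketch of that idea (not how a real validating parser is implemented), the content model for person can be rewritten as a Python regular expression over the sequence of child element names. The helper name and the test documents are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# (ID+, age, Lastname?, sibling*) as a regex over child names,
# where each name is followed by a single space.
PERSON_MODEL = re.compile(r"(ID )+(age )(Lastname )?(sibling )*")


def matches_person_model(text):
    """Check the root's children against the person content model."""
    root = ET.fromstring(text)
    if root.tag != "person":
        return False
    child_names = "".join(child.tag + " " for child in root)
    return PERSON_MODEL.fullmatch(child_names) is not None


# Hypothetical test documents.
good = "<person><ID>1</ID><ID>2</ID><age>30</age><sibling/></person>"
missing_id = "<person><age>30</age></person>"
wrong_order = ("<person><ID>1</ID><age>30</age>"
               "<sibling/><Lastname>Lee</Lastname></person>")

for doc in (good, missing_id, wrong_order):
    print(matches_person_model(doc))
```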
DTD for address Example
<!ELEMENT address (name, email, phone, birthday)>
<!ELEMENT name (first, last)>
<!ELEMENT first (#PCDATA)>
<!ELEMENT last (#PCDATA)>
<!ELEMENT email (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
<!ELEMENT birthday (year, month, day)>
<!ELEMENT year (#PCDATA)>
<!ELEMENT month (#PCDATA)>
<!ELEMENT day (#PCDATA)>
• note.dtd file
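To make the address DTD concrete, here is a document that follows it, together with a small Python sketch that checks the nesting by hand. The `MODEL` table is transcribed from the declarations above; the checker itself is an illustration rather than a real DTD validator, and the e-mail address is a placeholder.

```python
import xml.etree.ElementTree as ET

# Content models transcribed from the DTD; None stands for (#PCDATA).
MODEL = {
    "address": ["name", "email", "phone", "birthday"],
    "name": ["first", "last"],
    "birthday": ["year", "month", "day"],
    "first": None, "last": None, "email": None, "phone": None,
    "year": None, "month": None, "day": None,
}

# Sample instance; the e-mail value is a made-up placeholder.
ADDRESS = """<address>
  <name><first>Alice</first><last>Lee</last></name>
  <email>alice@example.com</email>
  <phone>123-45-6789</phone>
  <birthday><year>1983</year><month>7</month><day>15</day></birthday>
</address>"""


def conforms(elem):
    """Recursively compare each element's children with MODEL."""
    if elem.tag not in MODEL:
        return False
    expected = MODEL[elem.tag]
    children = list(elem)
    if expected is None:          # (#PCDATA): text only, no child elements
        return not children
    if [child.tag for child in children] != expected:
        return False
    return all(conforms(child) for child in children)


print(conforms(ET.fromstring(ADDRESS)))
```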
Schemas
• Schemas are themselves XML documents.
• They were standardized after DTDs and provide more
information about the document.
• They have a number of data types including string,
decimal, integer, boolean, date, and time.
• Data Type Categories
1. Simple (strings only, no attributes and no
nested elements)
2. Complex (can have attributes and nested
elements)
• They also determine the tree structure and how many
children a node may have.
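Because a schema is itself an XML document, an ordinary XML parser can read it. The sketch below parses a small, hypothetical XSD (the person element and its fields are invented for this example) and lists the declared elements with their simple types.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema: "person" is a complex type (nested elements and an
# attribute); its children use the built-in simple types.
XSD = f"""<xs:schema xmlns:xs="{XS}">
  <xs:element name="person">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="lastname" type="xs:string"/>
        <xs:element name="age" type="xs:integer"/>
        <xs:element name="birthday" type="xs:date"/>
      </xs:sequence>
      <xs:attribute name="id" type="xs:integer"/>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

schema = ET.fromstring(XSD)

# Collect every element declaration that names a type directly.
declared = {
    el.get("name"): el.get("type")
    for el in schema.iter(f"{{{XS}}}element")
    if el.get("type") is not None
}
print(declared)
```

This only reads the schema; checking an instance document against it needs a schema-aware validator, which Python's standard library does not provide.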
XML Schema definition (XSD)
Example Schema
Example 1: XML with XSD
Example 1: schema file
Example 2: XML with XSD
Example 2: XSD file
Restrictions
Transformations: XSL
• Language for expressing document styles
• Specifies the presentation of XML
– More powerful than CSS
• Consists of:
– XSLT
– XPath
– XSL Formatting Objects (XSL-FO)
Transforming the Data

[Diagram: XML and transformation instructions are fed to a transformation tool, which outputs HTML, XML, or text]

XSLT – a language used to transform XML data into a different form (commonly XML or HTML)
XSLT
Extensible Stylesheet Language Transformations

• XSLT is used to transform one XML document
into another, often an HTML document.
• A program takes one XML document as input
and produces another as output.
• If the resulting document is in HTML, it can be
viewed by a web browser.
• This is a good way to display XML data.
A Style Sheet to Transform address.xml

<?xml version="1.0" encoding="ISO-8859-1"?>


<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="address">
<html><head><title>Address Book</title></head>
<body>
<xsl:value-of select="name"/>
<br/><xsl:value-of select="email"/>
<br/><xsl:value-of select="phone"/>
<br/><xsl:value-of select="birthday"/>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
The Result of the Transformation

Alice Lee
[email protected]
123-45-6789
1983-7-15
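Python's standard library has no XSLT processor, so the sketch below only imitates what the template in the style sheet does: for each selected child it emits the concatenated text of that element, separated by line breaks. The input document is an assumed address.xml with a placeholder e-mail address, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

# Assumed input; the e-mail value is a made-up placeholder.
ADDRESS = """<address>
  <name><first>Alice</first> <last>Lee</last></name>
  <email>alice@example.com</email>
  <phone>123-45-6789</phone>
  <birthday><year>1983</year> <month>7</month> <day>15</day></birthday>
</address>"""


def value_of(parent, child_name):
    """Rough stand-in for <xsl:value-of select="child_name"/>:
    the text of the first matching child and all its descendants."""
    child = parent.find(child_name)
    if child is None:
        return ""
    return "".join(child.itertext())


def transform(text):
    """Imitate the template that matches the address element."""
    address = ET.fromstring(text)
    fields = ["name", "email", "phone", "birthday"]
    body = "<br/>".join(escape(value_of(address, f)) for f in fields)
    return ("<html><head><title>Address Book</title></head>"
            f"<body>{body}</body></html>")


result = transform(ADDRESS)
print(result)
```

A real transformation would hand the style sheet and the document to an XSLT processor instead of hard-coding the template in a function.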
Parsers

• There are two principal models for
parsers.
• SAX – Simple API for XML
– Uses a call-back method
– Similar to javax listeners
• DOM – Document Object Model
– Creates a parse tree
– Requires a tree traversal
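The two models can be compared side by side with Python's standard library, which offers both: `xml.sax` calls back into a handler as it reads, while `xml.dom.minidom` builds the whole tree first and leaves the traversal to the caller. The sample document and the class and function names below are assumptions for this sketch.

```python
import xml.sax
from xml.dom import minidom

# Assumed sample document.
DOC = ("<address><name><first>Alice</first><last>Lee</last></name>"
       "<phone>123-45-6789</phone></address>")


class TagCollector(xml.sax.ContentHandler):
    """SAX: the parser calls startElement() for every start-tag it reads."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def startElement(self, name, attrs):
        self.tags.append(name)


handler = TagCollector()
xml.sax.parseString(DOC.encode("utf-8"), handler)
sax_tags = handler.tags


def walk(node, out):
    """DOM: the tree already exists; visit element nodes recursively."""
    if node.nodeType == node.ELEMENT_NODE:
        out.append(node.tagName)
    for child in node.childNodes:
        walk(child, out)
    return out


dom = minidom.parseString(DOC)
dom_tags = walk(dom.documentElement, [])

print(sax_tags)
print(dom_tags)
```

Both approaches see the elements in the same order. The difference is that SAX never holds the whole document in memory, whereas DOM allows revisiting any node after parsing.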
