IPT Chapter 3
Chapter 3
Data Mapping and Exchange: Metadata; Data representation and encoding; XML, DTD,
XML Schemas
3.1 Data Mapping and Exchange
Data Mapping:
In computing and data management, data mapping is the process of
creating data element mappings between two distinct data models.
In metadata, a data element is an atomic unit of data that has a precise
meaning (precise semantics). A data element has:
1. An identification such as a data element name
2. A clear data element definition
3. One or more representation terms
4. Optional enumerated values (codes)
5. A list of synonyms to data elements
A data model organizes data elements and standardizes how the data elements
relate to one another. Since data elements document real life people, places and things
and the events between them, the data model represents reality.
Why Do We Need Data Mapping?
Data mapping is used as a first step for a wide variety of data integration tasks
including:
• Data transformation or data mediation between data source and destination
• Identification of data relationships
• Discovery of hidden sensitive data
• Consolidation of multiple databases into a single database and identifying
redundant columns of data for consolidation or elimination
Data integration involves combining data residing in different sources and
providing users with a unified view of these data. This process becomes significant in a
variety of situations, which include both commercial (when two similar companies need
to merge their databases) and scientific (combining research results from
different bioinformatics repositories) domains.
Data Exchange:
Data exchange is the process of taking data structured under
a source schema and actually transforming it into data structured under
a target schema, so that the target data is an accurate representation of the source
data.
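The relationship between data mapping and data exchange can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (cust_name, customerName, and so on) and the exchange function are hypothetical and do not come from any particular system. The dictionary plays the role of the mapping between the two data models, and the function performs the exchange for one record.

```python
# Hypothetical mapping from source-schema field names to target-schema field names.
FIELD_MAP = {
    "cust_name": "customerName",
    "cust_mail": "email",
    "dob": "dateOfBirth",
}


def exchange(source_record, field_map):
    """Restructure one source record under the target schema."""
    target = {}
    for source_field, target_field in field_map.items():
        # Only mapped fields that are present in the source are carried over.
        if source_field in source_record:
            target[target_field] = source_record[source_field]
    return target


source_record = {
    "cust_name": "Ola Nordmann",
    "cust_mail": "ola@example.com",
    "dob": "1970-03-27",
}
target_record = exchange(source_record, FIELD_MAP)
print(target_record)
```

Keeping the mapping as data, separate from the function that applies it, means the same exchange code can serve several source and target models.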
1 ||: By Beya
Integrative programming and technologies
3.2 Metadata
Metadata (metacontent) is defined as the data providing information about one
or more aspects of the data, such as:
• Means of creation of the data
• Purpose of the data
• Time and date of creation
• Creator or author of the data
• Location on a computer network where the data were created
Example
A digital image may include metadata that describe the picture size, the color depth, the
image resolution, and the time and date of image creation.
A text document's metadata may contain information about how long the document is,
who the author is, when the document was written, and a short summary of the
document.
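As a rough illustration, the following Python sketch reads a few metadata items that the operating system keeps about a file. The helper name describe_file and the chosen keys are assumptions made for this example; richer metadata such as author or summary would have to come from the document format itself.

```python
import os
import tempfile
from datetime import datetime, timezone


def describe_file(path):
    """Collect a few metadata items about a file (not its content)."""
    info = os.stat(path)
    return {
        "location": os.path.abspath(path),
        "size_bytes": info.st_size,
        "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc).isoformat(),
    }


# Create a small throwaway document so the sketch is self-contained.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as handle:
    handle.write("Don't forget me this weekend!")
    sample_path = handle.name

metadata = describe_file(sample_path)
print(metadata)
os.remove(sample_path)
```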
3.3 Introduction to XML
• XML stands for Extensible Markup Language
• XML is a markup language much like HTML
• XML was designed to describe data, not to display data
• XML tags are not predefined. You must define your own tags
• XML is designed to be self-descriptive
• XML is a W3C Recommendation
• XML does not DO anything
Difference between XML and HTML
• XML is not a replacement for HTML; XML is a complement to HTML.
• XML is a software- and hardware-independent tool for carrying information.
• XML was designed to describe data, with focus on what data is
• HTML was designed to display data, with focus on how data looks
XML Does Not DO Anything:
The following example is a note to Tove, from Jani, stored as XML:
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
The note above is quite self-descriptive. It has sender and receiver information, as well as
a heading and a message body.
But still, this XML document does not DO anything. It is just information wrapped in
tags. Someone must write a piece of software to send, receive or display it.
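A minimal sketch of such a piece of software, written in Python with the standard xml.etree.ElementTree module, is shown here. The function name format_note and the output layout are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

NOTE_XML = """<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>"""


def format_note(note_element):
    """Turn the parsed note into one line of text for display."""
    return "{heading}: {body} (from {sender} to {receiver})".format(
        heading=note_element.findtext("heading"),
        body=note_element.findtext("body"),
        sender=note_element.findtext("from"),
        receiver=note_element.findtext("to"),
    )


note = ET.fromstring(NOTE_XML)
line = format_note(note)
print(line)
```

The XML only carries the data; what happens to it (here, printing one line) is decided entirely by the program.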
How Can XML be used?
XML is used in many aspects of web development, often to simplify data storage and
sharing.
1. XML Separates Data from HTML
2. XML Simplifies Data Sharing
3. XML Simplifies Data Transport
4. XML Simplifies Platform Changes
XML element
• An XML document contains XML Elements.
• An XML element is everything from (including) the element's start tag to
(including) the element's end tag.
• An element can contain:
o other elements
o text
o attributes
o or a mix of all of the above...
Empty XML Elements
An alternative syntax can be used for XML elements with no content: Instead of writing
a book element (with no content) like this:
<book></book>
It can be written like this:
<book />
This sort of element syntax is called self-closing.
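The equivalence of the two forms can be checked with a short Python sketch. The helper is_empty is a name introduced here for illustration.

```python
import xml.etree.ElementTree as ET

long_form = ET.fromstring("<book></book>")
short_form = ET.fromstring("<book />")


def is_empty(element):
    """An element is empty when it has no text and no child elements."""
    return not (element.text or "").strip() and len(element) == 0


print(is_empty(long_form), is_empty(short_form))
# Both forms serialize to the same text once parsed.
print(ET.tostring(long_form, encoding="unicode") == ET.tostring(short_form, encoding="unicode"))
```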
Note: Only the characters "<" and "&" are strictly illegal in XML. The greater than
character is legal, but it is a good habit to replace it.
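The effect of these characters can be tried out in Python. The sketch below uses the standard-library function xml.sax.saxutils.escape, which replaces "&", "<" and ">" with entity references; the sample text is made up for this example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw_text = "salary < 1000 & bonus > 0"

# Without escaping, the parser rejects the document.
try:
    ET.fromstring("<message>" + raw_text + "</message>")
    unescaped_ok = True
except ET.ParseError:
    unescaped_ok = False

# With escaping, the document parses and the original text comes back.
safe_text = escape(raw_text)
parsed = ET.fromstring("<message>" + safe_text + "</message>")

print(unescaped_ok)
print(safe_text)
print(parsed.text)
```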
7. Comments in XML
• The syntax for writing comments in XML is similar to that of HTML.
• <!-- This is a comment -->
8. White-space is preserved in XML
• HTML truncates multiple white-space characters to one single white-space; XML does not.
• HTML: "Hello      Tove" is displayed as "Hello Tove".
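A quick way to see that an XML parser keeps white-space is the following Python sketch. The collapsing step only imitates what an HTML renderer does and is an assumption made for comparison.

```python
import xml.etree.ElementTree as ET

element = ET.fromstring("<to>Hello      Tove</to>")

# The XML parser returns the text exactly as written.
preserved = element.text

# Imitation of HTML rendering: runs of white-space become one space.
collapsed = " ".join(preserved.split())

print(repr(preserved))
print(repr(collapsed))
```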
Like most XML documents, this one starts with an XML declaration, <?xml
version="1.0" encoding="UTF-8"?>. This XML declaration indicates that we're using
XML version 1.0 and the UTF-8 character encoding.
The XML declaration, <?xml?>, uses two attributes, version and encoding, to
set the version of XML and the character set we're using. Next we create a new XML
element named <document>. XML tags themselves always start with < and end with
>. Then we store other elements, or text data, in our <document> element as we
wish.
Character Encodings: ASCII, Unicode, and UCS
The characters in an XML document are stored using numeric codes. That can
be an issue, because different character sets use different codes, which means an XML
processor might have problems trying to read an XML document that uses a character
set (called a character encoding) other than the one it expects.
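Which character sets are supported in XML? ASCII? Unicode? UCS?
Every XML processor must support UTF-8 and UTF-16. Many processors also support other
character encodings, such as the following:
• US-ASCII: U.S. ASCII
• UTF-8: variable-width Unicode encoding (1 to 4 bytes per character)
• UTF-16: variable-width Unicode (UCS) encoding (2 or 4 bytes per character)
• ISO-10646-UCS-2: fixed-width 2-byte UCS encoding
• ISO-10646-UCS-4: fixed-width 4-byte UCS encoding
• ISO-2022-JP: Japanese
• ISO-2022-CN: Chinese
• ISO-8859-5: ASCII and Cyrillic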
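The following Python sketch shows why the declared encoding matters. The same character is stored as different bytes under different encodings, and a document whose bytes do not match its declaration fails to parse. The sample document and the helper try_parse are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

character = "é"
utf8_bytes = character.encode("utf-8")         # two bytes
latin1_bytes = character.encode("iso-8859-1")  # one byte
utf16_bytes = character.encode("utf-16-le")    # two bytes, no byte order mark

# A document stored as ISO-8859-1 and declared as such.
declared_correctly = (
    '<?xml version="1.0" encoding="ISO-8859-1"?><name>René</name>'
).encode("iso-8859-1")

# The same bytes, but with a declaration that claims UTF-8.
declared_wrongly = declared_correctly.replace(b"ISO-8859-1", b"UTF-8")


def try_parse(data):
    """Return the element text, or None when the bytes cannot be decoded."""
    try:
        return ET.fromstring(data).text
    except ET.ParseError:
        return None


print(len(utf8_bytes), len(latin1_bytes), len(utf16_bytes))
print(try_parse(declared_correctly))
print(try_parse(declared_wrongly))
```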
1. Cascading Style Sheets (CSS), which you can also use with HTML
documents
2. Extensible Stylesheet Language (XSL) style sheets, designed to be used
only with XML documents
Example 3: (example3.xml)
<HTML>
<XML ID="firstXML">
<document>
<heading>
Hello From XML
</heading>
<message>
This is an XML document!
</message>
</document>
</XML>
<SCRIPT LANGUAGE="JavaScript">
// XML data islands (document.all(...).XMLDocument) work only in older
// versions of Internet Explorer.
function getData()
{
xmldoc= document.all("firstXML").XMLDocument;
nodeDoc = xmldoc.documentElement;
nodeHeading = nodeDoc.firstChild;
// Show the text of the <heading> element in the DIV below.
document.all("message").innerHTML = nodeHeading.firstChild.nodeValue;
}
</SCRIPT>
<BODY>
<CENTER>
<H1>
Retrieving data from an XML document
</H1>
<DIV ID="message"></DIV>
<P>
<INPUT TYPE="BUTTON" VALUE="Read the heading"
ONCLICK="getData()">
</CENTER>
</BODY>
</HTML>
As an example, here is how to add a DTD to our XML document.
DTDs can be separate documents, or they can be built into an XML document, as
we do here, using the special <!DOCTYPE> declaration.
An XML Document with a DTD (example4.xml)
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/css" href="css1.css"?>
<!DOCTYPE document
[
<!ELEMENT document (heading, message)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT message (#PCDATA)>
]>
<document>
<heading>
Hello From XML
</heading>
<message>
This is an XML document!
</message>
</document>
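What the DTD above demands can be sketched in Python. Note the assumptions: the standard-library parser used here reads the DOCTYPE but does not validate against it, so the content model is copied by hand into the dictionary CONTENT_MODEL, and the function follows_model is a hand-rolled check, not a real DTD validator.

```python
import xml.etree.ElementTree as ET

DOCUMENT = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document
[
<!ELEMENT document (heading, message)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT message (#PCDATA)>
]>
<document>
<heading>
Hello From XML
</heading>
<message>
This is an XML document!
</message>
</document>"""

# Hand-written copy of the DTD: each element and its required child sequence.
CONTENT_MODEL = {"document": ["heading", "message"], "heading": [], "message": []}


def follows_model(element, model):
    """Check that an element and its descendants match the content model."""
    expected = model.get(element.tag)
    if expected is None:
        return False  # the element is not declared at all
    if [child.tag for child in element] != expected:
        return False
    return all(follows_model(child, model) for child in element)


root = ET.fromstring(DOCUMENT)
wrong_order = ET.fromstring("<document><message>m</message><heading>h</heading></document>")

print(follows_model(root, CONTENT_MODEL))
print(follows_model(wrong_order, CONTENT_MODEL))
```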
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
• The XML Schema language, also referred to as XML Schema Definition (XSD),
describes the structure of an XML document.
• Like a DTD, an XML Schema defines the legal building blocks (elements and attributes) of an
XML document. In particular, an XML Schema:
• defines which elements are child elements
• defines the number and order of child elements
• defines whether an element is empty or can include text
• defines data types for elements and attributes
• defines default and fixed values for elements and attributes
Example
Here are some XML elements:
<lastname>Refsnes</lastname>
<age>36</age>
<dateborn>1970-03-27</dateborn>
And here are the corresponding simple element definitions:
<xs:element name="lastname" type="xs:string"/>
<xs:element name="age" type="xs:integer"/>
<xs:element name="dateborn" type="xs:date"/>
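The effect of these type declarations can be imitated in Python. This is a sketch under assumptions: the two dictionaries are a hand-written stand-in for the three xs:element declarations above, and the Python converters only approximate the XSD types (for example, int accepts a few spellings that xs:integer does not).

```python
import xml.etree.ElementTree as ET
from datetime import date

# Hand-written stand-in for the xs:element declarations.
ELEMENT_TYPES = {"lastname": "xs:string", "age": "xs:integer", "dateborn": "xs:date"}

# Approximate Python equivalents of the three XSD types.
CONVERTERS = {"xs:string": str, "xs:integer": int, "xs:date": date.fromisoformat}


def typed_value(element):
    """Convert the element text according to its declared type."""
    xsd_type = ELEMENT_TYPES[element.tag]
    return CONVERTERS[xsd_type](element.text)


lastname = typed_value(ET.fromstring("<lastname>Refsnes</lastname>"))
age = typed_value(ET.fromstring("<age>36</age>"))
dateborn = typed_value(ET.fromstring("<dateborn>1970-03-27</dateborn>"))
print(lastname, age, dateborn)

# Text that does not fit the declared type is rejected.
try:
    typed_value(ET.fromstring("<age>thirty-six</age>"))
    bad_age_accepted = True
except ValueError:
    bad_age_accepted = False
print(bad_age_accepted)
```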
A complex XML element, "description", which contains both elements and text:
<description>
It happened on <date lang="norwegian">03.03.99</date>
</description>
XSD Elements Only
How to Define a Complex Element using XML Schema
Look at this complex XML element, "employee", which contains only other elements:
<employee>
<firstname>John</firstname>
<lastname>Smith</lastname>
</employee>
The "employee" element can be declared directly by naming the element, like this:
<xs:element name="employee">
<xs:complexType>
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
If you use the method described above, only the "employee" element can use
the specified complex type. Note that the child elements, "firstname" and "lastname",
are surrounded by the <sequence> indicator. This means that the child elements must
appear in the same order as they are declared. Alternatively, the "employee" element can
have a type attribute that refers to the name of a separately declared complex type, so
that several elements can share that type.
XSD Empty Elements
An empty complex element cannot have contents, only attributes.
An empty XML element:
<product prodid="1345" />
The "product" element can be declared compactly, like this:
<xs:element name="product">
<xs:complexType>
<xs:attribute name="prodid" type="xs:positiveInteger"/>
</xs:complexType>
</xs:element>
XSD Indicators
We can control HOW elements are to be used in documents with indicators.
Order Indicators
Order indicators are used to define the order of the elements.
Order indicators are:
• All
• Choice
• Sequence
All Indicator
The <all> indicator specifies that the child elements can appear in any order, and that
each child element must occur only once:
<xs:element name="person">
<xs:complexType>
<xs:all>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:all>
</xs:complexType>
</xs:element>
Choice Indicator
The <choice> indicator specifies that either one child element or another can occur:
<xs:element name="person">
<xs:complexType>
<xs:choice>
<xs:element name="employee" type="employee"/>
<xs:element name="member" type="member"/>
</xs:choice>
</xs:complexType>
</xs:element>
Sequence Indicator
The <sequence> indicator specifies that the child elements must appear in a specific
order:
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
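The difference between the three order indicators can be sketched in Python. The three functions are hand-written approximations, assuming that every child uses the default occurrence of exactly one; they are not a schema validator.

```python
import xml.etree.ElementTree as ET


def child_tags(xml_text):
    """Return the tag names of the direct children, in document order."""
    return [child.tag for child in ET.fromstring(xml_text)]


def matches_all(children, declared):
    """<xs:all>: every declared child exactly once, in any order."""
    return sorted(children) == sorted(declared)


def matches_choice(children, declared):
    """<xs:choice>: exactly one of the declared children."""
    return len(children) == 1 and children[0] in declared


def matches_sequence(children, declared):
    """<xs:sequence>: every declared child once, in the declared order."""
    return list(children) == list(declared)


DECLARED = ["firstname", "lastname"]
in_order = child_tags(
    "<person><firstname>John</firstname><lastname>Smith</lastname></person>"
)
swapped = child_tags(
    "<person><lastname>Smith</lastname><firstname>John</firstname></person>"
)

print(matches_all(swapped, DECLARED), matches_sequence(swapped, DECLARED))
print(matches_choice(["employee"], ["employee", "member"]))
```

A "person" with the children in swapped order is accepted under <all> but rejected under <sequence>.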
An XML Document
Let's have a look at this XML document called "shiporder.xml":
<?xml version="1.0" encoding="ISO-8859-1"?>
<shiporder orderid="889923">
<orderperson>John Smith</orderperson>
<shipto>
<name>Ola Nordmann</name>
<address>Langgt 23</address>
<city>4000 Stavanger</city>
<country>Norway</country>
</shipto>
</shiporder>
The XML document above consists of a root element, "shiporder", that contains
a required attribute called "orderid". The "shiporder" element contains two child elements:
"orderperson" and "shipto".
Create an XML Schema
Now we want to create a schema for the XML document above. We start by
opening a new file that we will call "shiporder.xsd". To create the schema we could
simply follow the structure in the XML document and define each element as we find
it. We will start with the standard XML declaration followed by the xs:schema element
that defines a schema:
<?xml version="1.0" encoding="UTF-8" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
...
</xs:schema>
In the schema above we use the standard namespace prefix (xs). The URI
associated with this prefix is the XML Schema language namespace, which has the
standard value of http://www.w3.org/2001/XMLSchema.
Next, we have to define the "shiporder" element. This element has an attribute
and it contains other elements, therefore we consider it a complex type. The child
elements of the "shiporder" element are surrounded by an xs:sequence element that
defines an ordered sequence of sub-elements:
<xs:element name="shiporder">
<xs:complexType>
<xs:sequence>
...
</xs:sequence>
</xs:complexType>
</xs:element>
Then we have to define the "orderperson" element as a simple type (because it
does not contain any attributes or other elements). The type (xs:string) is prefixed
with the namespace prefix associated with XML Schema, which indicates a predefined
schema data type:
<xs:element name="orderperson" type="xs:string"/>
Next, we have to define the element that is of the complex type: "shipto". We
define the "shipto" element like this:
<xs:element name="shipto">
<xs:complexType>
<xs:sequence>
<xs:element name="name" type="xs:string"/>
<xs:element name="address" type="xs:string"/>
<xs:element name="city" type="xs:string"/>
<xs:element name="country" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
We can now declare the attribute of the "shiporder" element. Since this is a required
attribute we specify use="required".
Note: The attribute declarations must always come last:
<xs:attribute name="orderid" type="xs:string" use="required"/>
Here is the complete listing of the schema file called "shiporder.xsd":
<?xml version="1.0" encoding="UTF-8" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="shiporder">
<xs:complexType>
<xs:sequence>
<xs:element name="orderperson" type="xs:string"/>
<xs:element name="shipto">
<xs:complexType>
<xs:sequence>
<xs:element name="name" type="xs:string"/>
<xs:element name="address" type="xs:string"/>
<xs:element name="city" type="xs:string"/>
<xs:element name="country" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
<xs:attribute name="orderid" type="xs:string" use="required"/>
</xs:complexType>
</xs:element>
</xs:schema>
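Because a schema is itself an XML document, it can be read with an ordinary XML parser. The Python sketch below lists the declarations in "shiporder.xsd" and checks the required attribute on "shiporder.xml". It is an illustration under assumptions: both files are embedded as byte strings, and the check covers only required attributes, so it is far from full schema validation.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

SCHEMA = b"""<?xml version="1.0" encoding="UTF-8" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="shiporder">
<xs:complexType>
<xs:sequence>
<xs:element name="orderperson" type="xs:string"/>
<xs:element name="shipto">
<xs:complexType>
<xs:sequence>
<xs:element name="name" type="xs:string"/>
<xs:element name="address" type="xs:string"/>
<xs:element name="city" type="xs:string"/>
<xs:element name="country" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
<xs:attribute name="orderid" type="xs:string" use="required"/>
</xs:complexType>
</xs:element>
</xs:schema>"""

SHIPORDER = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<shiporder orderid="889923">
<orderperson>John Smith</orderperson>
<shipto>
<name>Ola Nordmann</name>
<address>Langgt 23</address>
<city>4000 Stavanger</city>
<country>Norway</country>
</shipto>
</shiporder>"""

schema_root = ET.fromstring(SCHEMA)

# Names of all declared elements, in document order.
declared_elements = [e.get("name") for e in schema_root.iter(XS + "element")]

# Names of the attributes declared with use="required".
required_attributes = [
    a.get("name")
    for a in schema_root.iter(XS + "attribute")
    if a.get("use") == "required"
]

order = ET.fromstring(SHIPORDER)
missing = [name for name in required_attributes if name not in order.attrib]

print(declared_elements)
print(required_attributes, missing)
```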
An XSD Example
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
<xs:element name="note">
<xs:complexType>
<xs:sequence>
<xs:element name="to" type="xs:string"/>
<xs:element name="from" type="xs:string"/>
<xs:element name="heading" type="xs:string"/>
<xs:element name="body" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
The Schema above is interpreted like this:
• <xs:element name="note"> defines the element called "note"
• <xs:complexType> the "note" element is a complex type
• <xs:sequence> the complex type is a sequence of elements
• <xs:element name="to" type="xs:string"> the element "to" is of type string (text)
• <xs:element name="from" type="xs:string"> the element "from" is of type string
• <xs:element name="heading" type="xs:string"> the element "heading" is of type
string
• <xs:element name="body" type="xs:string"> the element "body" is of type string
Example
<html>
<body>
<span id="to"></span>
<span id="from"></span>
<span id="message"></span>
<script>
// Create the request object.
var xmlhttp;
if (window.XMLHttpRequest)
{// code for IE7+, Firefox, Chrome, Opera, Safari
xmlhttp=new XMLHttpRequest();
}
else
{// code for IE6, IE5
xmlhttp=new ActiveXObject("Microsoft.XMLHTTP");
}
// Load note.xml synchronously (third argument false) and get the parsed document.
xmlhttp.open("GET","note.xml",false);
xmlhttp.send();
var xmlDoc=xmlhttp.responseXML;
// Copy the text of each XML element into the matching <span>.
document.getElementById("to").innerHTML=
xmlDoc.getElementsByTagName("to")[0].childNodes[0].nodeValue;
document.getElementById("from").innerHTML=
xmlDoc.getElementsByTagName("from")[0].childNodes[0].nodeValue;
// note.xml stores the message text in its <body> element.
document.getElementById("message").innerHTML=
xmlDoc.getElementsByTagName("body")[0].childNodes[0].nodeValue;
</script>
</body>
</html>
Important Note!
To extract the text "Tove" from the <to> element in the XML file above
("note.xml"), the syntax is: getElementsByTagName("to")[0].childNodes[0].nodeValue
Notice that even if the XML file contains only ONE <to> element you still have to
specify the index [0]. This is because the getElementsByTagName() method
returns an array-like list of nodes, not a single node.
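The same DOM calls exist outside the browser. As a sketch, Python's standard xml.dom.minidom module offers getElementsByTagName, childNodes and nodeValue with the same meaning; the note is embedded as a string here instead of being loaded from "note.xml".

```python
from xml.dom.minidom import parseString

NOTE_XML = """<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>"""

xml_doc = parseString(NOTE_XML)

# getElementsByTagName returns a list, even when only one element matches.
to_nodes = xml_doc.getElementsByTagName("to")

# [0] selects the <to> element; childNodes[0] is its text node.
to_text = to_nodes[0].childNodes[0].nodeValue

print(len(to_nodes))
print(to_text)
```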